Adopt a Dataset: Classroom Edition

Exercise design and goal

This exercise is designed to get students familiar with a single study within ICPSR’s catalog. It would be especially useful in a course in which students have been introduced to research-related concepts such as sampling, data collection, or hypotheses and variables. The types of questions asked are the kinds of things one should keep in mind when exploring and evaluating data for secondary analysis.

Finding a dataset

You can leave the “adoption” wide open – have your students go to https://www.icpsr.umich.edu and put in search terms of interest to them, or narrow the list based on course topics or the list of “adoptable data” ICPSR puts out annually for Love Data Week (included in the “Activity: Adopt a Dataset”).

The activity relies on the structured information provided for data in ICPSR’s curated collection, so if students run their own searches, it is recommended that they remove self-published data from the results using “-openicpsr.”

Once students have found a dataset

Congratulations on the adoption of your dataset! Now let’s explore and get to know more about those data.

Answer the following questions, using information that can be found on the study’s homepage, to help you gain a better understanding of what and who is in the data, what can be done with the data, and what makes these data special:

  1. Which study/dataset did you adopt? Include the title, alternative title (if applicable), and ICPSR study number, which is found in parentheses after the title.
  2. Is this study part of a larger series of studies?
    1. If so, what is the series? How many studies are in the series?

For the next series of questions, use the tab sections (e.g., “At a Glance” or “Data & Documentation”) other than “Project Description” to find the information.

Using the “At a Glance” tab:

  1. Briefly describe, in your own words, the purpose of the study.
  2. Are these data from a survey or some other source? If “other,” what was the source?
  3. How were these data collected or compiled (for example, if it’s a survey, was it done online, by phone, paper-and-pencil, or face to face)? By whom? When?
  4. Is the study about individual people, organizations, or something else (and what)?
  5. Who or what is the study meant to represent – that is, what is the universe? Think about geographic locations as well as any specific subpopulations such as age or racial categories.
  6. How would you cite the study if you were to use the data in a project?

Next, look at the “Data & Documentation” tab:

  1. How many datasets are included in the study? Are any of them restricted?
  2. In what file formats are the public-use (i.e., not restricted) data available?

Using the “Variables” tab or “Explore Data” button:

If neither is an option, you’ll need to download the codebook for one of the datasets. This can be done by clicking the dropdown menu under “Download” next to the dataset of your choice and selecting “Codebook” or “ICPSR Codebook.”

  1. Explore the variables and find some that look interesting to you. Choose at least two and list the following for each:
    1. Variable name
    2. Variable label (if there is one)
    3. Question text (if present)
    4. Most common answer for that variable (note, some variables don’t have a single “common answer” – in that case, describe what you see in the frequency table or summary statistics).
  2. Given what you learned about the project from the summary, were there variables you didn’t expect to find or, conversely, things you were surprised weren’t there? Describe anything unexpected.
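
For classes that download a public-use file instead of reading the frequency table on the website, the “most common answer” can also be found with a short script. The following is a minimal Python sketch, not part of the original activity: the responses are made up for illustration, and a real variable would be read from the downloaded data file.

```python
from collections import Counter

# Hypothetical responses to one survey variable (not real ICPSR data).
# None marks a missing answer.
responses = ["Agree", "Disagree", "Agree", None, "Neutral", "Agree", "Disagree"]


def frequency_table(values):
    """Count each non-missing answer, most frequent first."""
    return Counter(v for v in values if v is not None).most_common()


table = frequency_table(responses)
for answer, count in table:
    print(f"{answer}: {count}")

# The first row holds the most common answer. If several answers tie for
# first place, there is no single "common answer" to report.
most_common_answer = table[0][0]
```

If the top counts are tied, or the variable is numeric with many distinct values, describe the overall pattern rather than naming one answer.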

Finally, use the “Data-related Publications” tab to answer the following:

  1. How many publications have been linked to this study?
  2. What are some of the topics these data have been used to study?
  3. Briefly describe the key findings from at least one related publication.
  4. Do the publications seem to be written by a variety of researchers, or mainly those associated with the original project?

Now, think about all of the information you have just learned from the perspective of a researcher interested in the data.

  1. Describe any characteristics that make these data unique – such as following the same people over time, including a group of people or organizations that seem understudied, or asking a detailed set of questions about a particular topic or questions about a wide range of topics, etc.
  2. Think of a hypothesis that could be tested using these data and share that here:
  3. Identify any potential limitations of the study a researcher would want to be aware of before using the data. For example, is the number of cases or variables very small, the data collected a long time ago, or key questions not included?
  4. What else, if anything, would you like to know about the data to determine whether they’d be useful for a research project?

The text and questions in this activity can be used as a whole or modified to meet your classroom needs! This activity is covered by a Creative Commons CC BY-NC (attribution, non-commercial) license.