Chapter 13 Secondary Data Analysis and Content Analysis

Introduction
• Secondary data analysis is the method of using preexisting data in a different way or to answer a different research question than intended by those who collected the data. The most common sources of secondary data (previously collected data that are used in a new analysis) are social science surveys and data collected by government agencies, often with survey research methods. It is also possible to reanalyze data that have been collected in experimental studies or with qualitative methods.

Introduction, cont.
• Even a researcher's reanalysis of data that he or she collected previously qualifies as secondary analysis if it is for a new purpose or in response to a methodological critique.
• Thanks to the data collected by social researchers, governments, and organizations over many years, secondary data analysis has become the research method used by many contemporary social scientists to investigate important research questions.

Why Consider Secondary Data?
• Data collected in previous investigations are available for use by other social researchers on a wide range of topics.
• Available datasets often include many more measures and cases and reflect more rigorous research procedures than another researcher will have the time or resources to obtain in a new investigation.

Why Consider Secondary Data? cont.
• Much of the groundwork involved in creating and testing measures with the dataset has already been done.
• Most important, most funded social science research projects collect data that can be used to investigate new research questions that the primary researchers who collected the data did not consider.

Why Consider Secondary Data? cont.
• Content analysis is similar to secondary data analysis in its use of information that has already been collected.
• Therefore, like secondary data analysis, content analysis can be called an “unobtrusive method” that does not need to involve interacting with live people.
• In addition, most content analyses, like most secondary data analyses, use quantitative analysis procedures, and you will find some datasets resulting from content analyses in collections of secondary datasets.

Why Consider Secondary Data? cont.
• Content analyses can even be used to code data collected in surveys, so you can find content analysis data included in some survey datasets.
• However, content analysis methods usually begin with text, speech broadcasts, or visual images, not data already collected by social scientists.
• The content analyst develops procedures for coding various aspects of the textual, oral (spoken), or visual material and then analyzes this coded “content.”

Secondary Data Sources
• With the advent of modern computers and, even more important, the Internet, secondary data analysis has become an increasingly accessible social research method.
• Literally thousands of large-scale datasets are now available for the secondary data analyst.

Secondary Data Sources, cont.
• There are many sources of data for secondary analysis within the United States and internationally.
• These sources range from data compiled by governmental units and private organizations for administrative purposes, which are subsequently made available for research purposes, to data collected by social researchers for one purpose that are then made available for reanalysis.

Secondary Data Sources, cont.
• What makes secondary data analysis such an exciting and growing option today are the considerable resources being devoted to expanding the amount of secondary data and to making it available to social scientists.
• For example, the National Data Program for the Social Sciences, funded in part by the National Science Foundation, sponsors the ongoing GSS (General Social Survey) in order to make current data on a wide range of research questions available to social scientists.

U.S. Bureau of the Census
• The U.S. government has conducted a census of the population every 10 years since 1790; since 1940, this census also has included a census of housing.
• The Census Bureau’s monthly Current Population Survey (CPS) provides basic data on labor force activity that is then used in U.S. Bureau of Labor Statistics reports.

Integrated Public Use Microdata Series
• Individual-level samples from U.S. Census data for the years 1850 to 1990, as well as historical census files from several other countries, are available through the Integrated Public Use Microdata Series (IPUMS) at the University of Minnesota’s Minnesota Population Center (MPC).
• These data are prepared in an easy-to-use format that provides consistent codes and names for all the different samples.

Bureau of Labor Statistics (BLS)
• The BLS, part of the U.S. Department of Labor, collects and analyzes data on employment, earnings, prices, living conditions, industrial relations, productivity and technology, and occupational safety and health (U.S. Bureau of Labor Statistics 1991, 1997b).
• The Current Population Survey (CPS) provides a monthly employment and unemployment record for the United States, classified by age, sex, race, and other characteristics.

Other U.S. Government Sources
• Many more datasets useful for historical and comparative research have been collected by federal agencies and other organizations.
• The National Technical Information Service (NTIS) of the U.S. Department of Commerce maintains a Federal Computer Products Center that collects and catalogs many of these datasets and related software.

Independent Investigator Data Sources
• Many researchers who have received funding to investigate a wide range of research topics make their data available on websites where they can be downloaded by other researchers for secondary data analyses.
• One of the largest, introduced earlier, is the Add Health study, funded at the University of North Carolina by the National Institute of Child Health and Human Development (NICHD) and 23 other agencies and foundations to investigate influences on adolescents’ health and risk behaviors (www.cpc.unc.edu/projects/addhealth).

Independent Investigator Data Sources, cont.
• Another significant data source, the Health and Retirement Study (HRS), began in 1992 with funding from the National Institute on Aging (NIA) (http://hrsonline.isr.umich.edu/).
• To investigate changes in family experience, researchers at the University of Wisconsin designed the National Survey of Families and Households (www.ssc.wisc.edu/nsfh/).
• Another noteworthy example, among many, is the Detroit Area Studies, with annual surveys between 1951 and 2004 on a wide range of personal, political, and social issues (www.icpsr.umich.edu/icpsrweb/detroitareastudies/).

ICPSR
• The University of Michigan’s ICPSR is the premier source of secondary data useful to social science researchers.
• ICPSR was founded in 1962 and now includes more than 325 colleges and universities in North America and hundreds of institutions on other continents.
• ICPSR archives the most extensive collection of social science datasets in the United States outside of the federal government.

ICPSR, cont.
• ICPSR also catalogs reports and publications containing analyses that have used ICPSR datasets since 1962; more than 34,000 citations were in this archive in July 2005.
• This superb resource provides an excellent starting point for the literature search that should precede a secondary data analysis.
• In most cases, you can learn from detailed study reports a great deal about the study methodology, including the rate of response in a sample survey and the reliability of any indexes constructed.

ICPSR, cont.
• Published articles provide not only examples of how others have described the study methodology but also research questions that have already been studied with the dataset and issues that remain to be resolved.
• Even if you are using ICPSR, you shouldn’t stop your review of the literature with the sources listed on the ICPSR site.
• Conduct a search in Sociological Abstracts or another bibliographic database to learn about related studies that used different databases.

International Data Sources
• Comparative researchers can find datasets on the population characteristics, economic and political features, and political events of many nations.
• Some of these are available from U.S. government agencies.
• For example, the Social Security Administration reports on the characteristics of social security throughout the world (Wheeler 1995).

Qualitative Data Sources
• Far fewer qualitative datasets are available for secondary analysis.
• By far the richest source, if you are interested in cross-cultural research, is the Human Relations Area Files (HRAF) at Yale University.
• The ICPSR collection includes a limited number of studies containing at least some qualitative data (19 such studies as of July 2005), but these include some very rich data.

Challenges for Secondary Data Analyses
• The use of the method of secondary data analysis has clear advantages for social researchers:
1. It can allow analyses of social processes in otherwise inaccessible settings.
2. It saves time and money.
3. It allows the researcher to avoid data collection problems.

Challenges for Secondary Data Analyses, cont.
4. It can facilitate comparison with other samples.
5. It may allow inclusion of many more variables and a more diverse sample than otherwise would be feasible.
6. It may allow data from multiple studies to be combined.

Challenges for Secondary Data Analyses, cont.
• The secondary data analyst also faces some unique challenges.
• The easy availability of data for secondary analysis should not obscure the fundamental differences between a secondary and a primary analysis of social science data.

Challenges for Secondary Data Analyses, cont.
• The greatest challenge faced in secondary data analysis results from the researcher’s inability to design data collection methods that are best suited to answer his or her research question.
• The secondary data analyst also cannot test and refine the methods to be used on the basis of preliminary feedback from the population or processes to be studied.
• Nor is it possible for the secondary data analyst to engage in the iterative process of making observations, developing concepts, making more observations, and refining the concepts, which is the hallmark of much qualitative methodology.

Challenges for Secondary Data Analyses, cont.
• If the primary study was not designed to measure adequately a concept that is critical to the secondary analyst’s hypothesis, the study may have to be abandoned until a more adequate source of data can be found.

Challenges for Secondary Data Analyses, cont.
• Data quality is always a concern with secondary data, even when the data are collected by an official government agency.
• Government actions result, at least in part, from political processes that may not have as their first priority the design or maintenance of high-quality data for social scientific analysis.
• The basis for concern is much greater in research across national boundaries, because different data-collection systems and definitions of key variables may have been used.

Challenges for Secondary Data Analyses, cont.
• Any secondary analysis will be improved if the analyst (yourself or the author of the work that you are reviewing) answers several questions before deciding to develop an analysis of secondary data in the first place and then continues to develop these answers as the analysis proceeds:
1. What were the agency’s or researcher’s goals in collecting the data?
2. What data were collected, and what were they intended to measure?

Challenges for Secondary Data Analyses, cont.
3. When was the information collected?
4. What methods were used for data collection?
5. How is the information organized (by date, event, etc.)?
6. What is known about the success of the data-collection effort? How are missing data indicated? What kind of documentation is available? How consistent are the data with data available from other sources?
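
The question of how missing data are indicated can be made concrete with a short sketch. The Python below is only an illustration: the sentinel codes (-9, 98, 99), the variable names, and the records are all made up, and a real dataset's codebook would define its own missing-value codes, often differently for each variable.

```python
from collections import Counter

# Hypothetical sentinel codes; a real codebook defines its own.
MISSING_CODES = {-9, 98, 99}

def recode_missing(rows, missing_codes=MISSING_CODES):
    """Return copies of rows with sentinel codes replaced by None."""
    return [
        {var: (None if value in missing_codes else value)
         for var, value in row.items()}
        for row in rows
    ]

def missing_share(rows):
    """Return the fraction of missing (None) values for each variable."""
    counts = Counter()
    for row in rows:
        for var, value in row.items():
            if value is None:
                counts[var] += 1
    variables = rows[0].keys() if rows else []
    return {var: counts[var] / len(rows) for var in variables}

# Made-up records for illustration only.
records = [
    {"age": 34, "hours_worked": 40},
    {"age": 99, "hours_worked": 98},
    {"age": 51, "hours_worked": -9},
    {"age": 27, "hours_worked": 35},
]
clean = recode_missing(records)
shares = missing_share(clean)
```

Applying one set of codes to every variable is a simplification. A value such as 99 could be a legitimate age in one dataset and a "no answer" code in another, which is one reason the documentation needs to be read before any recoding.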

Challenges for Secondary Data Analyses, cont.
• Answering these questions helps to ensure that the researcher is familiar with the data he or she will analyze and can help to identify any problems with them.
• It is unlikely that you or any secondary data analyst will be able to develop complete answers to all of these questions prior to starting an analysis, but it still is critical to make the attempt to assess what you know and don’t know about data quality before deciding whether to conduct the analysis.

Content Analysis
• We can learn a great deal about popular culture and many other issues through studying the characteristics of messages delivered through the mass media and other sources.

Content Analysis, cont.
• You can think of a content analysis as a “survey” of some documents or other records of prior communication: a survey with fixed-choice responses that produce quantitative data.
• This method was first applied to the study of newspaper and film content and then developed systematically for the analysis of Nazi propaganda broadcasts in World War II.
• Since then, content analysis has been used to study historical documents, records of speeches, and other “voices from the past” as well as media of all sorts (Neuendorf 2002: 31–37).

Content Analysis, cont.
• Content analysis bears some similarities to qualitative data analysis, because it involves coding and categorizing text and identifying relationships among constructs identified in the text.
• However, since it usually is conceived as a quantitative procedure, content analysis overlaps with qualitative data analysis only at the margins.

Stages of Content Analysis
1. Identify a population of documents or other textual sources
2. Determine the units of analysis
3. Select a sample of units from the population
4. Design coding procedures for the variables to be measured
5. Develop appropriate statistical analyses
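
The five stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a standard procedure: the headlines are invented, the keyword-based `CODEBOOK` is a hypothetical coding scheme, and a simple frequency count stands in for the statistical analysis. Real coding schemes are usually more elaborate and are checked for reliability across coders.

```python
import random
from collections import Counter

# Stage 1: a made-up population of documents (headlines).
population = [
    "Council debates crime and policing budget",
    "Local team wins championship game",
    "New study links diet to health outcomes",
    "Police report drop in violent crime",
    "Hospital expands health clinic hours",
    "Championship parade draws record crowd",
]

# Stage 2: the unit of analysis here is the whole headline.
# Stage 3: draw a simple random sample of units (seed fixed for repeatability).
rng = random.Random(0)
sample = rng.sample(population, k=4)

# Stage 4: a hypothetical coding scheme mapping categories to keywords.
CODEBOOK = {
    "crime": {"crime", "police", "policing"},
    "sports": {"team", "game", "championship"},
    "health": {"health", "hospital", "diet"},
}

def code_document(text, codebook=CODEBOOK):
    """Return the sorted list of categories whose keywords appear in text."""
    words = set(text.lower().split())
    return sorted(cat for cat, keywords in codebook.items() if words & keywords)

# Stage 5: tabulate category frequencies across a set of units.
def tabulate(documents):
    """Count how many documents were assigned to each category."""
    counts = Counter()
    for doc in documents:
        counts.update(code_document(doc))
    return counts

sample_counts = tabulate(sample)
population_counts = tabulate(population)
```

Because the coding rules are fixed in advance and applied identically to every unit, the result resembles the fixed-choice responses of a survey, which is what allows the coded content to be analyzed quantitatively.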

Ethical Issues in Secondary Data Analysis and Content Analysis
• Analysis of data collected by others, as well as content analysis of text, does not create the same potential for harm as does the collection of primary data, but neither ethical nor related political considerations can be ignored.
• Because in most cases the secondary researchers did not collect the data, a key ethical obligation is to cite the original, principal investigators, as well as the data source, such as the ICPSR.

Ethical Issues in Secondary Data Analysis and Content Analysis, cont.
• Subject confidentiality is a key concern when original records are analyzed.
• Whenever possible, all information that could identify individuals should be removed from the records to be analyzed so that no link is possible to the identities of living subjects or the living descendants of subjects.
• When you use data that have already been archived, you need to find out what procedures were used to preserve subject confidentiality. The work required to ensure subject confidentiality probably will have been done for you by the data archivist.
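
As a rough illustration of removing identifying information from records, the Python sketch below drops a set of direct identifiers and assigns arbitrary case numbers. The field names in `IDENTIFYING_FIELDS` and the sample records are hypothetical; an actual project would follow the list of identifiers required by its IRB or data archive.

```python
# Hypothetical direct identifiers; real projects follow their IRB's list.
IDENTIFYING_FIELDS = {"name", "address", "phone", "ssn"}

def deidentify(records, identifying_fields=IDENTIFYING_FIELDS):
    """Drop identifying fields and assign arbitrary case numbers."""
    cleaned = []
    for case_number, record in enumerate(records, start=1):
        kept = {key: value for key, value in record.items()
                if key not in identifying_fields}
        kept["case_id"] = case_number
        cleaned.append(kept)
    return cleaned

# Made-up records for illustration only.
raw = [
    {"name": "Respondent A", "phone": "555-0100", "age": 42, "employed": True},
    {"name": "Respondent B", "phone": "555-0101", "age": 67, "employed": False},
]
public = deidentify(raw)
```

Dropping direct identifiers is only a first step. Combinations of the remaining variables can sometimes still point to an individual, so this sketch should not be read as a guarantee of confidentiality.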

Ethical Issues in Secondary Data Analysis and Content Analysis, cont.
• It is not up to you to decide whether there are any issues of concern regarding human subjects when you acquire a dataset for secondary analysis from a responsible source.
• The Institutional Review Board (IRB) for the Protection of Human Subjects at your college or university or other institution has the responsibility to decide whether it needs to review and approve proposals for secondary data analysis.
• The federal regulations are not entirely clear on this point, so the acceptable procedures will vary between institutions based on what their IRBs have decided.

• If medical records are included in the data, then the IRB must approve the use of the data.

Ethical Issues in Secondary Data Analysis and Content Analysis, cont.
• Data quality is always a concern with secondary data, even when the data are collected by an official government agency.
• Researchers who rely on secondary data inevitably make trade-offs between their ability to use a particular dataset and the specific hypotheses they can test.
• If a concept that is critical to a hypothesis was not measured adequately in a secondary data source, the study might have to be abandoned until a more adequate source of data can be found.

Conclusions
• The easy availability for secondary analyses of datasets collected in thousands of social science investigations is one of the most exciting features of social science research in the 21st century.
• You can often find a previously collected dataset that is suitable for testing new hypotheses or exploring new issues of interest.

Conclusions, cont.
• Moreover, the research infrastructure that has developed at ICPSR and other research consortia, both in the United States and internationally, ensures that a great many of these datasets have been carefully checked for quality and archived in a form that allows easy access.
• Many social scientists now review available secondary data before they consider collecting new data with which to investigate a particular research question.