Chapter 8 Elementary Quantitative Data Analysis
Learning Objectives
1. List the options for entering data for quantitative analysis.
2. Identify the types of graphs and statistics that are appropriate for analysis of variables at each level of measurement.
3. List the guidelines for constructing frequency distributions.
Chambliss, Making Sense of the Social World, 6e. © SAGE Publishing, 2020
Learning Objectives
4. Discuss the advantages and disadvantages of using each of the three measures of central tendency.
5. Define the concept of skewness, and explain how it can influence measures of central tendency.
6. Explain how to percentage a cross-tabulation table and how cross-tabulation can be used.
Learning Objectives
7. Discuss the reasons for conducting an elaboration analysis.
8. Know how to obtain secondary data.
9. Understand the concept of “Big Data” and the concerns in analyzing it.
10. Be aware of ethical guidelines for statistical analyses.
Why Do Statistics?
• Quantitative data analysis
• Statistic
  – Descriptive statistics
  – Inferential statistics
Why Do Statistics?
Case Study: The Likelihood of Voting
• Draws data on voting from the 2016 General Social Survey.
• Prior research supports the hypothesis that the likelihood of voting increases with social status.
How to Prepare Data for Analysis
• Secondary data analysis
• Data cleaning
  – Most research organizations use a database management program.
What Are the Options for Displaying Distributions?
• To describe the shape of a distribution, consider:
  – Central tendency
  – Variability
  – Skewness
    • Can be positive or negative
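A rough numerical check on skewness can make the idea concrete. The sketch below is only an illustration: the income values are hypothetical, and the moment-based coefficient is one common definition of skewness, not necessarily the one used in the chapter.

```python
from statistics import mean, median

def skewness(values):
    """Moment coefficient of skewness (population form): m3 / m2**1.5."""
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical incomes in $1,000s: a few very high values create a long right tail.
incomes = [20, 22, 25, 27, 30, 32, 35, 40, 90, 150]
# A perfectly symmetric set of values for comparison.
symmetric = [1, 2, 3, 4, 5]

print("Skewness of incomes:  ", round(skewness(incomes), 2))   # positive value
print("Skewness of symmetric:", skewness(symmetric))           # zero
print("Mean vs. median of incomes:", mean(incomes), median(incomes))
```

A positive coefficient indicates a tail toward high values; in that case the mean typically sits above the median.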
What Are the Options for Displaying Distributions?
Graphs
• Types of graphs:
  – Bar chart (good for nominal variables)
  – Histogram
  – Frequency polygon
• Graphs can be drawn to be misleading.
What Are the Options for Displaying Distributions?
Frequency Distribution
• Percentage
• Base number (N)
• When presenting the distribution of a variable with many values, the values must be grouped.
  – Categories must be logically defensible.
  – Categories must be mutually exclusive and exhaustive.
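The grouping guidelines can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the ages are hypothetical (not GSS data), the category boundaries are one defensible choice among many, and the helper name `categorize` is made up for this example.

```python
from collections import Counter

# Hypothetical ages of 20 adult respondents (whole years).
ages = [18, 23, 25, 31, 34, 38, 42, 47, 51, 55,
        58, 63, 67, 72, 80, 19, 29, 44, 36, 61]

# (label, low, high) with inclusive bounds. No two categories overlap
# (mutually exclusive) and together they cover every adult age (exhaustive).
categories = [
    ("18-29", 18, 29),
    ("30-44", 30, 44),
    ("45-64", 45, 64),
    ("65+", 65, float("inf")),
]

def categorize(age):
    """Return the single category an age belongs to, or fail loudly."""
    matches = [label for label, low, high in categories if low <= age <= high]
    if len(matches) != 1:
        raise ValueError(f"age {age} falls in {len(matches)} categories")
    return matches[0]

counts = Counter(categorize(a) for a in ages)
n = len(ages)  # base number (N)

# Each row: category label, frequency, percentage of N.
freq_table = [(label, counts[label], 100 * counts[label] / n)
              for label, _, _ in categories]

for label, count, pct in freq_table:
    print(f"{label:<6}{count:>4}{pct:>7.1f}%")
print(f"{'Total':<6}{n:>4}{100.0:>7.1f}%")
```

Reporting the base number alongside the percentages lets a reader judge how many cases each percentage rests on.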
What Are the Options for Summarizing Distributions?
• Summary statistics...
  – Describe features of a distribution
  – Facilitate comparison among distributions
• Using just one number to represent a distribution loses information about the distribution’s shape.
What Are the Options for Summarizing Distributions?
Measures of Central Tendency
• Mode
  – Also called the probability average
  – Can give a misleading impression of central tendency.
    • Bimodal
  – Only measure of central tendency that can be used with nominal variables.
What Are the Options for Summarizing Distributions?
Measures of Central Tendency
• Median
  – If it falls between two cases (an even number of cases), the median is the average of the two middle values.
  – Not appropriate for nominal variables.
What Are the Options for Summarizing Distributions?
Measures of Central Tendency
• Mean
  – Only makes sense if the values of the cases can be treated as quantities.
  – To use it with an ordinal measure, must assume the measure can be treated as interval level.
What Are the Options for Summarizing Distributions?
Measures of Central Tendency
• Median or Mean?
  – Mean is pulled in the direction of exceptionally high or low values.
  – If the two measures have markedly different values, the median is usually preferred.
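The three measures of central tendency can be compared with Python's standard `statistics` module. The sketch below uses hypothetical values throughout; the variable names are illustrative only.

```python
from statistics import mean, median, multimode

# Hypothetical years of education for 8 respondents (even number of cases).
education = [8, 12, 12, 12, 14, 16, 16, 18]

# Hypothetical incomes in $1,000s, with one exceptionally high value.
incomes = [28, 30, 32, 35, 38, 40, 400]

# Hypothetical nominal variable: only the mode is meaningful here.
religion = ["Protestant", "Catholic", "None", "Protestant",
            "Jewish", "None", "Protestant"]

# Hypothetical bimodal distribution: two values tie for most frequent.
bimodal = [2, 2, 3, 7, 7]

print("Mode(s) of education:", multimode(education))
print("Median of education: ", median(education))  # average of the two middle values
print("Mode of religion:    ", multimode(religion))
print("Modes of bimodal:    ", multimode(bimodal))

# The single outlier pulls the mean far above the median.
print("Mean income:  ", round(mean(incomes), 1))
print("Median income:", median(incomes))
```

With the outlier present, the mean income is more than twice the median, which is the kind of gap that usually argues for reporting the median.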
What Are the Options for Summarizing Distributions?
Measures of Variation
• Range
  – Can be altered drastically by outliers
• Interquartile range
  – Quartile
What Are the Options for Summarizing Distributions?
Measures of Variation
• Variance
  – Mainly used for computing standard deviation
• Standard deviation
  – Preferred measure of variability
  – Best for normally distributed variables
  – Tells you the range in which most cases will fall
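The four measures of variation can be computed side by side. This is a sketch under stated assumptions: the scores are hypothetical, the population formulas (`pvariance`, `pstdev`) are used for simplicity, and `quantiles` uses the module's default quartile method, which can differ slightly from a textbook's hand calculation.

```python
from statistics import mean, pstdev, pvariance, quantiles

# Hypothetical scores for 8 cases.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)

# Quartiles split the ordered cases into four equal-sized groups;
# the interquartile range is the distance between the first and third.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

variance = pvariance(scores)  # average squared deviation from the mean
sd = pstdev(scores)           # square root of the variance

# Cases falling within one standard deviation of the mean.
m = mean(scores)
within_one_sd = [x for x in scores if m - sd <= x <= m + sd]

print("Range:", value_range)
print("Interquartile range:", iqr)
print("Variance:", variance)
print("Standard deviation:", sd)
print("Cases within one SD of the mean:", len(within_one_sd), "of", len(scores))

# Adding one outlier changes the range far more than the interquartile range.
with_outlier = scores + [90]
o1, _, o3 = quantiles(with_outlier, n=4)
range_change = (max(with_outlier) - min(with_outlier)) - value_range
iqr_change = (o3 - o1) - iqr
print("Change in range:", range_change, "| change in IQR:", iqr_change)
```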
How Can We Tell Whether Two Variables Are Related?
• Cross-tabulation (crosstab)
  – Presented first with frequencies and then with percentages
  – Termed a bivariate distribution
How Can We Tell Whether Two Variables Are Related?
Reading the Table
• Must calculate percentages within levels of the independent variable.
• Measure of association
  – Gamma
• Statistical significance
  – Chi-square
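Percentaging a crosstab within levels of the independent variable, and computing a chi-square statistic for it, can be sketched as follows. The counts are hypothetical (they are not GSS results), and the function names `percentages_within` and `chi_square` are made up for this illustration.

```python
# Hypothetical counts. Outer keys: categories of the independent variable
# (income). Inner keys: categories of the dependent variable (voting).
counts = {
    "Low income":  {"Voted": 120, "Did not vote": 80},
    "High income": {"Voted": 240, "Did not vote": 60},
}

def percentages_within(table):
    """Percentage each cell on the total for its independent-variable category."""
    result = {}
    for category, outcomes in table.items():
        base = sum(outcomes.values())  # base number (N) for this category
        result[category] = {k: 100 * v / base for k, v in outcomes.items()}
    return result

def chi_square(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    categories = list(table)
    outcomes = list(table[categories[0]])
    grand_total = sum(sum(row.values()) for row in table.values())
    category_totals = {c: sum(table[c].values()) for c in categories}
    outcome_totals = {o: sum(table[c][o] for c in categories) for o in outcomes}
    statistic = 0.0
    for c in categories:
        for o in outcomes:
            # Expected count if the two variables were unrelated.
            expected = category_totals[c] * outcome_totals[o] / grand_total
            statistic += (table[c][o] - expected) ** 2 / expected
    return statistic

pct = percentages_within(counts)
for category, outcomes in pct.items():
    print(category, {k: f"{v:.1f}%" for k, v in outcomes.items()})

chi2 = chi_square(counts)
print("Chi-square:", round(chi2, 2))
```

To read the table, compare the percentage voting across income categories: here it rises from 60% to 80%. For a 2 x 2 table (1 degree of freedom), a chi-square value above 3.841 is statistically significant at the .05 level.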
How Can We Tell Whether Two Variables Are Related?
Controlling for a Third Variable
• Extraneous variable
• Elaboration analysis
  – Represented in a trivariate table
  – Check for spurious relationships
• More sophisticated analyses: regression, correlation analysis
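An elaboration analysis can be sketched by percentaging the same relationship twice: once overall and once within each category of the third variable. The counts below are hypothetical and deliberately constructed so that the income/voting relationship disappears when education is controlled; real data rarely produce so clean a pattern.

```python
# Hypothetical counts keyed by (education, income); not real survey results.
counts = {
    ("No college", "Low income"):  {"Voted": 80,  "Did not vote": 80},
    ("No college", "High income"): {"Voted": 20,  "Did not vote": 20},
    ("College",    "Low income"):  {"Voted": 32,  "Did not vote": 8},
    ("College",    "High income"): {"Voted": 128, "Did not vote": 32},
}

incomes = ["Low income", "High income"]
educations = ["No college", "College"]

def percent_voted(cells):
    """Percentage voting across one or more cells of the table."""
    voted = sum(cell["Voted"] for cell in cells)
    total = sum(sum(cell.values()) for cell in cells)
    return 100 * voted / total

# Bivariate table: ignore education and percentage within income categories.
bivariate = {
    inc: percent_voted([cell for (_, i), cell in counts.items() if i == inc])
    for inc in incomes
}

# Trivariate table: repeat the comparison within each education category.
trivariate = {
    ed: {inc: percent_voted([counts[(ed, inc)]]) for inc in incomes}
    for ed in educations
}

print("Overall:", bivariate)
for ed, row in trivariate.items():
    print(f"{ed}:", row)
```

Overall, high-income respondents appear more likely to vote (74% versus 56%), but within each education category the percentages are identical. In this constructed example the original relationship is spurious, produced by education's link to both income and voting.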
Secondary Data Analysis
• Secondary data analysis
• Popular because...
  – Groundwork has already been done
  – Available data sets are usually large
  – Can supplement with additional data as needed
Secondary Data Analysis
Inter-university Consortium for Political and Social Research (ICPSR)
• Hosted by the University of Michigan
• ICPSR is the premier source of secondary social science data
• More than 7,990 studies from 130 countries
Secondary Data Analysis
U.S. Census Bureau
• Census conducted every 10 years since 1790.
• Contains questions about...
  – Household composition
  – Ethnicity
  – Income
Secondary Data Analysis
Bureau of Labor Statistics (BLS)
• Analyzes data on...
  – Employment
  – Earnings
  – Prices
  – Living conditions
  – Productivity and technology
  – Occupational safety and health
Secondary Data Analysis
Human Relations Area Files
• Hosted by Yale University
• Information on more than 400 different groups.
Secondary Data Analysis
• Advantages of secondary data analysis:
  – Saves time and money
  – Avoids data collection problems
  – Facilitates comparison with other samples
  – Allows inclusion of more variables
  – Allows combination of data from multiple studies
Secondary Data Analysis
• Disadvantages of secondary data analysis:
  – Cannot design methods best suited to answer the research question
  – Cannot test and refine methods based on preliminary feedback
  – Data quality is always a concern
  – Data collection systems differ across national boundaries
Big Data
• Big data
  – Provides a new method for investigating the social world.
  – Google’s Ngrams
• Sources of big data are increasing rapidly.
Big Data, Big Ethics Issues
• Guidelines for reporting findings honestly:
  – Formulate hypotheses in advance of data collection.
  – Use grouping procedures that do not distort the distribution’s basic shape.
  – Be modest about what the analysis can show, and report its limitations.
• Subject confidentiality is a major ethical concern.
Conclusion
• Statistics are extremely useful for social scientists
  – Must be used appropriately
  – How they are interpreted and reported is important in determining usefulness