Exploratory Data Analysis EDA Displaying Quantitative Data with
- Slides: 16
Exploratory Data Analysis (EDA): Displaying Quantitative Data with Graphs
Chapter 1.2: Stemplots and Back-to-Back Stemplots
Quantitative Data Display
When displaying categorical data we use pie charts and/or bar graphs. To display quantitative data we use:
A. Dotplots
B. Stemplots
C. Histograms
D. Box plots
• When analyzing the data we will describe the overall pattern (shape, center, and spread) of the distribution.
A) Dotplot
Small datasets with a small range (max - min) can be easily displayed using a dotplot.
• Draw and label a number line from min to max.
• Place one dot per observation above its value.
• Stack multiple observations evenly.
Describe the distribution using SOCS:
I. Shape: peaks, clusters, tail direction, symmetry or skewness
II. Center: midpoint
III. Spread: range (min to max)
IV. Outliers: values that stand out from the overall distribution
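The dotplot steps above can be sketched in a few lines of Python. This is only an illustration: the function name `dotplot` and the `goals` data are made up for this example and do not come from the slides, and the sketch assumes small non-negative whole-number data.

```python
from collections import Counter

def dotplot(values):
    """Return a text dotplot: one '*' per observation, stacked above its value."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    axis = list(range(lo, hi + 1))        # number line from min to max
    width = len(str(hi)) + 1              # column width for each axis value
    rows = []
    # Build the rows from the tallest stack down to the first dot.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[v] >= level else " ").rjust(width) for v in axis)
        rows.append(row.rstrip())
    # Last row is the labeled number line.
    rows.append("".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)

# Hypothetical data (not from the slides): goals scored in 10 games.
goals = [0, 1, 1, 2, 2, 2, 3, 3, 5, 1]
print(dotplot(goals))
print("range:", max(goals) - min(goals))  # spread as max - min
```

Reading the printed plot top to bottom gives the shape; the isolated dot at 5 is the kind of value to check as a possible outlier.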
S - Shape: Symmetry and Skew
A distribution is roughly symmetric if the right and left sides of the graph are approximately mirror images of each other. A distribution is skewed to the right (right-skewed) if the right side of the graph (containing the half of the observations with larger values) is much longer than the left side. It is skewed to the left (left-skewed) if the left side of the graph is much longer than the right side.
[Figure: three example distributions labeled Symmetric, Skewed-left, Skewed-right]
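As a rough numerical companion to the definition above, one could compare how far each side of the distribution stretches from the median. This is a crude sketch, not a standard test: the helper names `tail_lengths` and `describe_shape` and the cutoff `factor=2.0` are assumptions chosen for illustration, and a single outlier can dominate the result. The `males` list is the male shoe data from the shoe example later in these slides.

```python
from statistics import median

def tail_lengths(values):
    """Return (left, right): distance from the median to the min and to the max."""
    m = median(values)
    return m - min(values), max(values) - m

def describe_shape(values, factor=2.0):
    """Crude shape label; `factor` is an arbitrary cutoff assumed for this sketch."""
    left, right = tail_lengths(values)
    if right > factor * left:
        return "skewed right"
    if left > factor * right:
        return "skewed left"
    return "roughly symmetric"

# Male shoe counts from the shoe example in these slides.
males = [14, 7, 6, 5, 12, 38, 8, 7, 10, 10, 10, 11, 4, 5, 22, 7, 5, 10, 35, 7]
print(tail_lengths(males))             # left and right side lengths
print(describe_shape(males))
print(describe_shape([1, 2, 3, 4, 5]))  # hypothetical symmetric data
```

A graph should still be the main evidence for shape; the numbers only back up what the picture shows.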
Comparing Distributions
In statistics we are mostly interested in comparing two or more groups.
• Which of the two diets works best?
• Which North American university should one attend?
• What improves memory?
• ... and of course: who gets more detentions per year in CDS, girls or boys?
Compare the distribution of household size for the U.K. and South Africa. Don't forget your SOCS: Shape? Center (midpoint)? Spread? Outliers?
TRY: Energy Cost Example (Top vs. Bottom Freezers)
How do the annual energy costs (in dollars) compare for refrigerators with top freezers and refrigerators with bottom freezers? The data below are from the May 2010 issue of Consumer Reports.
Problem: Compare the distributions of energy cost for these two types of refrigerators.
B) Stemplot
How MANY pairs of shoes does a typical teenager have? Random sample of 20 students
In the case of males?
Back-to-Back Stemplots
• When comparing two groups, use back-to-back stemplots.
• If the data are bunched up, split the stems.

Pairs of shoes reported:
Females: 50 26 26 31 57 19 24 22 23 38 13 50 13 34 23 30 49 13 15 51
Males: 14 7 6 5 12 38 8 7 10 10 10 11 4 5 22 7 5 10 35 7

Back-to-back stemplot with split stems:
Females |   | Males
        | 0 | 4
        | 0 | 555677778
    333 | 1 | 0000124
     95 | 1 |
   4332 | 2 | 2
     66 | 2 |
    410 | 3 |
      8 | 3 | 58
        | 4 |
      9 | 4 |
    100 | 5 |
      7 | 5 |
Key: 4|9 represents a student who reported having 49 pairs of shoes.
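The back-to-back stemplot for the shoe data can also be built programmatically. The following is a minimal sketch, assuming non-negative whole-number data with tens as stems and ones as leaves; the function name `back_to_back_stemplot` and its `split` parameter are hypothetical, introduced only for this example.

```python
def back_to_back_stemplot(left, right, split=True):
    """Return the rows of a back-to-back stemplot (tens = stem, ones = leaf)."""
    def leaves(data):
        table = {}
        for x in sorted(data):
            stem, leaf = divmod(x, 10)
            # With split stems, leaves 0-4 go on the first row, 5-9 on the second.
            half = 1 if (split and leaf >= 5) else 0
            table.setdefault((stem, half), []).append(str(leaf))
        return table

    l, r = leaves(left), leaves(right)
    low = min(min(left), min(right)) // 10
    high = max(max(left), max(right)) // 10
    halves = (0, 1) if split else (0,)
    width = max(len(v) for v in l.values())   # widest left-hand row
    rows = []
    for stem in range(low, high + 1):         # keep empty stems: no gaps allowed
        for half in halves:
            # Left-side leaves increase outward from the stem, so reverse them.
            left_leaves = "".join(reversed(l.get((stem, half), [])))
            right_leaves = "".join(r.get((stem, half), []))
            rows.append(f"{left_leaves.rjust(width)} | {stem} | {right_leaves}".rstrip())
    return rows

females = [50, 26, 26, 31, 57, 19, 24, 22, 23, 38, 13, 50, 13, 34, 23, 30, 49, 13, 15, 51]
males = [14, 7, 6, 5, 12, 38, 8, 7, 10, 10, 10, 11, 4, 5, 22, 7, 5, 10, 35, 7]

for row in back_to_back_stemplot(females, males):
    print(row)
print("Key: 4|9 represents a student who reported having 49 pairs of shoes.")
```

Calling the function with `split=False` gives one row per stem, which makes it easy to see why splitting helps when the data are bunched up.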
AP Exam Common Errors
• When describing a distribution, students forget to address all 4 characteristics of SOCS.
• When comparing distributions, students do not explicitly compare the characteristics. Discussing SOCS for each distribution separately WILL NOT earn partial credit. Use phrases like "about the same as" or "is much greater than".
• When making stemplots, students forget the key or the labels.
TRY: Back-to-Back Stemplot (Who's Taller?)
Who is taller, males or females? A sample of 14-year-olds from the United Kingdom was randomly selected using the CensusAtSchool Web site. Here are the heights of the students (in cm):
Male: 154, 157, 187, 163, 167, 159, 162, 176, 177, 151, 175, 174, 165, 183, 180
Female: 160, 169, 152, 167, 164, 163, 160, 163, 169, 157, 158, 153, 161, 165, 159, 168, 153, 166, 158, 166
TRY these ones:
Group Work: MOTIVATING SOCS
Homework!!!
1. pg. 41, #37 - #49 odd
2. Must bring your graphing calculator
3. Read ahead if you can: pages 33-40