CH 5 LAB CREATING BOX PLOTS AND RUNNING DESCRIPTIVE STATISTICS REPORTS
What Will I Learn in Ch 5 Lab? Displaying Quantitative Data: The Boxplot
• What is a boxplot (AKA Box and Whisker Plot)?
• How do we read a boxplot?
• How do we create a boxplot in GeoGebra?
The Five-Number Summary for a data set is composed of the following numbers:
• Minimum
• Q1
• Median (also called Q2)
• Q3
• Maximum
The Five-Number Summary can be graphically represented using a Boxplot.
Relationship between the boxplot and the 5 number summary
[Figure: a boxplot labeled, from left to right, with Minimum, Q1 (the median of the lower half of the data), Q2 (the median), Q3 (the median of the upper half of the data), and Maximum; the four sections are numbered 1 to 4.]
Quartiles divide the data set into 4 equally sized groups. That is, each group has the same number of data values.
Constructing Boxplots by hand
The data set represents the number of meteorites found in 10 U.S. states: 89, 47, 164, 296, 30, 215, 138, 78, 48, 39
Ascending order: 30, 39, 47, 48, 78, 89, 138, 164, 215, 296
Min = 30, Q1 = 47, MD = 83.5, Q3 = 164, Max = 296
1. Calculate the five-number summary.
2. Draw a horizontal axis that accommodates the max and min. Label the axis (here, Number of Meteorites).
3. Plot the max and min and draw a box with vertical sides through Q1 and Q3. Add a vertical line through the median.
4. Draw a line from the minimum data value to the left side of the box and a line from the maximum data value to the right side of the box.
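Step 1 can be checked with a short Python sketch. It assumes the median-of-halves rule used on these slides (Q1 is the median of the lower half, Q3 the median of the upper half) and, as a further assumption, leaves the middle value out of both halves when the number of values is odd. The function name five_number_summary is made up for this illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]       # lower half of the sorted data
    upper = data[n - half:]   # upper half (middle value skipped when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Meteorite counts for the 10 states, as listed in ascending order on the slide
meteorites = [30, 39, 47, 48, 78, 89, 138, 164, 215, 296]
print(five_number_summary(meteorites))  # (30, 47, 83.5, 164, 296)
```

Other software may use a different quartile rule, so its Q1 and Q3 can differ slightly from the by-hand values.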
Interpreting a boxplot
• Shape: communicates characteristics about the data
  Symmetry: Boxplots that are symmetric have the median in the middle of the box and whiskers of the same length. Boxplots of skewed data will often have one whisker longer than the other. The direction of the skew can typically be judged by comparing the lengths of the two sides. Imagine the median being the dividing line between both sides. The skew is in the direction of the longer side.
  The whiskers: identify the minimum and maximum, which lets you calculate the range quickly.
• Spread: Interquartile Range (IQR)
  The IQR represents the middle 50% of the data. Its width gives the reader a good representation of how much variability (spread) the data has.
  Narrow box (IQR): small amount of spread
  Wide box (IQR): large amount of spread
  Recall that the 5 Number Summary divides our data into quartiles (each quartile contains 25% of the data). You can see the spread of each quartile by comparing their width/narrowness.
• Center
  Quartile 2 is the median.
  The mean cannot be seen in a boxplot.
• Unusual Features
  Skewed boxplots may contain outliers.
  Use the following calculation to determine if there are one or more outliers: any data value below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR) is an outlier.
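The "longer side" rule for skew can be written as a rough Python sketch. The function name skew_direction is hypothetical, and the rule is only the visual rule of thumb described above, not a formal measure of skewness. The example values are the minimum, median, and maximum of the meteorite data from the by-hand example.

```python
def skew_direction(minimum, median, maximum):
    """Rule of thumb: the skew points toward the longer side of the median."""
    left_side = median - minimum    # distance from the minimum to the median
    right_side = maximum - median   # distance from the median to the maximum
    if right_side > left_side:
        return "right"
    if left_side > right_side:
        return "left"
    return "symmetric"

# Meteorite data: min = 30, median = 83.5, max = 296
print(skew_direction(30, 83.5, 296))  # right
```

Here the right side (212.5) is much longer than the left side (53.5), so the boxplot suggests a right skew.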
FACTS ABOUT BOXPLOTS
What a Boxplot does tell you
• Width of the box (IQR) tells us where the middle 50% of the data values are
• If the median splits the box into two unequal parts, the larger (wider) part contains data that is more spread out (variable) than the other portion of the box
• Each quartile box has the same amount of data in it. The width of each quartile is an indicator of spread.
What a Boxplot does not tell you
• Width of the box (IQR) does not tell us how much data is in the box
• Viewing a boxplot will not tell you how many data values there are or the sample size, n
• The mean and standard deviation are not easily determined visually when looking at the graph
• Boxplots do not show bimodal distributions. Only a histogram would show that detail.
Using the Interquartile Range to Identify Outliers
IQR = Q3 - Q1
Let’s run through the process of identifying outliers in a data set
Find Q1, Q2, and Q3 for the data set: 15, 13, 6, 5, 12, 50, 22, 18
1. Sort in ascending order: 5, 6, 12, 13, 15, 18, 22, 50
2. Identify the median, AKA Q2: the two middle values are 13 and 15, so Q2 = 14.
3. Identify Q1 (the median of the lower half of the data set): the lower half is 5, 6, 12, 13, so Q1 = 9.
4. Identify Q3 (the median of the upper half of the data set): the upper half is 15, 18, 22, 50, so Q3 = 20.
Alternatively, use technology to do the work for you!
Let’s Try Creating a Descriptive Statistics Report in GeoGebra
Using Technology: Descriptive statistics in GeoGebra
• Open GeoGebra and from the View menu select the Spreadsheet view.
• Copy/paste data from another spreadsheet or hand-type the values into the spreadsheet.
• Highlight all values you want to graph.
• Select the “One Variable Analysis” tool.
• Verify that the correct data has been selected and click “Analyze”.
Run descriptive statistics. The summary report will look something like this. [Screenshot: summary report] This is where you get your 5 number summary values.
We found out in the previous slide that Q1 = 9 and Q3 = 20.
IQR = Q3 - Q1 = 20 - 9 = 11
Q1 - 1.5(IQR) = 9 - 1.5(11) = -7.5
Q3 + 1.5(IQR) = 20 + 1.5(11) = 36.5
No data value is below -7.5, but 50 is above 36.5. In this case, 50 is an outlier.
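The same check can be sketched in Python. This is a minimal illustration that takes Q1 = 9 and Q3 = 20 as given rather than recomputing them. The function name outlier_fences and the parameter k are made up for this example.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the lower and upper cutoffs Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [15, 13, 6, 5, 12, 50, 22, 18]
low, high = outlier_fences(9, 20)   # Q1 and Q3 from the summary report

# A value is an outlier if it falls outside the two cutoffs
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)  # -7.5 36.5 [50]
```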
What do we do with outliers? What might it mean? Can we throw them out?
How to make a Boxplot in GeoGebra: modified box plot vs. regular box plot
How to create a Boxplot with GeoGebra
• Open GeoGebra and choose the Spreadsheet view from the View menu.
• Copy/paste data from another spreadsheet or hand-type the values into the spreadsheet.
• Highlight all values you want to graph.
• Select the “One Variable Analysis” tool.
• Verify that the correct data has been selected and click “Analyze”.
From the drop-down menu, choose Boxplot. The boxplot in GeoGebra defaults to a modified boxplot that shows each outlier with an X. If there are multiple outliers, there will be multiple X’s on the graph. Remember: not all data sets will have outliers.
To Show an Unmodified Boxplot (with no outliers showing)
If you want to show an unmodified (regular) boxplot, select the arrow button, then uncheck the “show outlier” box.
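The difference between the two plots can be sketched in Python. This assumes the common convention that, in a modified boxplot, each whisker stops at the most extreme data value that is not an outlier. Whether GeoGebra draws its whiskers exactly this way is an assumption here. The function name modified_boxplot_parts is hypothetical.

```python
def modified_boxplot_parts(values, q1, q3, k=1.5):
    """Return (left whisker end, right whisker end, outliers) for a modified boxplot."""
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    # Values inside the fences decide where the whiskers end
    inside = [x for x in values if low_fence <= x <= high_fence]
    # Values outside the fences are drawn as separate marks (X's)
    outliers = sorted(x for x in values if x < low_fence or x > high_fence)
    return min(inside), max(inside), outliers

data = [15, 13, 6, 5, 12, 50, 22, 18]
print(modified_boxplot_parts(data, 9, 20))  # (5, 22, [50])
print(min(data), max(data))                 # regular boxplot whisker ends: 5 50
```

Under this assumption the modified plot ends its right whisker at 22 and marks 50 separately, while the regular plot runs the whisker all the way to 50.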
Let’s Make Comparative Boxplots and Compare the Measures of Variation That We Have Studied So Far
Comparison of the Number of Children Between Two Groups of Women
1) Use GeoGebra to create a boxplot with a descriptive statistics report.
2) Compare the following between the two groups: Range, IQR, Boxplots
Data table (columns: Number of Children, Group 1, Group 2), values in extracted order: 0 1 1 1 3 3 2 4 5 6 7 8 9 10 1 2 2 1