Boxplots and Outliers
Barry Bonds set the major league record by hitting 73 home runs in a single season in 2001. The dotplot below shows the number of home runs that Bonds hit in each of his 21 complete seasons:
[Dotplot: home runs per season]
Bonds’s 73 home run season stands out (in red) from the rest of the distribution. Should this value be classified as an outlier?
Statistics and Probability with Applications, 3rd Edition
Boxplots and Outliers
The interquartile range (IQR) is used as a ruler for identifying outliers.
The 1.5 × IQR Rule: Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile. That is,
Low outliers < Q1 − 1.5 × IQR
High outliers > Q3 + 1.5 × IQR
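The rule can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the data values are hypothetical (they are not the Bonds data), the function names are made up for this example, and it assumes quartiles are found as the medians of the lower and upper halves of the sorted data, leaving out the overall median when the number of values is odd. Other quartile conventions can give slightly different fences.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    left out of both halves. Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])


def find_outliers(values):
    """Return the values flagged by the 1.5 x IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # anything below this is a low outlier
    high_fence = q3 + 1.5 * iqr  # anything above this is a high outlier
    return [x for x in values if x < low_fence or x > high_fence]


# Hypothetical data, chosen only to illustrate the rule.
scores = [12, 15, 16, 18, 19, 20, 21, 22, 24, 25, 48]
print(quartiles(scores))      # (16, 24), so IQR = 8 and the fences are 4 and 36
print(find_outliers(scores))  # [48]
```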
Boxplots and Outliers
Important reasons to identify outliers in a distribution:
1. They might be inaccurate data values.
2. They can indicate a remarkable occurrence.
3. They can heavily influence the values of some summary statistics, like the mean, range, and standard deviation.
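The third reason can be checked numerically. The sketch below uses hypothetical data (not taken from the slides) and compares summary statistics with and without a single extreme value; the median is included only as a contrast, since it moves far less than the mean, range, and standard deviation.

```python
from statistics import mean, median, stdev


def describe(data):
    """Collect a few summary statistics for one list of numbers."""
    return {
        "mean": mean(data),
        "median": median(data),
        "range": max(data) - min(data),
        "stdev": stdev(data),  # sample standard deviation
    }


# Hypothetical data: 48 is far from the rest of the values.
with_outlier = [12, 15, 16, 18, 19, 20, 21, 22, 24, 25, 48]
without_outlier = [x for x in with_outlier if x != 48]

for label, data in [("with 48", with_outlier), ("without 48", without_outlier)]:
    summary = describe(data)
    print(label, {name: round(value, 2) for name, value in summary.items()})
```

Removing the one extreme value cuts the range from 36 to 13, while the median only shifts from 20 to 19.5.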
Identifying Outliers
Boxplots and Outliers
Boxplot (sometimes called a box-and-whisker plot): A boxplot summarizes a distribution by displaying the location of five important values within the distribution, known as its five-number summary.
Five-Number Summary, Boxplot: The five-number summary of a distribution of quantitative data consists of the minimum, the first quartile Q1, the median, the third quartile Q3, and the maximum. A boxplot is a visual representation of the five-number summary.
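As a rough sketch, the five-number summary can be computed as follows. The function name and data are hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data, with the overall median excluded from both halves when the count is odd.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data. Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return data[0], q1, median(data), q3, data[-1]


# Hypothetical data, for illustration only.
scores = [12, 15, 16, 18, 19, 20, 21, 22, 24, 25, 48]
print(five_number_summary(scores))  # (12, 16, 20, 24, 48)
```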
Boxplots and Outliers
How to Make a Boxplot
1. Find the five-number summary for the distribution.
2. Draw and label the axis. Draw a horizontal axis and put the name of the quantitative variable underneath.
3. Scale the axis. Look at the smallest and largest values in the data set. Start the horizontal axis at a number equal to or below the smallest value and place tick marks at equal intervals until you equal or exceed the largest value.
4. Draw a box that spans from the first quartile (Q1) to the third quartile (Q3).
5. Mark the median with a vertical line segment that’s the same height as the box.
6. Identify outliers using the 1.5 × IQR rule.
7. Draw whiskers: lines that extend from the ends of the box to the smallest and largest data values that are not outliers. Mark any outliers with a special symbol such as an asterisk (*).
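The numbers needed for steps 4 through 7 can be gathered before any drawing is done. The sketch below is one possible way to do this; `boxplot_parts` is a hypothetical helper, the data are made up, and quartiles are assumed to be the medians of the lower and upper halves of the sorted data. Plotting libraries may use a different quartile convention, so their boxes can differ slightly from a hand-drawn one.

```python
from statistics import median


def boxplot_parts(values):
    """Return the box ends, median, whisker ends, and outliers for a boxplot."""
    data = sorted(values)
    half = len(data) // 2
    q1, q3 = median(data[:half]), median(data[-half:])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Whiskers stop at the most extreme values that are NOT outliers.
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]

    return {
        "box": (q1, q3),                      # step 4
        "median": median(data),               # step 5
        "outliers": outliers,                 # step 6, drawn as * symbols
        "whiskers": (inside[0], inside[-1]),  # step 7
    }


# Hypothetical data, for illustration only.
scores = [12, 15, 16, 18, 19, 20, 21, 22, 24, 25, 48]
print(boxplot_parts(scores))
```

For this data the upper whisker ends at 25 rather than at the maximum, because 48 lies above the upper fence and is marked separately.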
Creating a Boxplot
Boxplots and Outliers
Boxplots provide a quick summary of the center and variability of a distribution. However, boxplots do not display each individual value in a distribution, and they don’t show gaps, clusters, or peaks.
Boxplots and Outliers
Boxplots are especially effective for comparing the distribution of a quantitative variable in two or more groups.
Things to Compare:
1. Shape
2. Center
3. Variability
4. Outliers
LESSON APP 1.8
Which is best at reducing stress?
If you are a dog lover, having your dog with you may reduce your stress level. Does having a friend with you reduce stress? To examine the effect of pets and friends in stressful situations, researchers recruited 45 women who said they were dog lovers. Fifteen women were assigned at random to each of three groups: to do a stressful task (1) alone, (2) with a good friend present, or (3) with their dogs present. The stressful task was to count backward by 13s or 17s. The woman’s average heart rate during the task was one measure of the effect of stress. The following table shows the data.
1. Identify any outliers in the three groups. Show your work.
2. Make parallel boxplots to compare the heart rates of the women in the three groups.
3. Based on the data, does it appear that the presence of a pet or friend reduces heart rate during a stressful task? Justify your answer.