AP Stats BW 9/19 Below is a list of gas mileage ratings for selected passenger cars in miles per gallon. Choose the correct histogram of the data. 53, 43, 89, 41, 85, 86, 91, 92, 95, 94, 86, 102, 114, 30, 123

Section 2.4 Part 3: Chebychev’s Theorem & Grouped Data
SWBAT: Identify and analyze patterns of distributions using shape, center, and spread.

Distributions: Normal vs. Not Normal or Unknown
• Use the EMPIRICAL RULE for normal (symmetric, bell-shaped) distributions
• Use CHEBYCHEV’S THEOREM for ALL distributions

CHEBYCHEV’S THEOREM
The portion of any data set lying within k standard deviations (k > 1) of the mean is at least 1 − 1/k².
• k represents the number of standard deviations from the mean
• k = 2: at least 1 − 1/2² = 3/4 = 75% of the data lie within 2σ of the mean
• k = 3: at least 1 − 1/3² = 8/9 ≈ 88.9% of the data lie within 3σ of the mean
In general, Chebychev’s Theorem gives the minimum percentage of data that fall within the given number of standard deviations of the mean.
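
The percentages on this slide can be checked with a minimal Python sketch of Chebychev’s bound, 1 − 1/k². The function name `chebyshev_min_fraction` is our own choice for illustration, not something from the slides.

```python
def chebyshev_min_fraction(k: float) -> float:
    """Minimum fraction of ANY data set within k standard deviations of the mean."""
    if k <= 1:
        # The theorem only gives useful information for k > 1.
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k**2


for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_min_fraction(k):.1%} of the data")
```

The bound is a guaranteed minimum, so a particular data set may have a much larger share of its values inside the interval.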

CHEBYCHEV’S THEOREM - Example The age distributions for Alaska and Florida are given. Which is which? What conclusions can you reach using Chebychev’s Theorem?

CHEBYCHEV’S THEOREM – Example Solution
Left graph: μ – 2σ = 31.6 – 2(19.5) is negative, so use 0 as the lower bound. μ + 2σ = 31.6 + 2(19.5) = 70.6; thus at least 75% of the population is between 0 and 70.6 years old.
Right graph: μ – 2σ = 39.2 – 2(24.8) is negative. μ + 2σ = 39.2 + 2(24.8) = 88.8; thus at least 75% of the population is between 0 and 88.8 years old, i.e., at least 75% of the population is under the age of 88.8.
Because the right graph shows a larger and older population, we know the right graph is Florida and the left is Alaska.
We would actually get more specific information from the histogram or a relative frequency histogram.
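
The interval arithmetic in this solution can be sketched in Python as follows. The helper `two_sigma_interval` and the choice to clamp the lower bound at 0 (ages cannot be negative) are our own assumptions for illustration; the means and standard deviations are the ones given on the slide.

```python
def two_sigma_interval(mean, sd, lower_limit=0.0):
    """Return (low, high) for mean ± 2 sd, clamping low at lower_limit."""
    low = max(mean - 2 * sd, lower_limit)  # a negative age makes no sense
    high = mean + 2 * sd
    return low, high


# (mean, standard deviation) read from each graph on the slide
graphs = {"left": (31.6, 19.5), "right": (39.2, 24.8)}

for name, (mu, sigma) in graphs.items():
    low, high = two_sigma_interval(mu, sigma)
    print(f"{name} graph: at least 75% of ages lie between {low:.1f} and {high:.1f}")
```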

Standard Deviation for Grouped Data
Earlier we found the sample mean and standard deviation by creating a table to organize the data, then using the formula for sample standard deviation: s = √( ∑(x − x̄)² / (n − 1) )

x     x − x̄           (x − x̄)²
15    15 − 18 = -3    9
16    -2              4
19    1               1
20    2               4
20    2               4
                      ∑ = 22

With n = 5: s = √(22 / 4) ≈ 2.3
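
As a sketch of the same table in Python, assuming the sample formula with n − 1 in the denominator (variable names are ours), with a cross-check against the standard library:

```python
import math
import statistics

data = [15, 16, 19, 20, 20]

mean = sum(data) / len(data)                     # 90 / 5 = 18
squared_devs = [(x - mean) ** 2 for x in data]   # the (x - mean)^2 column
sum_sq = sum(squared_devs)                       # 22
s = math.sqrt(sum_sq / (len(data) - 1))          # sample standard deviation

print(f"mean = {mean}, sum of squares = {sum_sq}, s = {s:.2f}")

# statistics.stdev also divides by n - 1, so the two should agree.
assert math.isclose(s, statistics.stdev(data))
```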

Standard Deviation for Grouped Data, cont’d
If data sets are larger and contain repeated values, you can create a frequency distribution to group the data.
Ex: # of children in 50 households
1, 3, 1, 1, 2, 2, 1, 0, 1, 1, 0, 0, 0, 1, 5, 0, 3, 6, 3, 0, 3, 1, 1, 6, 0, 1, 3, 6, 6, 1, 2, 2, 3, 0, 1, 1, 4, 1, 1, 2, 2, 0, 3, 0, 2, 4

x    f     xf    x − x̄    (x − x̄)²    (x − x̄)²f
0    10    0     -1.8     3.24        32.40
1    19    19    -0.8     0.64        12.16
2    7     14    0.2      0.04        0.28
3    7     21    1.2      1.44        10.08
4    2     8     2.2      4.84        9.68
5    1     5     3.2      10.24       10.24
6    4     24    4.2      17.64       70.56
∑    50    91                         145.4

x̄ = ∑xf / n = 91 / 50 ≈ 1.8
s = √( ∑(x − x̄)²f / (n − 1) ) = √(145.4 / 49) ≈ 1.7
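
A minimal Python sketch of the frequency-table calculation, using the x and f columns from the slide (variable names are ours). The slide rounds the mean to 1.8 before computing deviations; this sketch keeps the unrounded mean 1.82, so its sum of squares comes out near 145.38 rather than 145.4, and s still rounds to 1.7.

```python
import math

x_values = [0, 1, 2, 3, 4, 5, 6]      # number of children
freqs = [10, 19, 7, 7, 2, 1, 4]       # number of households with that many

n = sum(freqs)                                           # total households
sum_xf = sum(x * f for x, f in zip(x_values, freqs))     # the xf column total
mean = sum_xf / n

# Each squared deviation is weighted by how often the value occurs.
sum_sq = sum((x - mean) ** 2 * f for x, f in zip(x_values, freqs))
s = math.sqrt(sum_sq / (n - 1))

print(f"n = {n}, sum xf = {sum_xf}, mean = {mean:.2f}, s = {s:.2f}")
```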

Let’s use a calculator instead!
• Enter the values of x into L1
• Enter the frequencies, f, into L2
• Select STAT
• Select CALC 1: 1-Var Stats
• Enter 2nd L1, 2nd L2

x    0     1     2    3    4    5    6
f    10    19    7    7    2    1    4    ∑ = 50
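
For readers without the calculator, a rough software analogue is sketched below. It assumes Python’s standard `statistics` module as a stand-in and expands the value/frequency lists into one long data list; it is not a description of how the calculator works internally.

```python
import statistics

x_values = [0, 1, 2, 3, 4, 5, 6]      # plays the role of list L1
freqs = [10, 19, 7, 7, 2, 1, 4]       # plays the role of list L2

# Repeat each x value f times, e.g. ten 0s, nineteen 1s, and so on.
expanded = [x for x, f in zip(x_values, freqs) for _ in range(f)]

print("n  =", len(expanded))
print("x̄  =", statistics.mean(expanded))
print("Sx =", round(statistics.stdev(expanded), 4))   # sample standard deviation
```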

Standard Deviation for Grouped Data, cont’d
When a frequency distribution has classes, you can estimate the sample mean and standard deviation using the midpoint of each class.
Table columns: Class | f | x (midpt) | xf | x − x̄ | (x − x̄)² | (x − x̄)²f, with totals ∑f, ∑xf, and ∑(x − x̄)²f

Using midpoints Example
The graph shows the results of a survey of 1000 adults. Find the mean and standard deviation.

Class      x (midpt)   f       xf        x − x̄     (x − x̄)²      (x − x̄)²f
0-99       49.5        380     18,810    -142.5    20,306.25     7,716,375
100-199    149.5       230     34,385    -42.5     1,806.25      415,437.5
200-299    249.5       210     52,395    57.5      3,306.25      694,312.5
300-399    349.5       50      17,475    157.5     24,806.25     1,240,312.5
400-499    449.5       60      26,970    257.5     66,306.25     3,978,375
500+       599.5       70      41,965    407.5     166,056.25    11,623,937.5
∑                      1000    192,000                           25,668,750

x̄ = 192,000 / 1000 = $192 per year
s = √(25,668,750 / 999) ≈ $160.3 per year
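
The midpoint table can be reproduced with a short Python sketch. The midpoints and frequencies are taken from the slide; the variable names are ours, and 599.5 for the open-ended 500+ class is the slide’s own choice.

```python
import math

midpoints = [49.5, 149.5, 249.5, 349.5, 449.5, 599.5]  # one per class
freqs = [380, 230, 210, 50, 60, 70]                    # adults in each class

n = sum(freqs)
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
mean = sum_xf / n                                       # estimated sample mean

# Weighted squared deviations, matching the last column of the table.
sum_sq = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
s = math.sqrt(sum_sq / (n - 1))                         # estimated sample SD

print(f"mean ≈ ${mean:.0f} per year, s ≈ ${s:.1f} per year")
```

Because every value in a class is replaced by its midpoint, both results are estimates rather than exact statistics of the survey data.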

You try…
In the example, we used 599.5 as the midpoint for the class of $500+. How would the sample mean and standard deviation change if you used 650 to represent the class?
Answer: Both increase. The sample mean is about $195.5 per year, and the sample standard deviation is about $169.5 per year.
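
One way to explore this question is to recompute with each candidate midpoint. The sketch below assumes the class midpoints and frequencies from the survey example; the helper `grouped_mean_sd` is a hypothetical name of our own.

```python
import math


def grouped_mean_sd(midpoints, freqs):
    """Estimate the sample mean and standard deviation from class midpoints."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    sum_sq = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
    return mean, math.sqrt(sum_sq / (n - 1))


lower_midpoints = [49.5, 149.5, 249.5, 349.5, 449.5]   # classes 0-99 to 400-499
freqs = [380, 230, 210, 50, 60, 70]                    # last entry is the 500+ class

for top_midpoint in (599.5, 650):
    mean, s = grouped_mean_sd(lower_midpoints + [top_midpoint], freqs)
    print(f"500+ represented by {top_midpoint}: mean ≈ {mean:.1f}, s ≈ {s:.1f}")
```

Moving the representative value of the top class farther from the center raises the mean and spreads the data out, so both statistics increase.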

HOMEWORK: p. 92: 13, 19, 29, 37, 45-51 all