Introduction to Statistics
Chapter 3: Using Statistics to Summarize Data Sets
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc.

Chapter Goals
After completing this chapter, you should be able to:
  • Compute and interpret the mean, median, and mode for a set of data
  • Compute the range, variance, and standard deviation and know what these values mean
  • Compute and explain the coefficient of variation
  • Use the empirical rule to describe the shape of data sets

Chapter Topics
  • Measures of Center and Location
      • Mean, median, mode
  • Other Measures of Location
      • Weighted mean, percentiles, quartiles
  • Measures of Variation
      • Range, interquartile range, variance and standard deviation, coefficient of variation

Summary Measures
Describing Data Numerically
  • Center and Location
      • Mean
      • Median
      • Mode
      • Weighted Mean
  • Other Measures of Location
      • Percentiles
      • Quartiles
  • Variation
      • Range
      • Interquartile Range
      • Variance
      • Standard Deviation
      • Coefficient of Variation

Measures of Center and Location
Overview: Center and Location
  • Mean
  • Median
  • Mode
  • Weighted Mean

Mean (Arithmetic Average)
  • The mean is the arithmetic average of data values
  • Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where n = sample size
  • Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where N = population size

Mean (Arithmetic Average) (continued)
  • The most common measure of central tendency
  • Mean = sum of values divided by the number of values
  • Affected by extreme values (outliers)
[Figure: two dot plots on a 0 to 10 axis, labeled Mean = 3 and Mean = 4]
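
As a rough illustration of how a single extreme value moves the mean, the sketch below uses two hypothetical five-value samples. The values plotted on the slide are not recoverable here, so these data are assumptions chosen only to reproduce the labels Mean = 3 and Mean = 4.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)


# Hypothetical samples (assumed for illustration, not taken from the slides).
without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]  # same data, but the largest value is now 10

print(mean(without_outlier))  # 3.0
print(mean(with_outlier))     # 4.0
```

Changing one value out of five is enough to pull the mean from 3 to 4.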

Median
  • Not affected by extreme values
    [Figure: dot plot on a 0 to 10 axis, labeled Median = 3]
  • In an ordered array, the median is the “middle” number
      • If n or N is odd, the median is the middle number
      • If n or N is even, the median is the average of the two middle numbers
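
The odd/even rule above can be sketched as a small function. The sample values are hypothetical and serve only to exercise both branches.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values


# Hypothetical samples (assumed for illustration).
print(median([1, 2, 3, 4, 5]))      # 3
print(median([1, 2, 3, 4, 10]))     # 3, unchanged by the extreme value
print(median([1, 2, 3, 4, 5, 10]))  # 3.5
```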

Mode
  • A measure of central tendency
  • Value that occurs most often
  • Not affected by extreme values
  • Used for either numerical or categorical data
  • There may be no mode
  • There may be several modes
[Figure: dot plot on a 0 to 14 axis, labeled Mode = 5; dot plot on a 0 to 6 axis, labeled No Mode]
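
A minimal sketch of these properties, using hypothetical data. It assumes one common convention: when every value occurs exactly once, the data set is treated as having no mode. The function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumed convention: all values occur once, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)


# Hypothetical samples (assumed for illustration).
print(modes([1, 3, 5, 5, 5, 9, 14]))  # [5]
print(modes([0, 1, 2, 3, 4, 5, 6]))   # []
print(modes([2, 2, 7, 7, 9]))         # [2, 7]
print(modes(["red", "blue", "red"]))  # ['red']
```

The last call shows the mode applied to categorical data, where a mean or median would not be defined.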

Weighted Mean
  • Used when values are grouped by frequency or relative importance
Example: Sample of 26 Repair Projects

    Days to Complete    Frequency
           5                4
           6               12
           7                8
           8                2

Weighted Mean Days to Complete:
$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4)(5) + (12)(6) + (8)(7) + (2)(8)}{4 + 12 + 8 + 2} = \frac{164}{26} \approx 6.31$ days
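
The repair-project example can be checked with a short sketch. The frequencies act as the weights; the function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


days_to_complete = [5, 6, 7, 8]
frequency = [4, 12, 8, 2]  # number of projects for each duration

print(sum(frequency))                                       # 26
print(round(weighted_mean(days_to_complete, frequency), 2))  # 6.31
```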

Review Example
  • Five houses on a hill by the beach
House Prices:
    $2,000,000
       500,000
       300,000
       100,000
       100,000

Summary Statistics
House Prices:
    $2,000,000
       500,000
       300,000
       100,000
       100,000
    Sum $3,000,000
  • Mean: ($3,000,000 / 5) = $600,000
  • Median: middle value of ranked data = $300,000
  • Mode: most frequent value = $100,000
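
The three measures can be reproduced with Python's standard `statistics` module. The price list is an assumption: five prices of $2,000,000, $500,000, $300,000, $100,000, and $100,000, which is the set consistent with the sum, median, and mode stated on the slide.

```python
import statistics

# Assumed prices, reconstructed so that they match the stated sum, median, and mode.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

print(sum(prices))                # 3000000
print(sum(prices) / len(prices))  # 600000.0
print(statistics.median(prices))  # 300000
print(statistics.mode(prices))    # 100000
```

The single $2,000,000 house lifts the mean to twice the median, which is why the choice of measure matters for data like these.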

Which measure of location is the “best”?
  • Mean is generally used, unless extreme values (outliers) exist
  • Then median is often used, since the median is not sensitive to extreme values
      • Example: Median home prices may be reported for a region, since the median is less sensitive to outliers

Shape of a Distribution
  • Describes how data is distributed
  • Symmetric or skewed

    Left-Skewed:   Mean < Median < Mode   (longer tail extends to left)
    Symmetric:     Mean = Median = Mode
    Right-Skewed:  Mode < Median < Mean   (longer tail extends to right)

Other Location Measures
Other Measures of Location
Percentiles: the pth percentile in a data array:
  • p% are less than or equal to this value
  • (100 - p)% are greater than or equal to this value
  (where 0 ≤ p ≤ 100)
Quartiles:
  • 1st quartile = 25th percentile
  • 2nd quartile = 50th percentile = median
  • 3rd quartile = 75th percentile

Percentiles
  • The pth percentile in an ordered array of n values is the value in the ith position, where
    $i = \frac{p}{100}(n + 1)$
  • Example: The 60th percentile in an ordered array of 19 values is the value in the 12th position:
    $i = \frac{60}{100}(19 + 1) = 12$

Quartiles
  • Quartiles split the ranked data into 4 equal groups (25% of the values in each), separated by Q1, Q2, and Q3
  • Example: Find the first quartile
    Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22   (n = 9)
    Q1 = 25th percentile, so find the $\frac{25}{100}(9 + 1) = 2.5$ position,
    so use the value halfway between the 2nd and 3rd values, so Q1 = 12.5
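
The position rule and the "halfway between" step can be sketched as follows. Two parts go beyond what the slides state and are assumptions: fractional positions other than one half are handled by linear interpolation, and positions outside 1..n are clamped to the smallest or largest value. The function names are illustrative.

```python
def percentile_position(p, n):
    """1-based position of the pth percentile in an ordered array of n values."""
    return p * (n + 1) / 100


def percentile_value(ordered, p):
    """Value at the pth percentile, interpolating between neighbours for fractional positions."""
    n = len(ordered)
    position = percentile_position(p, n)
    # Assumption: positions outside 1..n are clamped to the end values.
    if position <= 1:
        return ordered[0]
    if position >= n:
        return ordered[-1]
    below = int(position)        # 1-based index of the value just below the position
    fraction = position - below  # how far to move toward the next value
    low, high = ordered[below - 1], ordered[below]
    return low + fraction * (high - low)


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]

print(percentile_position(25, len(sample)))  # 2.5
print(percentile_value(sample, 25))          # 12.5
print(percentile_position(60, 19))           # 12.0
```

Other textbooks and software packages use slightly different position rules, so quartiles computed elsewhere may differ a little from these.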

Measures of Variation
Variation
  • Range
  • Interquartile Range
  • Variance
      • Population Variance
      • Sample Variance
  • Standard Deviation
      • Population Standard Deviation
      • Sample Standard Deviation
  • Coefficient of Variation

Variation
  • Measures of variation give information on the spread or variability of the data values.
[Figure: distributions with the same center but different variation]

Range
  • Simplest measure of variation
  • Difference between the largest and the smallest observations:
    $\text{Range} = x_{\text{maximum}} - x_{\text{minimum}}$
Example: [Figure: dot plot on a 0 to 14 axis]
    Range = 14 - 1 = 13

Disadvantages of the Range
  • Ignores the way in which data are distributed
    [Figure: two differently distributed dot plots on a 7 to 12 axis, each with Range = 12 - 7 = 5]
  • Sensitive to outliers
    1, 1, 1, 2, 2, 3, 3, 4, 5      Range = 5 - 1 = 4
    1, 1, 1, 2, 2, 3, 3, 4, 120    Range = 120 - 1 = 119
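
The outlier sensitivity is easy to confirm with the two data sets listed above. The function name `value_range` is illustrative.

```python
def value_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)


typical = [1, 1, 1, 2, 2, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 2, 2, 3, 3, 4, 120]

print(value_range(typical))       # 4
print(value_range(with_outlier))  # 119
```

Eight of the nine values are identical in both lists, yet the range grows from 4 to 119 because it depends only on the two end values.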

Interquartile Range
  • Can eliminate some outlier problems by using the interquartile range
  • Eliminate some high- and low-valued observations and calculate the range from the remaining values
  • Interquartile range = 3rd quartile - 1st quartile

Interquartile Range
Example (25% of the values fall in each of the four intervals):

    X minimum    Q1    Median (Q2)    Q3    X maximum
       12        30        45         57       70

Interquartile range = 57 - 30 = 27
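
The box-plot example gives only the five summary values, so the sketch below first repeats that subtraction and then, as an assumption, computes an interquartile range from raw data by reusing the nine-value ordered array from the quartile example. It relies on `statistics.quantiles` from the Python standard library (Python 3.8+), whose default "exclusive" method returns Q1 = 12.5 for this sample, in agreement with the slides' position rule.

```python
import statistics


def interquartile_range(values):
    """Interquartile range: 3rd quartile minus 1st quartile."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)  # default method="exclusive"
    return q3 - q1


# From the five-number summary in the example: Q3 - Q1.
print(57 - 30)  # 27

# From raw data (the ordered array used in the quartile example).
sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(statistics.quantiles(sample, n=4))  # [12.5, 16.0, 19.5]
print(interquartile_range(sample))        # 7.0
```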

Variance
  • Average of squared deviations of values from the mean
  • Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
  • Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Standard Deviation
  • Most commonly used measure of variation
  • Shows variation about the mean
  • Has the same units as the original data
  • Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
  • Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Calculation Example: Sample Standard Deviation
Sample Data ($x_i$): 10 12 14 15 17 18 18 24
n = 8
Mean = $\bar{x}$ = 16
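
The worked result for this sample is not shown here, so the following sketch computes it from the usual definitions: squared deviations from the mean, divided by n - 1 for a sample or by N for a population. The function names are illustrative.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [10, 12, 14, 15, 17, 18, 18, 24]

print(sum(data) / len(data))                       # 16.0
print(round(sample_variance(data), 4))             # 18.5714
print(round(math.sqrt(sample_variance(data)), 4))  # 4.3095
```

The squared deviations sum to 130, so the sample variance is 130 / 7 and the sample standard deviation is its square root, about 4.31 in the same units as the data.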

Comparing Standard Deviations
[Figure: three dot plots on an 11 to 21 axis]
    Data A: Mean = 15.5, s = 3.338
    Data B: Mean = 15.5, s = 0.9258
    Data C: Mean = 15.5, s = 4.57

Coefficient of Variation
  • Measures relative variation
  • Always in percentage (%)
  • Shows variation relative to mean
  • Is used to compare two or more sets of data measured in different units
    Population: $CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$
    Sample: $CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$

Comparing Coefficients of Variation
  • Stock A:
      • Average price last year = $50
      • Standard deviation = $5
      • CV = (5 / 50) · 100% = 10%
  • Stock B:
      • Average price last year = $100
      • Standard deviation = $5
      • CV = (5 / 100) · 100% = 5%
Both stocks have the same standard deviation, but stock B is less variable relative to its price
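
The stock comparison reduces to one division per stock, sketched below. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


print(coefficient_of_variation(5, 50))   # 10.0  (Stock A)
print(coefficient_of_variation(5, 100))  # 5.0   (Stock B)
```

Because the result is a percentage rather than a dollar amount, it can be compared across data sets with different means or different units.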

The Empirical Rule
  • If the data distribution is bell-shaped, then the interval:
      • $\mu \pm 1\sigma$ contains about 68% of the values in the population or the sample

The Empirical Rule
  • $\mu \pm 2\sigma$ contains about 95% of the values in the population or the sample
  • $\mu \pm 3\sigma$ contains about 99.7% of the values in the population or the sample
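
One way to see where the 68%, 95%, and 99.7% figures come from is to simulate bell-shaped data and count the share of values within 1, 2, and 3 standard deviations of the mean. Everything in this sketch is an assumption made for illustration: the normal distribution with mean 100 and standard deviation 15, the sample size, and the random seed. The theoretical share for a normal distribution is computed with `math.erf` for comparison.

```python
import math
import random


def share_within(values, k):
    """Fraction of values lying within k sample standard deviations of the sample mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return sum(1 for x in values if abs(x - mean) <= k * sd) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(100, 15) for _ in range(50_000)]  # assumed bell-shaped data

for k in (1, 2, 3):
    simulated = share_within(sample, k)
    theoretical = math.erf(k / math.sqrt(2))  # exact share for a normal distribution
    print(k, round(simulated, 3), round(theoretical, 4))
```

The simulated shares should land close to the theoretical 0.6827, 0.9545, and 0.9973, which the empirical rule rounds to 68%, 95%, and 99.7%. The rule is only an approximation for data that are not bell-shaped.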

Chapter Summary
  • Described measures of center and location
      • Mean, median, mode
  • Discussed percentiles and quartiles
  • Described measures of variation
      • Range, interquartile range, variance, standard deviation, coefficient of variation
  • Described the empirical rule