Measures of Variation: Range, Variance, Standard Deviation, Coefficient of Variation

  • Slides: 42

Common Measures of Dispersion
• Measures of variation: range, variance, standard deviation, coefficient of variation.
• Measures of variation give information on the spread (variability, dispersion) of the data values.
[Figure: two distributions with the same center but different variation]

Range
• Simplest measure of variation.
• Difference between the largest and the smallest values in a set of data: $\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$
• Example (values plotted on an axis from 0 to 14, smallest value 1, largest value 14): Range = 14 - 1 = 13

Disadvantages of the Range
• Ignores the way in which data are distributed.
  [Figure: two dot plots on an axis from 7 to 12 with differently distributed values; Range = 12 - 7 = 5 for both]
• Sensitive to outliers.
  1, 1, 1, 2, 2, 3, 3, 4, 5: Range = 5 - 1 = 4
  1, 1, 1, 2, 2, 3, 3, 4, 120: Range = 120 - 1 = 119
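
The outlier example on this slide can be checked with a few lines of Python. This is a minimal sketch: the helper name `value_range` is introduced here for illustration only, and the two lists are the ones shown on the slide.

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

# The two data sets from the slide: identical except for the last value.
without_outlier = [1, 1, 1, 2, 2, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 2, 2, 3, 3, 4, 120]

print(value_range(without_outlier))  # 4
print(value_range(with_outlier))     # 119
```

A single extreme value changes the range from 4 to 119 even though the other eight observations are unchanged.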

Interquartile Range
• Can eliminate some outlier problems by using the interquartile range.
• Eliminates some high- and low-valued observations and calculates the range from the remaining values.
• Interquartile range = 3rd quartile - 1st quartile = $Q_3 - Q_1$

Interquartile Range
Example (box plot): minimum = 12, $Q_1$ = 30, median ($Q_2$) = 45, $Q_3$ = 57, maximum = 70
Interquartile range = $Q_3 - Q_1$ = 57 - 30 = 27
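
To see why the interquartile range is less affected by outliers, the sketch below computes it for the two lists 1, 1, 1, 2, 2, 3, 3, 4, 5 and 1, 1, 1, 2, 2, 3, 3, 4, 120 used for the range. Quartile conventions differ between textbooks and software, so the numbers depend on the method chosen; this example assumes the default "exclusive" method of Python's `statistics.quantiles`, and the helper name `iqr` is hypothetical.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: Q3 - Q1 (quartiles via statistics.quantiles)."""
    q1, _median, q3 = quantiles(values, n=4)  # default method="exclusive"
    return q3 - q1

without_outlier = [1, 1, 1, 2, 2, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 2, 2, 3, 3, 4, 120]

# The range differs greatly between the lists, but the IQR is the same.
print(iqr(without_outlier))  # 2.5
print(iqr(with_outlier))     # 2.5
```

Because only the middle of the ordered data enters the calculation, replacing the largest value 5 with 120 leaves the result unchanged.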

Variance
• Average (approximately) of squared deviations of values from the mean.
• Sample variance: $S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
  where $\bar{X}$ = sample mean, $n$ = sample size, $X_i$ = ith value of the variable $X$

Standard Deviation
• It is the most commonly used measure of variation.
• It shows variation about the mean.
• It is the square root of the variance.
• It has the same units as the original data.
• Sample standard deviation: $S = \sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

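Example: Sample Standard Deviation
Sample data ($X_i$): 10, 12, 14, 15, 17, 18, 18, 24
$n = 8$, $\bar{X} = 16$
$S = \sqrt{\dfrac{(10 - 16)^2 + (12 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\dfrac{130}{7}} \approx 4.3095$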
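
The calculation for this example can be traced step by step. The sketch below assumes the eight sample values are 10, 12, 14, 15, 17, 18, 18, 24 (the reading of the slide that is consistent with n = 8 and a mean of 16); the variable names are illustrative.

```python
import math

data = [10, 12, 14, 15, 17, 18, 18, 24]   # assumed reading of the slide data
n = len(data)                              # 8
mean = sum(data) / n                       # 16.0

# Squared deviations from the mean, summed.
sum_sq = sum((x - mean) ** 2 for x in data)

sample_variance = sum_sq / (n - 1)         # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)

print(mean, sum_sq)                 # 16.0 130.0
print(round(sample_variance, 4))    # 18.5714
print(round(sample_sd, 4))          # 4.3095
```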

Comparing Standard Deviations
[Figure: three dot plots on an axis from 11 to 21, all with the same mean but different spread]
• Data A: mean = 15.5, S = 3.338
• Data B: mean = 15.5, S = 0.926
• Data C: mean = 15.5, S = 4.567

Advantages of Variance and Standard Deviation
• Each value in the data set is used in the calculation.
• Values far from the mean are given extra weight, because deviations from the mean are squared.

Measures of Variation: Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard deviation.
• The more the data are concentrated, the smaller the range, variance, and standard deviation.
• If the values are all the same (no variation), all these measures will be zero.
• None of these measures are ever negative.

Coefficient of Variation
• Shows variation relative to the mean.
• Measures relative variation.
• Always expressed as a percentage (%): $CV = \left(\dfrac{S}{\bar{X}}\right) \cdot 100\%$
• Can be used to compare two or more sets of data measured in different units.

Comparing Coefficient of Variation
• Stock A: average price last year = $50, standard deviation = $5
• Stock B: average price last year = $100, standard deviation = $5
• Both stocks have the same standard deviation, but stock B is less variable relative to its price.
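
The comparison of the two stocks can be reproduced numerically. This is a small sketch that assumes the usual definition of the coefficient of variation (standard deviation divided by mean, times 100%); the function name is hypothetical.

```python
def coefficient_of_variation(mean, sd):
    """CV = (standard deviation / mean) * 100, as a percentage."""
    return 100 * sd / mean

cv_a = coefficient_of_variation(mean=50, sd=5)    # Stock A
cv_b = coefficient_of_variation(mean=100, sd=5)   # Stock B

print(f"Stock A: CV = {cv_a:.0f}%")   # Stock A: CV = 10%
print(f"Stock B: CV = {cv_b:.0f}%")   # Stock B: CV = 5%
```

The same $5 standard deviation is 10% of stock A's average price but only 5% of stock B's.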

Z Scores
• The Z-score for an item indicates how far, and in what direction, that item deviates from its mean, expressed in units of standard deviation.
• A measure of distance from the mean (for example, a Z-score of 2.0 means that a value is 2.0 standard deviations from the mean).
• The difference between a value and the mean, divided by the standard deviation: $Z = \dfrac{X - \bar{X}}{S}$
• A Z-score above 3.0 or below -3.0 is considered an outlier.

Z Scores (continued)
Example: If the mean is 14.0 and the standard deviation is 3.0, what is the Z-score for the value 18.5?
$Z = \dfrac{18.5 - 14.0}{3.0} = 1.5$
• The value 18.5 is 1.5 standard deviations above the mean.
• A negative Z-score would mean that a value is less than the mean.

Locating Extreme Outliers: Z-Score
• To compute the Z-score of a data value, subtract the mean and divide by the standard deviation.
• The Z-score is the number of standard deviations a data value is from the mean.
• A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0.
• The larger the absolute value of the Z-score, the farther the data value is from the mean.

Locating Extreme Outliers: Z-Score
• Suppose the mean math SAT score is 490, with a standard deviation of 100.
• Compute the Z-score for a test score of 620: $Z = \dfrac{620 - 490}{100} = 1.3$
• A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.
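
The Z-score rule can be written as two short functions. The sketch below uses the figures from the slides (SAT mean 490 and standard deviation 100; the earlier example with mean 14.0 and standard deviation 3.0); the function names and the `cutoff` parameter are assumptions made for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def is_extreme_outlier(z, cutoff=3.0):
    """Rule of thumb from the slides: |Z| > 3.0 flags an extreme outlier."""
    return abs(z) > cutoff

z_sat = z_score(620, mean=490, sd=100)
print(z_sat, is_extreme_outlier(z_sat))      # 1.3 False

z_small = z_score(18.5, mean=14.0, sd=3.0)   # earlier Z-score example
print(z_small)                               # 1.5
```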

Shape of a Distribution
• Describes how data are distributed.
• Measures of shape: symmetric or skewed.

Using Excel

Numerical Measures for a Population
• Population summary measures are called parameters.
• The population mean is the sum of the values in the population divided by the population size, N: $\mu = \dfrac{\sum_{i=1}^{N} X_i}{N}$
  where $\mu$ = population mean, $N$ = population size, $X_i$ = ith value of the variable $X$

Population Variance
• Average of squared deviations of values from the mean.
• Population variance: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
  where $\mu$ = population mean, $N$ = population size, $X_i$ = ith value of the variable $X$

Population Standard Deviation
• Most commonly used measure of variation.
• Shows variation about the mean.
• Is the square root of the population variance.
• Has the same units as the original data.
• Population standard deviation: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
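
The only computational difference between the population and sample formulas is the divisor (N versus n - 1). The sketch below shows this with Python's `statistics` module, treating the eight values 10, 12, 14, 15, 17, 18, 18, 24 first as a whole population and then as a sample; that choice of data is an assumption made purely for illustration.

```python
from statistics import pvariance, pstdev, variance, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

# Population formulas: divide the sum of squared deviations (130) by N = 8.
print(pvariance(data))            # 16.25
print(round(pstdev(data), 4))     # 4.0311

# Sample formulas: divide the same sum by n - 1 = 7.
print(round(variance(data), 4))   # 18.5714
print(round(stdev(data), 4))      # 4.3095
```

The sample versions are always slightly larger for the same data, and the gap shrinks as the number of observations grows.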

The Empirical Rule
• The empirical rule approximates the variation of data in a bell-shaped distribution.
• Approximately 68% of the data in a bell-shaped distribution is within 1 standard deviation of the mean, or $\mu \pm 1\sigma$.

The Empirical Rule
• Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of the mean, or $\mu \pm 2\sigma$.
• Approximately 99.7% of the data in a bell-shaped distribution lies within three standard deviations of the mean, or $\mu \pm 3\sigma$.

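Using the Empirical Rule
• Suppose that the variable Math SAT scores is bell-shaped with a mean of 500 and a standard deviation of 90. Then:
• About 68% of scores lie between 410 and 590 ($500 \pm 90$).
• About 95% of scores lie between 320 and 680 ($500 \pm 180$).
• About 99.7% of scores lie between 230 and 770 ($500 \pm 270$).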
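
The intervals implied by the empirical rule for this SAT example can be computed directly. The sketch below also compares the rule's rounded percentages with the exact shares for a normal distribution, using `statistics.NormalDist`; treating the scores as exactly normal is an assumption made only for that comparison.

```python
from statistics import NormalDist

mean, sd = 500, 90             # Math SAT example from the slide
scores = NormalDist(mean, sd)  # assumption: exactly normal, for comparison only

intervals = {}
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    share = scores.cdf(high) - scores.cdf(low)   # exact normal probability
    intervals[k] = (low, high, share)
    print(f"mean +/- {k} sd: {low} to {high}, normal share = {share:.2%}")
```

The exact normal shares (about 68.27%, 95.45%, and 99.73%) are what the empirical rule rounds to 68%, 95%, and 99.7%.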

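Chebyshev Rule (Chebyshev's theorem)
• Regardless of how the data are distributed, at least $(1 - 1/k^2) \times 100\%$ of the values will fall within k standard deviations of the mean (for k > 1).
• Example: for k = 2, at least $(1 - 1/2^2) \times 100\% = 75\%$ of the values fall within $\mu \pm 2\sigma$.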
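
The Chebyshev bound is easy to tabulate, and it can be checked against data that are far from bell-shaped. The sketch below is illustrative only: the function name is hypothetical, and the skewed list 1, 1, 1, 2, 2, 3, 3, 4, 120 is borrowed from the range example and treated as a complete population.

```python
from statistics import mean, pstdev

def chebyshev_min_percent(k):
    """Lower bound (in %) on the share of values within k sd of the mean, k > 1."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100

print(chebyshev_min_percent(2))            # 75.0
print(round(chebyshev_min_percent(3), 2))  # 88.89

# Check on a strongly skewed data set (the outlier list from the range slide).
data = [1, 1, 1, 2, 2, 3, 3, 4, 120]
m, s = mean(data), pstdev(data)
within_2sd = sum(abs(x - m) <= 2 * s for x in data) / len(data)
print(within_2sd >= 0.75)                  # True
```

Here 8 of the 9 values lie within two standard deviations of the mean, which satisfies the 75% minimum even though the data are heavily skewed.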