MEASURES OF VARIABILITY
• Variance
  – Population variance
  – Sample variance
• Standard Deviation
  – Population standard deviation
  – Sample standard deviation
• Coefficient of Variation (CV)
  – Sample CV
  – Population CV
POPULATION VARIANCE
• The population variance is the mean squared deviation from the population mean:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
• Where $\sigma^2$ stands for the population variance
• $\mu$ is the population mean
• $N$ is the total number of values in the population
• $x_i$ is the value of the i-th observation
• $\sum$ represents a summation
SAMPLE VARIANCE
• The sample variance is defined as follows:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
• Where $s^2$ stands for the sample variance
• $\bar{x}$ is the sample mean
• $n$ is the total number of values in the sample
• $x_i$ is the value of the i-th observation
• $\sum$ represents a summation
SAMPLE VARIANCE
• A sample of monthly advertising expenses (in $000) is taken. The data for five months are as follows: 2.5, 1.3, 1.4, 1.0, and 2.0. Compute the sample variance.
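The advertising exercise can be checked with a few lines of Python. This is a minimal sketch that applies the n - 1 definition directly; the function name `sample_variance` is chosen here for illustration and is not part of the slides.

```python
def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # monthly advertising, in $000

s2_advertising = sample_variance(advertising)
print(f"sample mean     = {sum(advertising) / len(advertising):.2f}")  # 1.64
print(f"sample variance = {s2_advertising:.3f}")                       # 0.363
```

The squared deviations sum to 1.452, and dividing by n - 1 = 4 gives 0.363.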
SAMPLE VARIANCE
• Notice that the sample variance is defined as the sum of the squared deviations divided by n - 1.
• The sample variance is computed to estimate the population variance.
• An unbiased estimate of the population variance is obtained by defining the sample variance as the sum of the squared deviations divided by n - 1 rather than by n.
• Defining the sample variance as the mean squared deviation from the sample mean (that is, dividing by n) tends to underestimate the population variance.
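The effect of dividing by n - 1 rather than n can be seen on a toy case. The sketch below assumes a hypothetical population of four values and enumerates every equally likely sample of size 2 drawn with replacement; neither the population nor the sample size comes from the slides.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


population = [1, 2, 3, 4]  # hypothetical population, for illustration only
n = 2                      # sample size

pop_var = sum_sq_dev(population) / len(population)

# All 16 ordered samples of size 2 drawn with replacement.
samples = list(product(population, repeat=n))

avg_div_n_minus_1 = mean([sum_sq_dev(s) / (n - 1) for s in samples])
avg_div_n = mean([sum_sq_dev(s) / n for s in samples])

print(pop_var)            # 1.25
print(avg_div_n_minus_1)  # 1.25  (matches the population variance)
print(avg_div_n)          # 0.625 (underestimates it)
```

Averaged over all possible samples, the n - 1 version recovers the population variance exactly, while the n version falls short by the factor (n - 1)/n.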
SAMPLE VARIANCE
• A shortcut formula for the sample variance:
$$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
• Where $s^2$ is the sample variance
• $n$ is the total number of values in the sample
• $x_i$ is the value of the i-th observation
• $\sum$ represents a summation
SAMPLE VARIANCE
• A sample of monthly sales (in 000 units) is taken. The data for five months are as follows: 264, 116, 165, 101, and 209. Compute the sample variance using the shortcut formula.
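The sales exercise can be worked the same way. This sketch assumes the usual form of the shortcut, s² = [Σx² - (Σx)²/n] / (n - 1); the function name is illustrative.

```python
def sample_variance_shortcut(values):
    """Shortcut form: only the sum and the sum of squares are needed."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(x * x for x in values)
    return (total_of_squares - total ** 2 / n) / (n - 1)


sales = [264, 116, 165, 101, 209]  # monthly sales, in 000 units

s2_sales = sample_variance_shortcut(sales)
print(sum(sales))                   # 855
print(sum(x * x for x in sales))    # 164259
print(s2_sales)                     # 4513.5
```

Here (855)²/5 = 146205, so the bracketed term is 164259 - 146205 = 18054, and 18054/4 = 4513.5.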
SAMPLE VARIANCE
• The shortcut formula for the sample variance:
$$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
• If the sum of the measurements has already been computed, this formula is a shortcut because you need only compute the sum of the squares, $\sum x_i^2$.
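For readers who want to see why the shortcut agrees with the definition, the following derivation is a sketch that uses only $\bar{x} = \frac{1}{n}\sum x_i$; it is added here for reference and is not taken from the slides.

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{2\left(\sum_{i=1}^{n} x_i\right)^2}{n}
     + \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \\
  &= \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
\end{aligned}
```

Dividing both sides by n - 1 gives the shortcut formula for $s^2$.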
POPULATION/SAMPLE STANDARD DEVIATION
• The standard deviation is the positive square root of the variance:
  – Population standard deviation: $\sigma = \sqrt{\sigma^2}$
  – Sample standard deviation: $s = \sqrt{s^2}$
• Compute the standard deviations of advertising and sales.
POPULATION/SAMPLE STANDARD DEVIATION
• Compute the sample standard deviation of advertising data: 2.5, 1.3, 1.4, 1.0, and 2.0
• Compute the sample standard deviation of sales data: 264, 116, 165, 101, and 209
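Both standard deviations follow from the sample variances by taking the positive square root. A minimal sketch, with `sample_std` as an illustrative helper name:

```python
import math


def sample_std(values):
    """Positive square root of the sample variance (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # in $000
sales = [264, 116, 165, 101, 209]        # in 000 units

s_advertising = sample_std(advertising)
s_sales = sample_std(sales)
print(f"advertising: s = {s_advertising:.4f}")  # 0.6025
print(f"sales:       s = {s_sales:.2f}")        # 67.18
```

Each result carries the units of its own data, so the two numbers cannot be compared directly.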
POPULATION/SAMPLE CV
• The coefficient of variation is the standard deviation divided by the mean:
  – Population coefficient of variation: $CV = \frac{\sigma}{\mu}$
  – Sample coefficient of variation: $cv = \frac{s}{\bar{x}}$
POPULATION/SAMPLE CV
• Compute the sample coefficient of variation of advertising data: 2.5, 1.3, 1.4, 1.0, and 2.0
• Compute the sample coefficient of variation of sales data: 264, 116, 165, 101, and 209
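The two coefficients of variation can be computed as the sample standard deviation divided by the sample mean. The sketch below uses an illustrative helper, `sample_cv`, which is not named in the slides.

```python
import math


def sample_cv(values):
    """Sample standard deviation divided by the sample mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return std / mean


advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # in $000
sales = [264, 116, 165, 101, 209]        # in 000 units

cv_advertising = sample_cv(advertising)
cv_sales = sample_cv(sales)
print(f"advertising: cv = {cv_advertising:.3f}")  # 0.367
print(f"sales:       cv = {cv_sales:.3f}")        # 0.393
```

Because the units cancel, the two values are directly comparable: relative to their means, the two series vary by a similar amount.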
MEASURES OF ASSOCIATION
• A scatter diagram provides a graphical description of whether a relationship is positive or negative, and linear or non-linear.
• Numerical descriptions of the relationship are obtained from:
  – Covariance
    • Population covariance
    • Sample covariance
  – Coefficient of correlation
    • Population coefficient of correlation
    • Sample coefficient of correlation
MEASURES OF ASSOCIATION: EXAMPLE
• A sample of monthly advertising and sales data is collected and shown below:

Month   Sales (000 units)   Advertising ($000)
1       264                 2.5
2       116                 1.3
3       165                 1.4
4       101                 1.0
5       209                 2.0

• What is the relationship between sales and advertising? Is it linear or non-linear, positive or negative?
POPULATION COVARIANCE
• The population covariance is the mean of the products of deviations from the population means:
$$COV(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
• Where $COV(X, Y)$ is the population covariance
• $\mu_x$, $\mu_y$ are the population means of X and Y respectively
• $N$ is the total number of values in the population
• $x_i$, $y_i$ are the values of the i-th observations of X and Y respectively
• $\sum$ represents a summation
SAMPLE COVARIANCE
• The sample covariance is the sum of the products of deviations from the sample means, divided by n - 1:
$$cov(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
• Where $cov(X, Y)$ is the sample covariance
• $\bar{x}$, $\bar{y}$ are the sample means of X and Y respectively
• $n$ is the total number of values in the sample
• $x_i$, $y_i$ are the values of the i-th observations of X and Y respectively
• $\sum$ represents a summation
SAMPLE COVARIANCE
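As an illustration, the sample covariance of the monthly sales and advertising data can be computed as follows. This is a sketch that assumes the n - 1 divisor, matching the sample variance; the function name is chosen here for illustration.

```python
def sample_covariance(xs, ys):
    """Sum of products of deviations from the sample means, over n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


sales = [264, 116, 165, 101, 209]        # in 000 units
advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # in $000

cov_sales_adv = sample_covariance(sales, advertising)
print(f"cov = {cov_sales_adv:.2f}")  # 39.65
```

The products of deviations sum to 158.6, and 158.6/4 = 39.65. The positive sign says that months with above-average advertising tend to have above-average sales.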
POPULATION/SAMPLE COVARIANCE
• If two variables increase/decrease together, covariance is a large positive number and the relationship is called positive.
• If the relationship is such that when one variable increases, the other decreases and vice versa, then covariance is a large negative number and the relationship is called negative.
• If two variables are unrelated, the covariance may be a small number.
• How large is large? How small is small?
POPULATION/SAMPLE COVARIANCE
• How large is large? How small is small? A drawback of covariance is that its magnitude depends on the units of X and Y, so it is usually difficult to give any guideline for how large a covariance indicates a strong relationship or how small a covariance indicates no relationship.
• The coefficient of correlation can overcome this drawback to a certain extent.
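One way to see why no fixed threshold for "large" exists is to change the units of one variable. The sketch below re-expresses advertising in dollars instead of $000; the rescaling is an illustration added here, not part of the slides.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


sales = [264, 116, 165, 101, 209]                  # in 000 units
advertising_thousands = [2.5, 1.3, 1.4, 1.0, 2.0]  # in $000
advertising_dollars = [a * 1000 for a in advertising_thousands]  # same data, in $

cov_thousands = sample_covariance(sales, advertising_thousands)
cov_dollars = sample_covariance(sales, advertising_dollars)
print(f"{cov_thousands:.2f}")  # 39.65
print(f"{cov_dollars:.2f}")    # 39650.00
```

The relationship between the two variables is unchanged, yet the covariance is 1000 times larger.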
POPULATION COEFFICIENT OF CORRELATION
• The population coefficient of correlation is the population covariance divided by the population standard deviations of X and Y:
$$\rho = \frac{COV(X, Y)}{\sigma_x \sigma_y}$$
• Where $\rho$ is the population coefficient of correlation
• $COV(X, Y)$ is the population covariance
• $\sigma_x$, $\sigma_y$ are the population standard deviations of X and Y respectively
SAMPLE COEFFICIENT OF CORRELATION
• The sample coefficient of correlation is the sample covariance divided by the sample standard deviations of X and Y:
$$r = \frac{cov(X, Y)}{s_x s_y}$$
• Where $r$ is the sample coefficient of correlation
• $cov(X, Y)$ is the sample covariance
• $s_x$, $s_y$ are the sample standard deviations of X and Y respectively
SAMPLE COEFFICIENT OF CORRELATION
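As an illustration, the sample coefficient of correlation for the monthly sales and advertising data can be computed as follows. This is a sketch that assumes n - 1 divisors throughout; the helper names are illustrative.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the two sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance of X
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


sales = [264, 116, 165, 101, 209]        # in 000 units
advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # in $000

r_sales_adv = sample_correlation(sales, advertising)
print(f"r = {r_sales_adv:.4f}")  # 0.9796
```

With cov = 39.65, s_x ≈ 67.18 and s_y ≈ 0.6025, r ≈ 39.65 / 40.48 ≈ 0.98, which points to a strong positive linear relationship in this sample.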
POPULATION/SAMPLE COEFFICIENT OF CORRELATION
• The coefficient of correlation is always between -1 and +1.
  – Values near -1 or +1 show a strong linear relationship
  – Values near 0 show no linear relationship
  – Values near +1 show a strong positive linear relationship
  – Values near -1 show a strong negative linear relationship
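These ranges can be illustrated with small made-up data sets. The values below are hypothetical and chosen only to hit the extreme cases; note that the quadratic case has r = 0 even though y is completely determined by x, which is why a value near 0 is best read as "no linear relationship".

```python
import math


def sample_correlation(xs, ys):
    """Sample coefficient of correlation, using n - 1 divisors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)


x = [-2, -1, 0, 1, 2]                 # hypothetical values
y_increasing = [2 * v + 1 for v in x]  # exact increasing line
y_decreasing = [-3 * v for v in x]     # exact decreasing line
y_quadratic = [v * v for v in x]       # exact, but non-linear

r_increasing = sample_correlation(x, y_increasing)
r_decreasing = sample_correlation(x, y_decreasing)
r_quadratic = sample_correlation(x, y_quadratic)

print(round(r_increasing, 6))     # 1.0
print(round(r_decreasing, 6))     # -1.0
print(abs(round(r_quadratic, 6))) # 0.0
```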
EXAMPLE
• Salary, expenses for cultural activities, and expenses for sports-related activities are collected from 100 households. Data for only 5 households are shown below:
• How are the relationships (linear/non-linear, positive/negative) between (i) salary and culture, (ii) salary and sports, and (iii) sports and culture?
cov = 1094787, r = 0.5065 (positive, linear)
cov = -33608, r = -0.5201 (negative, linear)
cov = -219026, r = -0.08122 (no linear relationship)