The two sample problem
Univariate Inference

Let x₁, x₂, …, xₙ denote a sample of size n from the normal distribution with mean μx and variance σ². Let y₁, y₂, …, ym denote a sample of size m from the normal distribution with mean μy and variance σ². Suppose we want to test

H₀: μx = μy vs HA: μx ≠ μy
The appropriate test is the two-sample t test. The test statistic is

t = (x̄ − ȳ) / [sp √(1/n + 1/m)], where sp² = [(n − 1)sx² + (m − 1)sy²] / (n + m − 2)

is the pooled variance estimate. Reject H₀ if |t| > tα/2 with d.f. = n + m − 2.
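The pooled t test above can be sketched in a few lines of Python. This is a minimal illustration (the sample data are hypothetical, drawn just to exercise the function), and it mirrors what `scipy.stats.ttest_ind` computes with `equal_var=True`:

```python
import numpy as np
from scipy import stats

def pooled_t_test(x, y, alpha=0.05):
    """Two-sample t test assuming equal variances (pooled estimate).

    Returns the t statistic and whether |t| exceeds the two-sided
    critical value t_{alpha/2} with n + m - 2 degrees of freedom.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    # pooled variance estimate sp^2
    sp2 = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2)
    t = (x.mean() - y.mean()) / np.sqrt(sp2 * (1 / n + 1 / m))
    t_crit = stats.t.ppf(1 - alpha / 2, df=n + m - 2)
    return t, abs(t) > t_crit

# hypothetical samples, purely for illustration
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=20)
y = rng.normal(loc=1.0, scale=1.0, size=25)
t, reject = pooled_t_test(x, y)
```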
The multivariate test

Let x₁, x₂, …, xₙ denote a sample of n from the p-variate normal distribution with mean vector μx and covariance matrix Σ. Let y₁, y₂, …, ym denote a sample of m from the p-variate normal distribution with mean vector μy and covariance matrix Σ. Suppose we want to test

H₀: μx = μy vs HA: μx ≠ μy
Hotelling’s T² statistic for the two sample problem:

T² = [nm / (n + m)] (x̄ − ȳ)′ Sp⁻¹ (x̄ − ȳ),

where Sp is the pooled sample covariance matrix. If H₀ is true, then

F = [(n + m − p − 1) / ((n + m − 2)p)] T²

has an F distribution with ν₁ = p and ν₂ = n + m − p − 1 degrees of freedom.
Thus Hotelling’s T² test: we reject H₀ if

F = [(n + m − p − 1) / ((n + m − 2)p)] T² > Fα(p, n + m − p − 1).
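The T² statistic and its F conversion can be sketched directly from the formulas above. A minimal implementation (the test data are hypothetical):

```python
import numpy as np
from scipy import stats

def hotelling_t2_test(X, Y, alpha=0.05):
    """Two-sample Hotelling's T^2 test, assuming a common covariance.

    X is an (n, p) array, Y an (m, p) array.  Returns T^2, the
    equivalent F statistic, and the reject/do-not-reject decision.
    """
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, p = X.shape
    m = Y.shape[0]
    d = X.mean(axis=0) - Y.mean(axis=0)
    # pooled covariance matrix Sp, shape (p, p)
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    S = (Xc.T @ Xc + Yc.T @ Yc) / (n + m - 2)
    # T^2 = [nm/(n+m)] d' Sp^{-1} d
    t2 = (n * m) / (n + m) * d @ np.linalg.solve(S, d)
    # F conversion with p and n + m - p - 1 degrees of freedom
    f = (n + m - p - 1) / ((n + m - 2) * p) * t2
    reject = f > stats.f.ppf(1 - alpha, p, n + m - p - 1)
    return t2, f, reject
```

As a sanity check, with p = 1 the statistic T² reduces to the square of the pooled two-sample t statistic, and the F conversion factor becomes 1.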
Simultaneous inference for the two-sample problem

• Hotelling’s T² statistic can be derived by Roy’s union-intersection principle: T² is the maximum, over all directions a, of the squared univariate t statistic t²(a) for testing H₀: a′μx = a′μy using the projected samples a′x₁, …, a′xₙ and a′y₁, …, a′ym.
Thus

T² = max over a of t²(a), where t(a) = a′(x̄ − ȳ) / √[(1/n + 1/m) a′Sp a]

and Sp is the pooled sample covariance matrix. Hence the event {t²(a) ≤ T²α for every a} has probability 1 − α, where

T²α = [(n + m − 2)p / (n + m − p − 1)] Fα(p, n + m − p − 1).

Thus we can form 1 − α simultaneous confidence intervals for a′(μx − μy), valid for all choices of a:

a′(x̄ − ȳ) ± √[T²α (1/n + 1/m) a′Sp a]
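The simultaneous interval for a chosen direction a can be computed as follows. This is a sketch following the formula above (sample data hypothetical); `simultaneous_ci` is an illustrative name, not a library function:

```python
import numpy as np
from scipy import stats

def simultaneous_ci(X, Y, a, alpha=0.05):
    """1 - alpha simultaneous (T^2-based) confidence interval for
    a'(mu_x - mu_y); the same critical value covers every direction a."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, p = X.shape
    m = Y.shape[0]
    a = np.asarray(a, float)
    # pooled covariance matrix Sp
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    S = (Xc.T @ Xc + Yc.T @ Yc) / (n + m - 2)
    centre = a @ (X.mean(axis=0) - Y.mean(axis=0))
    # T^2 critical value: [(n+m-2)p / (n+m-p-1)] F_alpha(p, n+m-p-1)
    t2_crit = (n + m - 2) * p / (n + m - p - 1) * \
        stats.f.ppf(1 - alpha, p, n + m - p - 1)
    half = np.sqrt(t2_crit * (1 / n + 1 / m) * (a @ S @ a))
    return centre - half, centre + half
```

Choosing a as a coordinate vector (e.g. a = (1, 0, …, 0)) gives a simultaneous interval for a single component of μx − μy; because the critical value covers all a at once, these intervals are wider than the one-at-a-time t intervals.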
Example

Annual financial data are collected for firms approximately 2 years prior to bankruptcy and for financially sound firms at about the same point in time. The data on the four variables
• x₁ = CF/TD = (cash flow)/(total debt),
• x₂ = NI/TA = (net income)/(total assets),
• x₃ = CA/CL = (current assets)/(current liabilities), and
• x₄ = CA/NS = (current assets)/(net sales)
are given in the following table.
Hotelling’s T² test: a graphical explanation
Hotelling’s T² statistic for the two sample problem,

T² = [nm / (n + m)] (x̄ − ȳ)′ Sp⁻¹ (x̄ − ȳ),

is the test statistic for testing H₀: μx = μy vs HA: μx ≠ μy.
[Figure: populations A and B plotted on axes X₁ and X₂]
[Figure: univariate test for X₁ (populations A and B)]
[Figure: univariate test for X₂ (populations A and B)]
[Figure: univariate test for the linear combination a₁X₁ + a₂X₂ (populations A and B)]
Mahalanobis distance: a graphical explanation
Euclidean distance: d(x, y) = √[(x − y)′(x − y)]
Mahalanobis distance, with S a covariance matrix: dS(x, y) = √[(x − y)′S⁻¹(x − y)]
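The two distances can be compared directly in code. A minimal sketch with an assumed covariance matrix chosen to show the effect of correlation (the numbers are illustrative only):

```python
import numpy as np

def euclidean(x, y):
    """Euclidean distance sqrt((x - y)'(x - y))."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.sqrt(d @ d)

def mahalanobis(x, y, S):
    """Mahalanobis distance sqrt((x - y)' S^{-1} (x - y)),
    with S a covariance matrix."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.sqrt(d @ np.linalg.solve(np.asarray(S, float), d))

# hypothetical covariance with strong positive correlation
S = np.array([[1.0, 0.8],
              [0.8, 1.0]])
```

Along the direction of the correlation (e.g. from (0, 0) to (1, 1)) the Mahalanobis distance is smaller than the Euclidean distance, because the distribution is spread out in that direction; perpendicular to it (e.g. to (1, −1)) it is larger.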
Hotelling’s T² statistic for the two sample problem is proportional to the squared Mahalanobis distance between the two sample mean vectors:

T² = [nm / (n + m)] D², where D² = (x̄ − ȳ)′ Sp⁻¹ (x̄ − ȳ).
[Figure, Case I: populations A and B plotted on axes X₁ and X₂]
[Figure, Case II: populations A and B plotted on axes X₁ and X₂]
In Case I the Mahalanobis distance between the mean vectors is larger than in Case II, even though the Euclidean distance is smaller: in Case I there is more separation between the two bivariate normal distributions, because the spread along the direction separating the means is small.
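A numerical version of the Case I vs Case II comparison, with hypothetical mean differences and covariance matrices chosen to reproduce the effect (all numbers are assumptions for illustration):

```python
import numpy as np

def mahalanobis_sq(d, S):
    """Squared Mahalanobis distance d' S^{-1} d between two mean
    vectors whose difference is d, under common covariance S."""
    d = np.asarray(d, float)
    return d @ np.linalg.solve(np.asarray(S, float), d)

# Case I (hypothetical): mean difference (1, 0), very little spread
# along the X1 direction that separates the populations.
d1, S1 = np.array([1.0, 0.0]), np.array([[0.1, 0.0], [0.0, 1.0]])

# Case II (hypothetical): mean difference (2, 0), unit spread
# in every direction.
d2, S2 = np.array([2.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])

euclid1, euclid2 = np.linalg.norm(d1), np.linalg.norm(d2)
mahal1, mahal2 = mahalanobis_sq(d1, S1), mahalanobis_sq(d2, S2)
```

Here the Euclidean distance is smaller in Case I (1 vs 2), yet the squared Mahalanobis distance is larger (1²/0.1 = 10 vs 2²/1 = 4), matching the picture: the tightly concentrated populations of Case I are better separated.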