The two sample problem
Univariate Inference

Let x₁, x₂, …, xₙ denote a sample of size n from the normal distribution with mean μx and variance σ². Let y₁, y₂, …, ym denote a sample of size m from the normal distribution with mean μy and variance σ². Suppose we want to test

H₀: μx = μy vs HA: μx ≠ μy
The appropriate test is the two-sample t test. The test statistic is

t = (x̄ − ȳ) / [sp √(1/n + 1/m)], where sp² = [(n − 1)sx² + (m − 1)sy²] / (n + m − 2)

is the pooled variance estimate. Reject H₀ if |t| > tα/2 with d.f. = n + m − 2.
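The pooled t test above can be sketched in a few lines of Python. This is a minimal illustration (the sample data are hypothetical, drawn just to exercise the function), and it mirrors what `scipy.stats.ttest_ind` computes with `equal_var=True`:

```python
import numpy as np
from scipy import stats

def pooled_t_test(x, y, alpha=0.05):
    """Two-sample t test assuming equal variances (pooled estimate).

    Returns the t statistic and whether |t| exceeds the two-sided
    critical value t_{alpha/2} with n + m - 2 degrees of freedom.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    # pooled variance estimate sp^2
    sp2 = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2)
    t = (x.mean() - y.mean()) / np.sqrt(sp2 * (1 / n + 1 / m))
    t_crit = stats.t.ppf(1 - alpha / 2, df=n + m - 2)
    return t, abs(t) > t_crit

# hypothetical samples, purely for illustration
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=20)
y = rng.normal(loc=1.0, scale=1.0, size=25)
t, reject = pooled_t_test(x, y)
```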
The multivariate test

Let x₁, x₂, …, xₙ denote a sample of n from the p-variate normal distribution with mean vector μx and covariance matrix Σ. Let y₁, y₂, …, ym denote a sample of m from the p-variate normal distribution with mean vector μy and covariance matrix Σ. Suppose we want to test

H₀: μx = μy vs HA: μx ≠ μy
Hotelling’s T² statistic for the two sample problem:

T² = [nm / (n + m)] (x̄ − ȳ)′ Sp⁻¹ (x̄ − ȳ),

where Sp is the pooled sample covariance matrix. If H₀ is true, then

F = [(n + m − p − 1) / ((n + m − 2)p)] T²

has an F distribution with ν₁ = p and ν₂ = n + m − p − 1 degrees of freedom.
Thus Hotelling’s T² test: we reject H₀ if

F = [(n + m − p − 1) / ((n + m − 2)p)] T² > Fα(p, n + m − p − 1).
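The T² statistic and its F conversion can be sketched directly from the formulas above. A minimal implementation (the test data are hypothetical):

```python
import numpy as np
from scipy import stats

def hotelling_t2_test(X, Y, alpha=0.05):
    """Two-sample Hotelling's T^2 test, assuming a common covariance.

    X is an (n, p) array, Y an (m, p) array.  Returns T^2, the
    equivalent F statistic, and the reject/do-not-reject decision.
    """
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, p = X.shape
    m = Y.shape[0]
    d = X.mean(axis=0) - Y.mean(axis=0)
    # pooled covariance matrix Sp, shape (p, p)
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    S = (Xc.T @ Xc + Yc.T @ Yc) / (n + m - 2)
    # T^2 = [nm/(n+m)] d' Sp^{-1} d
    t2 = (n * m) / (n + m) * d @ np.linalg.solve(S, d)
    # F conversion with p and n + m - p - 1 degrees of freedom
    f = (n + m - p - 1) / ((n + m - 2) * p) * t2
    reject = f > stats.f.ppf(1 - alpha, p, n + m - p - 1)
    return t2, f, reject
```

As a sanity check, with p = 1 the statistic T² reduces to the square of the pooled two-sample t statistic, and the F conversion factor becomes 1.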
Simultaneous inference for the two-sample problem

• Hotelling’s T² statistic can be derived by Roy’s union-intersection principle: T² is the maximum, over all directions a, of the squared univariate t statistic t²(a) for testing H₀: a′μx = a′μy using the projected samples a′x₁, …, a′xₙ and a′y₁, …, a′ym.
Thus

T² = max over a of t²(a), where t(a) = a′(x̄ − ȳ) / √[(1/n + 1/m) a′Sp a]

and Sp is the pooled sample covariance matrix. Hence the event {t²(a) ≤ T²α for every a} has probability 1 − α, where

T²α = [(n + m − 2)p / (n + m − p − 1)] Fα(p, n + m − p − 1).

Thus we can form 1 − α simultaneous confidence intervals for a′(μx − μy), valid for all choices of a:

a′(x̄ − ȳ) ± √[T²α (1/n + 1/m) a′Sp a]
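The simultaneous interval for a chosen direction a can be computed as follows. This is a sketch following the formula above (sample data hypothetical); `simultaneous_ci` is an illustrative name, not a library function:

```python
import numpy as np
from scipy import stats

def simultaneous_ci(X, Y, a, alpha=0.05):
    """1 - alpha simultaneous (T^2-based) confidence interval for
    a'(mu_x - mu_y); the same critical value covers every direction a."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, p = X.shape
    m = Y.shape[0]
    a = np.asarray(a, float)
    # pooled covariance matrix Sp
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    S = (Xc.T @ Xc + Yc.T @ Yc) / (n + m - 2)
    centre = a @ (X.mean(axis=0) - Y.mean(axis=0))
    # T^2 critical value: [(n+m-2)p / (n+m-p-1)] F_alpha(p, n+m-p-1)
    t2_crit = (n + m - 2) * p / (n + m - p - 1) * \
        stats.f.ppf(1 - alpha, p, n + m - p - 1)
    half = np.sqrt(t2_crit * (1 / n + 1 / m) * (a @ S @ a))
    return centre - half, centre + half
```

Choosing a as a coordinate vector (e.g. a = (1, 0, …, 0)) gives a simultaneous interval for a single component of μx − μy; because the critical value covers all a at once, these intervals are wider than the one-at-a-time t intervals.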
Example

Annual financial data are collected for firms approximately 2 years prior to bankruptcy and for financially sound firms at about the same point in time. The data on the four variables
• x₁ = CF/TD = (cash flow)/(total debt),
• x₂ = NI/TA = (net income)/(total assets),
• x₃ = CA/CL = (current assets)/(current liabilities), and
• x₄ = CA/NS = (current assets)/(net sales)
are given in the following table.
Hotelling’s T² test: a graphical explanation
Hotelling’s T² statistic for the two sample problem,

T² = [nm / (n + m)] (x̄ − ȳ)′ Sp⁻¹ (x̄ − ȳ),

is the test statistic for testing H₀: μx = μy vs HA: μx ≠ μy.
[Figure: populations A and B plotted on axes X₁ and X₂]
[Figure: univariate test for X₁ (populations A and B)]
[Figure: univariate test for X₂ (populations A and B)]
[Figure: univariate test for the linear combination a₁X₁ + a₂X₂ (populations A and B)]
Mahalanobis distance: a graphical explanation
Euclidean distance: d(x, y) = √[(x − y)′(x − y)]
Mahalanobis distance, with S a covariance matrix: dS(x, y) = √[(x − y)′S⁻¹(x − y)]
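The two distances can be compared directly in code. A minimal sketch with an assumed covariance matrix chosen to show the effect of correlation (the numbers are illustrative only):

```python
import numpy as np

def euclidean(x, y):
    """Euclidean distance sqrt((x - y)'(x - y))."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.sqrt(d @ d)

def mahalanobis(x, y, S):
    """Mahalanobis distance sqrt((x - y)' S^{-1} (x - y)),
    with S a covariance matrix."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.sqrt(d @ np.linalg.solve(np.asarray(S, float), d))

# hypothetical covariance with strong positive correlation
S = np.array([[1.0, 0.8],
              [0.8, 1.0]])
```

Along the direction of the correlation (e.g. from (0, 0) to (1, 1)) the Mahalanobis distance is smaller than the Euclidean distance, because the distribution is spread out in that direction; perpendicular to it (e.g. to (1, −1)) it is larger.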
Hotelling’s T² statistic for the two sample problem is proportional to the squared Mahalanobis distance between the two sample mean vectors:

T² = [nm / (n + m)] D², where D² = (x̄ − ȳ)′ Sp⁻¹ (x̄ − ȳ).
[Figure, Case I: populations A and B plotted on axes X₁ and X₂]
[Figure, Case II: populations A and B plotted on axes X₁ and X₂]
In Case I the Mahalanobis distance between the mean vectors is larger than in Case II, even though the Euclidean distance is smaller: in Case I there is more separation between the two bivariate normal distributions, because the spread along the direction separating the means is small.
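A numerical version of the Case I vs Case II comparison, with hypothetical mean differences and covariance matrices chosen to reproduce the effect (all numbers are assumptions for illustration):

```python
import numpy as np

def mahalanobis_sq(d, S):
    """Squared Mahalanobis distance d' S^{-1} d between two mean
    vectors whose difference is d, under common covariance S."""
    d = np.asarray(d, float)
    return d @ np.linalg.solve(np.asarray(S, float), d)

# Case I (hypothetical): mean difference (1, 0), very little spread
# along the X1 direction that separates the populations.
d1, S1 = np.array([1.0, 0.0]), np.array([[0.1, 0.0], [0.0, 1.0]])

# Case II (hypothetical): mean difference (2, 0), unit spread
# in every direction.
d2, S2 = np.array([2.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])

euclid1, euclid2 = np.linalg.norm(d1), np.linalg.norm(d2)
mahal1, mahal2 = mahalanobis_sq(d1, S1), mahalanobis_sq(d2, S2)
```

Here the Euclidean distance is smaller in Case I (1 vs 2), yet the squared Mahalanobis distance is larger (1²/0.1 = 10 vs 2²/1 = 4), matching the picture: the tightly concentrated populations of Case I are better separated.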