11 Comparison of Two Means


Tests involving two samples – comparing variances, F distribution
• TOH: $\bar{x}_A = \bar{x}_B$?
• Step 1 (F-test): $s_A^2 = s_B^2$?
• Step 2 (t-test): use a different formula depending on whether (i) $s_A^2 = s_B^2$ or (ii) $s_A^2 \neq s_B^2$
• Goal: decide whether a given gene is expressed differently in patients and in healthy subjects
• This involves comparing the means of the two samples
• To answer this question, one must first know whether the two samples have the same variance
• The variances of two samples are compared using the F distribution
• A t-test then determines whether the mean expression of the gene differs between patients and healthy subjects

Tests involving two samples – comparing variances, F distribution
• The values measured in controls are: 10, 11, 12, 15, 13, 12
• The values measured in patients are: 12, 13, 15, 12, 18, 17, 16, 12, 15, 10, 12
• Is the variance different between the controls and the patients at the 5% significance level?
• $H_0$: $s_A^2 = s_B^2$; $H_1$: $s_A^2 \neq s_B^2$
• A new test statistic is needed
• Two-tailed test
• Notation: A = controls, B = patients in the following calculation
• Controls (sample A): d.o.f. = 6, variance = 2.66
• Patients (sample B): d.o.f. = 12, variance = 5.74
• Consider the ratio $F = s_A^2 / s_B^2 = 2.66/5.74 = 0.4634$
• Significance level in each tail of the two-tailed test: 5%/2 = 2.5%
• Upper critical value (right tail): $F_{0.025}(6, 12) = 3.7283$ (from Excel)
• Lower critical value: $F_{0.975}(6, 12) = 0.1864$ (from Excel)
• F-distribution (right tail) table: http://mips.stanford.edu/public/classes/stats_data_analysis/234_99.html
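The decision rule on this slide can be sketched in a few lines of Python. This is only an illustration: it takes the variances and the Excel critical values exactly as quoted on the slide rather than recomputing them from the listed measurements, and the helper name `f_test_two_tailed` is hypothetical.

```python
# Two-tailed F-test on variances, using the summary statistics quoted on the slide.
var_controls = 2.66   # sample variance of controls (A), as quoted
var_patients = 5.74   # sample variance of patients (B), as quoted

# Critical values for (6, 12) d.o.f. with 2.5% in each tail, as quoted (from Excel).
f_upper = 3.7283      # F_0.025(6, 12)
f_lower = 0.1864      # F_0.975(6, 12)


def f_test_two_tailed(var_num, var_den, lower, upper):
    """Return (F, reject); reject is True when F falls outside [lower, upper]."""
    f_stat = var_num / var_den
    reject = f_stat < lower or f_stat > upper
    return f_stat, reject


f_stat, reject = f_test_two_tailed(var_controls, var_patients, f_lower, f_upper)
print(f"F = {f_stat:.4f}, reject H0: {reject}")
```

With the quoted numbers the ratio falls inside the acceptance interval, so the null hypothesis of equal variances is not rejected.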

F distribution – right tail 0.025 (see next page)

Tests involving two samples – comparing variances, F distribution
• $F_{0.025}(6, 12) = 3.7283$

Tests involving two samples – comparing variances, F-distribution
• F-distribution tables are usually available for 0.01, 0.025 and 0.05, but not for 0.975
• Given $F_{0.025}(6, 12) = 3.7283$, how do we find $F_{0.975}(6, 12)$?
• The F distribution has a useful property: the left-tail critical value for an F with $n_A$ and $n_B$ d.o.f. is the reciprocal of the right-tail critical value for an F with the d.o.f. reversed
• With the subscript denoting the right-tail probability: $F_{1-\alpha}(n_A, n_B) = 1 / F_{\alpha}(n_B, n_A)$
• $F_{0.975}(6, 12) = 1 / F_{(1-0.975)}(12, 6)$
• $F_{0.975}(6, 12) = 1 / F_{0.025}(12, 6) = 1/5.3662 = 0.18635$
• Back to our null hypothesis test: $0.18635 < 0.4634 < 3.7283$
• Since the F statistic lies between 0.18635 and 3.7283, we do not reject the null hypothesis: there is no evidence that the variances of controls and patients differ
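The reciprocal property can be justified in a few lines. The sketch below assumes the convention that $F_{\alpha}(\nu_1, \nu_2)$ is the value with right-tail probability $\alpha$, and uses the standard fact that if $X \sim F(\nu_1, \nu_2)$ then $1/X \sim F(\nu_2, \nu_1)$; the symbols $X$, $\nu_1$, $\nu_2$ are introduced here only for illustration.

```latex
\begin{align*}
P\bigl(X \le F_{1-\alpha}(\nu_1,\nu_2)\bigr) &= \alpha
  && \text{(definition: right-tail area } 1-\alpha\text{)} \\
P\!\left(\frac{1}{X} \ge \frac{1}{F_{1-\alpha}(\nu_1,\nu_2)}\right) &= \alpha
  && \text{(take reciprocals, } X > 0\text{)} \\
\frac{1}{F_{1-\alpha}(\nu_1,\nu_2)} &= F_{\alpha}(\nu_2,\nu_1)
  && \text{(since } 1/X \sim F(\nu_2,\nu_1)\text{)} \\
F_{0.975}(6,12) &= \frac{1}{F_{0.025}(12,6)} = \frac{1}{5.3662} \approx 0.18635
\end{align*}
```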

Tests involving two samples – comparing variances, F-distribution
• Now consider the inverse ratio, $F = s_B^2 / s_A^2$
• The two choices should lead to the same conclusion, since the conclusion should not depend on which variance is placed in the numerator and which in the denominator
• Controls (sample A): d.o.f. = 6, variance = 2.66
• Patients (sample B): d.o.f. = 12, variance = 5.74
• $F = 5.74/2.66 = 2.1579$
• Upper critical value (right tail): $F_{0.025}(12, 6) = 5.3662$ (from Excel)
• Lower critical value: $F_{0.975}(12, 6) = 0.2682$ (from Excel)
• $0.2682 < 2.1579 < 5.3662$
• Since the F statistic lies between 0.2682 and 5.3662, we do not reject the null hypothesis: there is no evidence that the variances of controls and patients differ
REMARK
• The two F-tests are reciprocals of each other
• That is, $0.18635 < 0.4634 < 3.7283$
• Taking reciprocals reverses the inequalities: $1/0.18635 > 1/0.4634 > 1/3.7283$
• $5.3662 > 2.1579 > 0.2682$
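As a quick check that the orientation of the ratio does not matter, the following sketch runs the test both ways. It assumes only the two right-tail critical values quoted in the slides and derives the lower bounds from the reciprocal property; the helper `outside` is a hypothetical name.

```python
var_a, var_b = 2.66, 5.74            # quoted variances: controls (A), patients (B)
upper_ab, upper_ba = 3.7283, 5.3662  # quoted F_0.025(6, 12) and F_0.025(12, 6)

# Lower critical values follow from the reciprocal property.
lower_ab = 1 / upper_ba   # F_0.975(6, 12)
lower_ba = 1 / upper_ab   # F_0.975(12, 6)


def outside(f_stat, lower, upper):
    """True when the statistic lies in either rejection region."""
    return f_stat < lower or f_stat > upper


reject_ab = outside(var_a / var_b, lower_ab, upper_ab)   # F = s_A^2 / s_B^2
reject_ba = outside(var_b / var_a, lower_ba, upper_ba)   # F = s_B^2 / s_A^2
print(f"A/B: reject={reject_ab}, B/A: reject={reject_ba}")
```

Both orientations give the same decision because the statistic and the interval endpoints are inverted together.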

Tests involving two samples – comparing means
The expression level of the gene AC002378, measured in patients (P) and controls (C), is given below:

geneID     P1     P2     P3     P4     P5     P6
AC002378   0.66   0.51   1.12   0.83   0.91   0.50

geneID     C1     C2     C3     C4     C5     C6
AC002378   0.41   0.57   -0.17  0.50   0.22   0.71

• F-test: $H_0$: $s_P^2 = s_C^2$; $H_1$: $s_P^2 \neq s_C^2$
• t-test: $H_0$: $\bar{x}_P = \bar{x}_C$; $H_1$: $\bar{x}_P \neq \bar{x}_C$
• Mean gene expression level of patients: $\bar{x}_P = 0.755$
• Mean gene expression level of controls: $\bar{x}_C = 0.373$
• $s_P^2 = 0.059$, $s_C^2 = 0.097$
• To test whether the two samples have the same variance, we perform the F-test at the 5% level
• $F = 0.059/0.097 \approx 0.61$; d.o.f. = (5, 5) for the F-test (total d.o.f. $= 5 + 5 = 10$)
• $F_{0.025}(5, 5) = 7.146$, $F_{0.975}(5, 5) = 0.1399$
• F lies between 0.1399 and 7.146, so we do not reject the null hypothesis: the patients and controls can be treated as having the same variance
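The summary statistics on this slide can be reproduced from the raw expression values with Python's standard library. This is a sketch, not the original computation: the critical values are copied from the slide rather than computed, and the variable names are illustrative.

```python
from statistics import mean, variance

# Expression levels of AC002378, copied from the slide.
patients = [0.66, 0.51, 1.12, 0.83, 0.91, 0.50]
controls = [0.41, 0.57, -0.17, 0.50, 0.22, 0.71]

mean_p, mean_c = mean(patients), mean(controls)
# statistics.variance is the sample variance (divides by n - 1).
var_p, var_c = variance(patients), variance(controls)

f_stat = var_p / var_c
f_lower, f_upper = 0.1399, 7.146   # F_0.975(5, 5) and F_0.025(5, 5), as quoted

same_variance = f_lower < f_stat < f_upper
print(f"means: {mean_p:.3f}, {mean_c:.3f}")
print(f"variances: {var_p:.3f}, {var_c:.3f}")
print(f"F = {f_stat:.3f}, equal variances not rejected: {same_variance}")
```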

Tests involving two samples – comparing means
• t-statistic of two independent samples with equal variances
• The t-score is
$$t = \frac{\bar{x}_P - \bar{x}_C}{\sqrt{s^2\left(\frac{1}{n_P} + \frac{1}{n_C}\right)}}$$
where
$$s^2 = \frac{(n_P - 1)\,s_P^2 + (n_C - 1)\,s_C^2}{n_P + n_C - 2}$$
is the pooled variance, with $n_P + n_C - 2$ d.o.f.
• The p-value, i.e. the probability of obtaining a value at least this extreme by chance when the null hypothesis is true, is 0.0400. This is smaller than the significance level 0.05, so we reject the null hypothesis: the gene AC002378 is expressed differently in cancer patients and healthy subjects.
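A minimal sketch of the equal-variance t-test for the AC002378 data, assuming the standard pooled-variance form of the statistic. To stay within the standard library, the two-tailed p-value is obtained by integrating the Student t density with Simpson's rule; the function names `pooled_t`, `t_pdf` and `two_tailed_p` are hypothetical.

```python
import math
from statistics import mean, variance

patients = [0.66, 0.51, 1.12, 0.83, 0.91, 0.50]
controls = [0.41, 0.57, -0.17, 0.50, 0.22, 0.71]


def pooled_t(x, y):
    """Pooled-variance t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / dof
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, dof


def t_pdf(t, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    c = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return c * (1 + t * t / dof) ** (-(dof + 1) / 2)


def two_tailed_p(t, dof, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, dof) + t_pdf(b, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    area = total * h / 3          # P(0 <= T <= |t|)
    return 1 - 2 * area


t_stat, dof = pooled_t(patients, controls)
p_value = two_tailed_p(t_stat, dof)
print(f"t = {t_stat:.3f}, d.o.f. = {dof}, p = {p_value:.4f}")
```

The resulting p-value can be compared directly with the 0.05 significance level used in the slides.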

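Tests involving two samples – comparing means
• t-statistic of two independent samples with unequal variances
• The modified t-score is
$$t = \frac{\bar{x}_P - \bar{x}_C}{\sqrt{\frac{s_P^2}{n_P} + \frac{s_C^2}{n_C}}}$$
• The number of degrees of freedom $\nu$ needs to be adjusted as
$$\nu = \frac{\left(\frac{s_P^2}{n_P} + \frac{s_C^2}{n_C}\right)^2}{\frac{\left(s_P^2/n_P\right)^2}{n_P - 1} + \frac{\left(s_C^2/n_C\right)^2}{n_C - 1}}$$
• This value is generally not an integer and needs to be rounded down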
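For illustration only, the unequal-variance statistic can be evaluated on the same AC002378 data, even though the F-test did not reject equal variances for this gene. The sketch assumes the usual Welch form of the t-score and the Welch-Satterthwaite adjustment for the degrees of freedom; `welch_t` is a hypothetical helper name.

```python
import math
from statistics import mean, variance

patients = [0.66, 0.51, 1.12, 0.83, 0.91, 0.50]
controls = [0.41, 0.57, -0.17, 0.50, 0.22, 0.71]


def welch_t(x, y):
    """Unequal-variance t statistic with adjusted d.o.f., rounded down."""
    nx, ny = len(x), len(y)
    a = variance(x) / nx          # s_x^2 / n_x
    b = variance(y) / ny          # s_y^2 / n_y
    t = (mean(x) - mean(y)) / math.sqrt(a + b)
    dof = (a + b) ** 2 / (a ** 2 / (nx - 1) + b ** 2 / (ny - 1))
    return t, math.floor(dof)


t_stat, dof = welch_t(patients, controls)
print(f"t = {t_stat:.3f}, adjusted d.o.f. = {dof}")
```

Because both samples have six values, the t-score here coincides with the pooled one; only the degrees of freedom change.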

Chapter 11 p 259

Chapter 11 p 264

Chapter 11 p 2268