Concept: Hypothesis testing
• A population parameter is a numerical quantity that describes a characteristic of the population.
• A hypothesis is a statement about the value or set of values that a parameter or group of parameters can take.
社會統計(上) ©蘇國賢 2007
Concept: The null hypothesis H0 and the alternative hypothesis Ha
• The null hypothesis is an assumption about the value of the population parameter being studied.
• The alternative hypothesis specifies an alternative set of possible values of the population parameter that are not specified in the null hypothesis.
• The two hypotheses are mutually exclusive.
Which hypothesis is the null hypothesis?
• (1) In many statistical applications, the null hypothesis should correspond to the assumption that no change occurs when some new process or technique is tried (as in the earlier water-pump example).
Which hypothesis is the null hypothesis?
• (2) Some statisticians argue that the null hypothesis should be the hypothesis that the decision maker wants to disprove.
• That is, the null hypothesis should specify the values of the population parameter that the researcher thinks do not represent the true value(s) of the parameter; the alternative hypothesis then specifies the values of the parameter that the researcher believes do hold.
Which hypothesis is the null hypothesis?
• (3) Another common practice is to assign no special meaning to either the null or the alternative hypothesis, but to let these hypotheses merely represent two different assumptions about the population parameter.
Concept: Consequences of choosing H0 and Ha
• The null hypothesis has the status of a maintained hypothesis: it is assumed to be true and is not rejected unless the sample data provide strong contrary evidence.
• Because the null hypothesis can be rejected only when the evidence is strong, it occupies a more favorable position than the alternative hypothesis, so the way the hypotheses are written has a large effect on the result.
Concept: Decision rules
• Decision rule for rejecting the null hypothesis: we decide whether to reject the null hypothesis on the basis of a test statistic, such as the sample mean, the sample proportion, or a z or t value.
• Test statistic: a random variable whose value is used to determine whether we reject the null hypothesis.
Concept: Decision rules
• Decision rule: the decision rule specifies the set of values of the test statistic for which the null hypothesis H0 is rejected in favor of Ha and the set of values for which H0 is accepted (i.e., not rejected).
• In other words, the decision rule is the range of test-statistic values that determines whether the hypothesis is rejected or not rejected.
Concept: Rejection Region and Nonrejection Region
• The decision rule splits the possible values of the test statistic into two exhaustive and mutually exclusive regions:
• The rejection region of a test, also called the critical region, consists of all values of the test statistic for which H0 is rejected.
• The nonrejection region consists of all values of the test statistic for which H0 is not rejected.
Concept: Critical Value
• The critical value of the test statistic is the value that separates the critical region from the nonrejection region.
• A one-sided alternative hypothesis has one critical value, whereas a two-sided alternative hypothesis has two.
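Concept: Testing a Hypothesis about a Population Mean When the Variance Is Known
• H0: μ = μ0, H1: μ < μ0
• We use the sample mean X-bar to draw inferences about the population mean. Assuming the population is normally distributed, if H0 is true then X-bar ~ N(μ0, σ²/n).
• We reject H0 only when the observed sample mean differs greatly from μ0: reject H0 if and only if the observed sample mean x-bar is less than the critical value.
[Figure: sampling distribution of X-bar centered at μ0 with variance σ²/n; the left tail of area α is the rejection region.]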
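Concept: The critical value
• To test H0: μ = μ0 vs. H1: μ < μ0,
• the critical value of the sample mean is μ0 - z_α · σ/√n, where z_α is the standard normal value with area α to its right.
[Figure: sampling distribution of X-bar centered at μ0 with variance σ²/n; the left tail has area α and the remainder has area 1 - α.]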
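The critical value for this left-tailed test can be sketched numerically. The snippet below is a minimal illustration using Python's standard-library `statistics.NormalDist`; the function name `left_tail_critical_mean` and the inputs (μ0 = 100, σ = 15, n = 36, α = .05) are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def left_tail_critical_mean(mu0, sigma, n, alpha):
    """Critical value of x-bar for H0: mu = mu0 vs. H1: mu < mu0 (sigma known)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # value with area alpha to its right
    return mu0 - z_alpha * sigma / sqrt(n)

# Hypothetical inputs, chosen only for illustration.
crit = left_tail_critical_mean(mu0=100, sigma=15, n=36, alpha=0.05)
print(round(crit, 3))  # reject H0 when the observed x-bar falls below this value

# Sanity check: under H0 the chance of landing below the cutoff equals alpha.
print(round(NormalDist(100, 15 / sqrt(36)).cdf(crit), 3))
```

Working on the scale of the sample mean rather than on the z scale gives the same decision; it only changes the units in which the cutoff is expressed.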
Concept: Testing a Composite Null Hypothesis
• At significance level α = .01, the critical value is z_α = z_.01 = 2.33.
[Figure: standard normal curve centered at 0; the null hypothesis is rejected to the right of 2.33.]
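Concept: A two-tailed test of the population mean
• H0: μ = μ0 vs. H1: μ ≠ μ0
• If the observed sample mean is far above or far below μ0, H0 can be rejected: reject H0 if z < -z_{α/2} or if z > z_{α/2}.
[Figure: standard normal curve centered at 0; each tail is a rejection region of area α/2, and the central acceptance region has area 1 - α.]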
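Concept: A two-tailed test of the population mean
• The most commonly used significance levels for a two-tailed test are 10%, 5%, and 1%; the corresponding critical z-scores are ±1.645, ±1.96, and ±2.576.
[Figure: standard normal curve centered at 0; acceptance region of area 1 - α between the two critical values, with area α/2 in each tail.]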
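The critical z-scores for these three significance levels can be recovered from the standard normal distribution. The sketch below uses only the standard library; the helper name `two_tailed_critical_z` is introduced here for illustration.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Return z_{alpha/2}; H0 is rejected when |z| exceeds this value."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: critical z = +/-{two_tailed_critical_z(alpha):.3f}")
# alpha = 0.10: critical z = +/-1.645
# alpha = 0.05: critical z = +/-1.960
# alpha = 0.01: critical z = +/-2.576
```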
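Example: A two-tailed test of a mean of a normal population with known variance
• Find the critical values of the sample mean in the preceding example (α/2 = .025).
• Because the observed sample mean, 260, is far below the lower critical value of 280.4, we can reject the null hypothesis.
[Figure: sampling distribution centered at 300; acceptance region of area 1 - α between the critical values 280.4 and 319.6, with area α/2 in each tail.]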
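The two critical values can be checked with a short calculation. The slide does not show σ or n; the values 280.4 and 319.6 are consistent with a standard error σ/√n of 10, so the sketch below treats that figure as an assumption inferred from the printed cutoffs rather than as data from the example.

```python
from statistics import NormalDist

mu0 = 300            # hypothesized mean, from the slide
std_error = 10.0     # ASSUMPTION: sigma / sqrt(n), inferred from 280.4 and 319.6
alpha = 0.05         # so that alpha / 2 = .025, as on the slide

z = NormalDist().inv_cdf(1 - alpha / 2)          # about 1.96
lower, upper = mu0 - z * std_error, mu0 + z * std_error

observed_mean = 260  # from the slide
reject_h0 = observed_mean < lower or observed_mean > upper
print(round(lower, 1), round(upper, 1), reject_h0)  # 280.4 319.6 True
```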
Concept: Level of Significance (α risk)
• Level of significance: the maximum probability of a type I error.
• The level of significance of a test is the probability that the test statistic falls in the critical region given that H0 is true. It is denoted by the symbol α.
Concept: Probability of a Type II Error (β risk)
• β risk: the maximum probability of a type II error.
• The probability of making a type II error is the probability that the test statistic falls in the acceptance region when the null hypothesis is false. It is denoted by β.
Example
• A sample of size n = 100 has been drawn from a population whose variance is 2250 in order to test the following:
• H0: μ = 1000, H1: μ ≠ 1000
• It is decided to reject H0 if [rejection rule missing from the extracted slide].
• (1) Find the probability of a type I error.
• (2) Find the probability of a type II error if μ = 1005.
Example (continued)
• Find the probability of a type II error if μ = 1005.
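The rejection rule for this example did not survive in the slide text, so the exact answers cannot be reproduced here. The sketch below shows how both error probabilities would be computed for a rule of the form "reject H0 if x-bar < lower or x-bar > upper". The cutoffs 990 and 1010 are hypothetical placeholders, and `error_probabilities` is a name introduced only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

n, variance = 100, 2250            # from the example
std_error = sqrt(variance / n)     # standard deviation of x-bar

def error_probabilities(lower, upper, mu0=1000, mu_alt=1005):
    """Type I and type II error probabilities for the rule
    'reject H0 if x-bar < lower or x-bar > upper'."""
    under_h0 = NormalDist(mu0, std_error)
    under_alt = NormalDist(mu_alt, std_error)
    # Type I: x-bar lands in the rejection region although mu = mu0.
    alpha = under_h0.cdf(lower) + (1 - under_h0.cdf(upper))
    # Type II: x-bar lands in the acceptance region although mu = mu_alt.
    beta = under_alt.cdf(upper) - under_alt.cdf(lower)
    return alpha, beta

# HYPOTHETICAL cutoffs; the slide's actual rule is not available.
alpha, beta = error_probabilities(lower=990, upper=1010)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

With the true mean only 5 units from the hypothesized 1000 and a standard error of about 4.74, the sampling distributions under H0 and under μ = 1005 overlap heavily, which is why β comes out large for cutoffs of this kind.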
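P-Value: Interpretation and Use
• The p-value of a test is the probability of obtaining a value of the test statistic as extreme as or more extreme than the observed sample value when the null hypothesis is true.
• The p-value answers the question: "If the null hypothesis is true, how likely is it that we would observe a test statistic like the one the current data show?" If this probability is small, we can reject the null hypothesis, because if the hypothesis were true, a random sample would have only a small chance of producing the observed value.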
Example: Testing the mean of a normal population with population variance known
• Find the p-value for z = -3.125.
• p-value = P(Z < -3.125) = .0009
• If H0 is true, the probability of observing a sample mean as low as 29.5 is only .0009. This indicates that H0 is unlikely to be true, so H0 is rejected.
Example: Testing the mean of a normal population with population variance known (z = -3.125)
• With α set at 5%, the critical value is -1.645.
• With α set at 1%, the critical value is -2.33, and we still reject H0.
• For any significance level above .0009 we can reject H0. In other words, the p-value is the smallest significance level at which H0 would be rejected.
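The p-value and the critical-value comparisons in this example can be reproduced with the standard library. This is a minimal sketch for the left-tailed case only; variable names are illustrative.

```python
from statistics import NormalDist

z_observed = -3.125                       # from the example
p_value = NormalDist().cdf(z_observed)    # left-tailed: P(Z < z_observed)
print(f"p-value = {p_value:.4f}")         # p-value = 0.0009

for alpha in (0.05, 0.01):
    critical_z = NormalDist().inv_cdf(alpha)   # left-tail critical value
    # The two comparisons below always lead to the same decision.
    print(alpha, round(critical_z, 3), p_value < alpha, z_observed < critical_z)
```

The computed 1% critical value is about -2.326, which the slide rounds to -2.33.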
Example: Testing the mean of a normal population with population variance known
• It is especially useful to report a p-value when we do not have any specific reason for choosing a particular level of significance or when we have little or no information concerning the costs and consequences of committing a type I or type II error.
Example: Testing the mean of a normal population with population variance known
• In research reports, researchers often state the p-value directly and let readers decide for themselves whether to reject H0. The p-value is often called the observed significance level. It can be read as "the probability, assuming H0 is true, of observing the current sample or a sample more extreme than it."
• A statistically significant p-value means that the observed result is difficult to explain by random chance.
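Two-tailed test of a population mean using a large sample
• Solution: because the sample is large enough (n = 100), we can use the usual procedure for testing a hypothesis about a population mean.
• α = 5%, two-tailed test; the two critical values are z1 = -1.96 and z2 = 1.96.
[Figure: standard normal curve centered at 0; acceptance region between -1.96 and 1.96; H0 is rejected in either tail, each of area α/2 = .025.]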
Finding the critical values of X-bar when variance is unknown
• Find the critical values of x-bar in the preceding example.
[Figure: acceptance region between the critical values 9.804 and 10.196, with area α/2 = .025 in each tail.]
Characteristics of the t distribution
• The t distribution is actually a family of distributions, with a different density function corresponding to each different value of the parameter ν (the degrees of freedom).
[Figure: density curves for d.f. = 1, d.f. = 2, d.f. = 4, and the standard normal (d.f. = ∞).]
Value of t_{α,ν}
• The symbol t_{α,ν} denotes the value of t such that the area to its right is α and t has ν degrees of freedom. The value t_{α,ν} satisfies the equation:
• P(t > t_{α,ν}) = α
• where the random variable t has the t distribution with ν degrees of freedom.
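Testing a Hypothesis about the mean of a normal population with unknown variance
• To test H0: μ = μ0 (or H0: μ ≤ μ0) against H1: μ > μ0:
• At significance level α, find the critical value t_{α,ν} such that P(t > t_{α,ν}) = α, with ν = n - 1.
• Compute the t-score: t = (x-bar - μ0) / (s / √n), where s is the sample standard deviation.
• Decision rule: reject H0 in favor of H1 if t > t_{α,n-1}.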
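Testing a Hypothesis about the mean of a normal population with unknown variance
• To test H0: μ = μ0 (or H0: μ ≥ μ0) against H1: μ < μ0:
• At significance level α, find the critical value t_{α,ν} such that P(t > t_{α,ν}) = α, with ν = n - 1.
• Compute the t-score: t = (x-bar - μ0) / (s / √n), where s is the sample standard deviation.
• Decision rule: reject H0 in favor of H1 if t < -t_{α,n-1}.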
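Testing a Hypothesis about the mean of a normal population with unknown variance
• To test H0: μ = μ0 against H1: μ ≠ μ0:
• At significance level α, find the critical value t_{α/2,ν}, with ν = n - 1.
• Compute the t-score: t = (x-bar - μ0) / (s / √n), where s is the sample standard deviation.
• Decision rule: reject H0 in favor of H1 if t < -t_{α/2,n-1} or t > t_{α/2,n-1}.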
Example: small-sample test of the mean of a normal population with unknown variance
• Solution: x1 = 245, x2 = 305, x3 = 175, x4 = 250, x5 = 280, x6 = 160, x7 = 250, x8 = 195, x9 = 210
• Compute the sample mean: x-bar = 2070 / 9 = 230.
• d.f. = 9 - 1 = 8; critical value = t_{.01,8} = 2.896
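The t-score for these nine observations can be computed with the standard library. The hypothesized mean is not shown on this slide; μ0 = 200 is an assumption inferred from the t = 1.86 reported on the following slide, and the upper-tailed direction is likewise assumed from the positive critical value.

```python
from math import sqrt
from statistics import mean, stdev

data = [245, 305, 175, 250, 280, 160, 250, 195, 210]   # from the slide
mu0 = 200            # ASSUMPTION: inferred from t = 1.86 on the following slide
t_critical = 2.896   # t_{.01, 8}, from the slide

n = len(data)
x_bar = mean(data)
s = stdev(data)                          # sample standard deviation (n - 1 divisor)
t_score = (x_bar - mu0) / (s / sqrt(n))

print(x_bar, round(s, 2), round(t_score, 2))
# ASSUMPTION: upper-tailed test, H1: mu > mu0.
print("reject H0" if t_score > t_critical else "do not reject H0")
```

Under these assumptions the t-score falls short of 2.896, so H0 is not rejected at the 1% level.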
Example: small-sample test of the mean of a normal population with unknown variance
• Alternatively, we can use t = 1.86 to find the corresponding p-value.
• From the t distribution table, with d.f. = 8, P(t > 1.86) = .05.
• If d.f. = 8 and t = 1.49, P(t > 1.49) = ?
• If d.f. = 8 and t = 1.16, P(t > 1.16) = ?
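Tail probabilities that are not printed in a t table can be approximated numerically. The sketch below integrates the t density with Simpson's rule using only the standard library; it is one possible approach, not the table lookup the slide describes, and the function names `t_pdf` and `t_upper_tail` are introduced here for illustration.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 via Simpson's rule on [0, t].

    steps must be even. The density is symmetric about 0, so the area
    to the right of 0 is 0.5 and the tail is 0.5 minus the integral.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

for t in (1.86, 1.49, 1.16):
    print(f"P(t > {t}) with d.f. = 8: {t_upper_tail(t, 8):.3f}")
```

The first line of output should agree with the tabled value of .05 for t = 1.86; the other two answer the exercises posed on the slide.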
- Slides: 106