# Central Limit Theorem, z-tests, & t-tests (PSY 440)


Central Limit Theorem, z-tests, & t-tests PSY 440 June 19, 2008

New Topic: Sampling Distributions & The Central Limit Theorem

Distribution of sample means
• The distribution of sample means is a “virtual” distribution that sits between the sample and the population.
[Figure: Population, Distribution of sample means, Sample]

Properties of the distribution of sample means
• Shape
  – If the population is Normal, then the distribution of sample means will be Normal.
  – If the sample size is large (n > 30), the distribution of sample means will be approximately Normal regardless of the shape of the population.

Properties of the distribution of sample means
• Center
  – The mean of the distribution of sample means is equal to the mean of the population: $\mu_{\bar{X}} = \mu$ (same numeric value, different conceptual values).

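Properties of the distribution of sample means
• Center
  – The mean of the distribution of sample means is equal to the mean of the population.
  – Consider our earlier example: a population of four scores (2, 4, 6, 8) and the 16 possible samples of size n = 2 drawn with replacement.
    Population: $\mu = \frac{2+4+6+8}{4} = 5$
    Distribution of sample means: $\mu_{\bar{X}} = \frac{2+3+4+5+3+4+5+6+4+5+6+7+5+6+7+8}{16} = \frac{80}{16} = 5$
[Figure: frequency histogram of the 16 sample means, ranging from 2 to 8]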
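The example can be checked by brute force. The sketch below lists every ordered sample from the population 2, 4, 6, 8 and averages the sample means. The sample size n = 2 (drawn with replacement) is an assumption, chosen because it yields the 16 samples that appear in the slide's denominator.

```python
from collections import Counter
from itertools import product
from statistics import mean

population = [2, 4, 6, 8]
n = 2  # assumed sample size; it produces the 16 samples shown on the slide

# Every ordered sample of size n drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)             # population mean
mu_of_means = mean(sample_means)  # mean of the distribution of sample means
frequencies = Counter(sample_means)

print(len(samples), mu, mu_of_means)
# A rough text histogram of the distribution of sample means.
for value in sorted(frequencies):
    print(value, "#" * frequencies[value])
```

The two means agree, and the histogram already piles up in the middle even though the population itself is flat.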

Properties of the distribution of sample means
• Spread
  – The standard deviation of the distribution of sample means depends on two things:
    • Standard deviation of the population
    • Sample size

Properties of the distribution of sample means
• Spread
  – Standard deviation of the population: the smaller the population variability, the closer the sample means are to the population mean.
[Figure: sample means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$]

Properties of the distribution of sample means
• Spread
  – Sample size: the larger the sample size, the smaller the spread.
[Figure: distribution of $\bar{X}$ for n = 1, n = 10, and n = 100]

Properties of the distribution of sample means
• Spread
  – Standard deviation of the population
  – Sample size
  – Putting them together we get the standard deviation of the distribution of sample means, commonly called the standard error: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Standard error
• The standard error is the average amount that you’d expect a sample mean (from a sample of size n) to deviate from the population mean
  – In other words, it is an estimate of the error that you’d expect by chance (or by sampling)
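To see how the standard error shrinks with sample size, here is a minimal sketch that compares the formula sigma / sqrt(n) with a small simulation. The population values (mean 60, standard deviation 8) are borrowed from the memory example for illustration. The function names, the seed, and the number of repetitions are arbitrary choices, not part of the slides.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulated_standard_error(mu, sigma, n, reps=2000, seed=440):
    """Estimate the same quantity by drawing many samples from a Normal population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(means)

mu, sigma = 60, 8  # illustrative values taken from the memory example
for n in (1, 10, 100):
    theory = standard_error(sigma, n)
    simulated = simulated_standard_error(mu, sigma, n)
    print(n, round(theory, 3), round(simulated, 3))
```

The simulated spread should track the formula closely, dropping by a factor of about 3.2 each time n grows tenfold.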

Distribution of sample means
• Keep your distributions straight by taking care with your notation:
  – Population: $\mu$, $\sigma$
  – Distribution of sample means: $\mu_{\bar{X}}$, $\sigma_{\bar{X}}$
  – Sample: $\bar{X}$, $s$

Properties of the distribution of sample means
• All three of these properties are combined to form the Central Limit Theorem
  – For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size n will approach a normal distribution with a mean of $\mu$ and a standard deviation of $\frac{\sigma}{\sqrt{n}}$ as n approaches infinity (good approximation if n > 30).
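A quick simulation can make the theorem concrete. The sketch below draws samples of n = 30 from a strongly skewed population and checks the center, spread, and rough shape of the resulting sample means. The exponential population, the seed, and the number of repetitions are illustrative assumptions, not part of the slides.

```python
import math
import random
import statistics

rng = random.Random(2008)  # fixed seed so the run is repeatable

# An exponential population with rate 1 is skewed, with mean 1 and sd 1.
mu, sigma = 1.0, 1.0
n, reps = 30, 10000

sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.stdev(sample_means)
predicted_sd = sigma / math.sqrt(n)

# Share of sample means within one standard error of mu
# (about 0.68 if the distribution of sample means is close to Normal).
within_one_se = sum(abs(m - mu) <= predicted_sd for m in sample_means) / reps

print(round(observed_mean, 3), round(observed_sd, 3), round(predicted_sd, 3))
print(round(within_one_se, 3))
```

Even though the population is far from Normal, the sample means should center on the population mean with a spread close to sigma / sqrt(n).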

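Performing your statistical test
• What are we doing when we test the hypotheses?
  – Computing a test statistic. Generic test: $\text{test statistic} = \frac{\text{observed difference}}{\text{difference expected by chance}}$
    • The observed difference could be between a sample and a population, or between different samples.
    • The difference expected by chance is based on the standard error or an estimate of the standard error.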

Hypothesis Testing With a Distribution of Means
• The distribution of means is the comparison distribution when a sample has more than one individual
• Find the Z score of your sample’s mean on the distribution of means: $z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$

“Generic” statistical test
An example: One sample z-test

Memory experiment example:
• We give n = 16 memory patients a memory improvement treatment.
• After the treatment they have an average score of $\bar{X}$ = 55 memory errors.
• How do they compare to the general population of memory patients, who have a distribution of memory errors that is Normal, $\mu$ = 60, $\sigma$ = 8?

• Step 1: State your hypotheses
  – H0: The memory treatment sample are the same as (or worse than) the population of memory patients: $\mu_{Treatment} \geq \mu_{pop} = 60$
  – HA: Their memory is better than that of the population of memory patients: $\mu_{Treatment} < \mu_{pop} = 60$
• Step 2: Set your decision criteria
  – $\alpha$ = 0.05, one-tailed
• Step 3: Collect your data
• Step 4: Compute your test statistic
  – $z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{55 - 60}{8/\sqrt{16}} = \frac{-5}{2} = -2.5$
• Step 5: Make a decision about your null hypothesis
  – The observed z = -2.5 falls in the 5% rejection region (beyond $z_{crit}$ = -1.645), so reject H0.
  – Support for our HA: the evidence suggests that the treatment decreases the number of memory errors.
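The memory example can be reproduced in a few lines. This is a minimal sketch using the numbers on the slide. The function and variable names are hypothetical, and the p-value line is an addition that the slides do not show.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, mu, sigma, n):
    """z = (sample mean - population mean) / standard error."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

z = one_sample_z(sample_mean=55, mu=60, sigma=8, n=16)

alpha = 0.05
z_crit = NormalDist().inv_cdf(alpha)  # lower-tail cutoff for a one-tailed test
p_value = NormalDist().cdf(z)         # probability of a z this low if H0 is true
reject_h0 = z < z_crit

print(z, round(z_crit, 3), round(p_value, 4), reject_h0)
```

Because the alternative hypothesis predicts fewer errors, only the lower tail is used, which is why the critical value is negative.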

Other sampling distributions
The distribution of sample means is one of the main distributions that underlie inferential statistics, and can be used to test hypotheses about population means (or about the relation between a sample and a population with known parameters). Other distributions are used in a similar manner:
• Sampling distribution of differences between means, often conceptualized as having mean = 0 and se = sqrt(var1 + var2), where var1 and var2 are the variances of the two distributions of sample means.
• Sampling distributions of correlation coefficients and of differences between correlation coefficients. Correlation coefficients need to be log-transformed to create a sampling distribution that has a normal shape; see Table I in the text for conversion of Pearson’s r into “Fisher Z”.
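The log transformation mentioned for correlation coefficients is the standard Fisher transformation, z = 0.5 * ln((1 + r) / (1 - r)). Here is a small sketch, with hypothetical function names, that can stand in for the table lookup.

```python
import math

def fisher_z(r):
    """Fisher's transformation of a Pearson correlation r (requires -1 < r < 1)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def inverse_fisher_z(z):
    """Convert a Fisher Z value back to a correlation."""
    return math.tanh(z)

for r in (0.1, 0.5, 0.9):
    print(r, round(fisher_z(r), 4))
```

Small correlations are nearly unchanged, while values near 1 are stretched out, which is what gives the transformed statistic a roughly normal sampling distribution.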

Effect size and power • Effect size: Cohen’s d • Error types • Statistical Power Analysis

Performing your statistical test

| Experimenter’s conclusions | Real world (‘truth’): H0 is correct (there really isn’t an effect) | Real world (‘truth’): H0 is wrong (there really is an effect) |
|---|---|---|
| Reject H0 | Type I error | Correct decision |
| Fail to reject H0 | Correct decision | Type II error |

• If H0 is correct, there is only one distribution: the original (null) distribution.
• If H0 is wrong, there are two distributions: the new (treatment) distribution and the original (null) distribution.

Effect Size
• A hypothesis test tells us whether the observed difference is probably due to chance or not
• It does not tell us how big the difference is
  – Effect size tells us how much the two populations don’t overlap
[Figure: the new (treatment) distribution and the original (null) distribution]

Effect Size
• Figuring effect size: the raw difference between the means of the two distributions, $\mu_{treatment} - \mu_{null}$
  – But this is tied to the particular units of measurement

Effect Size
• Standardized effect size: Cohen’s d, $d = \frac{\mu_{treatment} - \mu_{null}}{\sigma}$
  – Puts the difference into neutral units for comparison (same logic as z-scores)

Effect Size
• Effect size conventions
  – small: d = .2
  – medium: d = .5
  – large: d = .8
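As a worked illustration, Cohen's d for the memory example (treatment mean 55, null mean 60, population standard deviation 8) can be computed as below. Treating the conventions as lower bounds of bands is an assumption made for this sketch, and the function names are hypothetical.

```python
def cohens_d(mu_treatment, mu_null, sigma):
    """Standardized effect size: difference between means in standard deviation units."""
    return (mu_treatment - mu_null) / sigma

def label_effect(d):
    """Rough label using the .2 / .5 / .8 conventions as lower bounds (an assumption)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

d = cohens_d(mu_treatment=55, mu_null=60, sigma=8)
print(d, label_effect(d))
```

The sign only records the direction of the difference (fewer errors after treatment). The magnitude is what the conventions describe.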

Error types
• Reject H0: “I conclude that there is an effect.” Fail to reject H0: “I can’t detect an effect.”
• Type I error ($\alpha$): concluding that there is a difference between groups (“an effect”) when there really isn’t.
• Type II error ($\beta$): concluding that there isn’t an effect, when there really is.

Statistical Power
• The probability of making a Type II error is related to Statistical Power
  – Statistical Power: the probability that the study will produce a statistically significant result if the research hypothesis is true (there is an effect)
• So how do we compute this?

Statistical Power
• H0 is true (there is no treatment effect): there is only the original (null) distribution. The $\alpha$ = 0.05 cutoff splits it into a “Reject H0” region, where rejecting is a Type I error, and a “Fail to reject H0” region.
• H0 is false (there is a treatment effect): there is a new (treatment) distribution alongside the original (null) distribution, and the same $\alpha$ = 0.05 cutoff applies.
  – $\beta$ = probability of a Type II error: failing to reject H0, even though there is a treatment effect
  – Power = 1 - $\beta$: the probability of (correctly) rejecting H0

Statistical Power
• Steps for figuring power
  1) Gather the needed information: the mean and standard deviation of the Null Population and the predicted mean of the Treatment Population.
  2) Figure the raw-score cutoff point on the comparison distribution to reject the null hypothesis. For $\alpha$ = 0.05 (one-tailed), the unit normal table gives Z = -1.645; transform this z-score to a raw score.
  3) Figure the Z score for this same point, but on the distribution of means for the Treatment Population. Transform the raw score to a z-score, remembering to use the properties of the treatment population!
  4) Use the normal curve table to figure the probability of getting a score more extreme than that Z score. From the unit normal table, the proportion beyond Z = 0.355 is 0.3594, so $\beta$ (the probability of a Type II error) = 0.3594 and Power = 1 - $\beta$ = 0.6406. The probability of detecting an effect of this size from these populations is 64%.
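The four steps can be traced in code. The slide's own population values did not survive, so the numbers below are assumptions: a null mean of 100 and a predicted treatment mean of 90 are purely illustrative, while sigma = 25 and n = 25 are borrowed from the "factors that affect power" slides. With these values the sketch happens to reproduce the Z of about 0.355 and the power of about 64% quoted in step 4.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: gather the needed information (assumed, illustrative values).
mu_null, sigma, n = 100, 25, 25
mu_treat = 90          # predicted mean of the treatment population (assumption)
alpha = 0.05           # one-tailed, treatment predicted to lower scores

unit_normal = NormalDist()
se = sigma / sqrt(n)   # standard error of the distribution of means

# Step 2: cutoff on the null (comparison) distribution, as a raw score.
z_crit = unit_normal.inv_cdf(alpha)
raw_cutoff = mu_null + z_crit * se

# Step 3: the same raw score as a Z on the treatment distribution of means.
z_on_treatment = (raw_cutoff - mu_treat) / se

# Step 4: beta is the area beyond that Z (fail to reject); power is the rest.
beta = 1 - unit_normal.cdf(z_on_treatment)
power = 1 - beta

print(round(z_crit, 3), round(raw_cutoff, 2), round(z_on_treatment, 3))
print(round(beta, 3), round(power, 3))
```

The computed beta differs slightly from the table value of 0.3594 because the table rounds Z to two decimals.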

Statistical Power
Factors that affect Power:
  – $\alpha$-level
  – Sample size
  – Population standard deviation
  – Effect size
  – 1-tailed vs. 2-tailed

Statistical Power
Factors that affect Power: $\alpha$-level
• Change from $\alpha$ = 0.05 to $\alpha$ = 0.01
• The cutoff moves further into the tail, so $\beta$ grows and Power = 1 - $\beta$ shrinks
• So as the $\alpha$-level gets smaller, so does the Power of the test

Statistical Power
Factors that affect Power: Sample size
• Recall that sample size is related to the spread of the distribution
• Change from n = 25 to n = 100
• As the sample gets bigger, the standard error gets smaller and the Power gets larger

Statistical Power
Factors that affect Power: Population standard deviation
• Recall that standard error is related to the spread of the distribution
• Change from $\sigma$ = 25 to $\sigma$ = 20
• As $\sigma$ gets smaller, the standard error gets smaller and the Power gets larger

Statistical Power
Factors that affect Power: Effect size
• Compare a small effect (difference) to a big effect
• As the effect gets bigger, the Power gets larger
[Figure: treatment and no-treatment distributions, small vs. big difference]

Statistical Power
Factors that affect Power: 1-tailed vs. 2-tailed
• Change from $\alpha$ = 0.05 one-tailed to $\alpha$ = 0.05 two-tailed
• A two-tailed test puts p = 0.025 in each tail
• Two-tailed functionally cuts the $\alpha$-level in half, which decreases the power.

Statistical Power
Factors that affect Power:
  – $\alpha$-level: as the $\alpha$-level gets smaller, so does the Power of the test
  – Sample size: as the sample gets bigger, the standard error gets smaller and the Power gets larger
  – Population standard deviation: as the population standard deviation gets smaller, the standard error gets smaller and the Power gets larger
  – Effect size: as the effect gets bigger, the Power gets larger
  – 1-tailed vs. 2-tailed: two-tailed functionally cuts the $\alpha$-level in half, which decreases the power
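Each factor in the summary can be checked by changing one input at a time. The sketch below wraps the power calculation for a z-test in a hypothetical helper. The baseline means (null 100, treatment 90) are illustrative assumptions. The changes in n (25 to 100), sigma (25 to 20), and alpha (0.05 to 0.01) follow the slides, and the larger effect (treatment mean 85) is another assumed value.

```python
from math import sqrt
from statistics import NormalDist

UNIT_NORMAL = NormalDist()

def z_test_power(mu_null, mu_treat, sigma, n, alpha=0.05, tails=1):
    """Power of a z-test when the treatment is predicted to lower the mean."""
    se = sigma / sqrt(n)
    treatment_means = NormalDist(mu_treat, se)  # distribution of means if H0 is false
    if tails == 1:
        cutoff = mu_null + UNIT_NORMAL.inv_cdf(alpha) * se
        return treatment_means.cdf(cutoff)
    # Two-tailed: alpha is split between a lower and an upper rejection region.
    lower = mu_null + UNIT_NORMAL.inv_cdf(alpha / 2) * se
    upper = mu_null - UNIT_NORMAL.inv_cdf(alpha / 2) * se
    return treatment_means.cdf(lower) + (1 - treatment_means.cdf(upper))

base = dict(mu_null=100, mu_treat=90, sigma=25, n=25)  # assumed baseline

results = {
    "baseline": z_test_power(**base),
    "alpha 0.05 -> 0.01": z_test_power(**base, alpha=0.01),
    "n 25 -> 100": z_test_power(**{**base, "n": 100}),
    "sigma 25 -> 20": z_test_power(**{**base, "sigma": 20}),
    "bigger effect": z_test_power(**{**base, "mu_treat": 85}),
    "two-tailed": z_test_power(**base, tails=2),
}

for label, value in results.items():
    print(f"{label:20s} power = {value:.3f}")
```

Each line should move in the direction the summary describes: power falls for the stricter alpha and the two-tailed test, and rises for the larger sample, smaller sigma, and bigger effect.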

Why care about Power?
• Determining your sample size
  – Using an estimate of effect size and population standard deviation, you can determine how many participants you need to achieve a particular level of power (see Table M in the textbook)
• When a result is not statistically significant
  – Is it because there is no effect, or not enough power?
• When a result is significant
  – Statistical significance versus practical significance
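For a rough sense of the sample-size calculation, the sketch below solves the one-tailed, one-sample z-test case directly: power is reached when d * sqrt(n) equals z(1 - alpha) + z(power). This is a sketch under that stated assumption, not a reproduction of Table M, whose design and values may differ. The function name and the default power of 0.80 are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist

def required_n(d, power=0.80, alpha=0.05):
    """Smallest n reaching the target power for a one-tailed one-sample z-test.

    d is the standardized effect size (Cohen's d); only its magnitude is used.
    """
    unit_normal = NormalDist()
    z_alpha = unit_normal.inv_cdf(1 - alpha)  # distance of the cutoff from the null mean
    z_power = unit_normal.inv_cdf(power)      # distance of the cutoff from the treatment mean
    return ceil(((z_alpha + z_power) / abs(d)) ** 2)

for d in (0.2, 0.5, 0.8):
    print(d, required_n(d))
```

Halving the effect size roughly quadruples the number of participants needed, which is why small effects are expensive to detect.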

Next Topic • t-tests – One sample, related samples, independent samples – Additional assumptions • Levene’s test

Statistical analysis follows design
• The one-sample z-test can be used when:
  – 1 sample
  – One score per subject
  – Population mean ($\mu$) and standard deviation ($\sigma$) are known

Statistical analysis follows design
• The one-sample t-test can be used when:
  – 1 sample
  – One score per subject
  – Population mean ($\mu$) is known
  – but standard deviation ($\sigma$) is NOT known

Testing Hypotheses
• Hypothesis testing: a five step program
  – Step 1: State your hypotheses
  – Step 2: Set your decision criteria
  – Step 3: Collect your data
  – Step 4: Compute your test statistics
    • Compute your estimated standard error
    • Compute your t-statistic
    • Compute your degrees of freedom
  – Step 5: Make a decision about your null hypothesis

Performing your statistical test
• What are we doing when we test the hypotheses?
  – Consider a variation of our memory experiment example
    • Sample: memory patients receive the memory treatment, then take the memory test, giving $\bar{X}$
    • Population of memory patients: the memory test $\mu$ is known
    • Compare these two means
  – Conclusions:
    • H0: the memory treatment sample are the same as those in the population of memory patients.
    • HA: they aren’t the same as those in the population of memory patients.

Performing your statistical test
• What are we doing when we test the hypotheses?
  – Real world (‘truth’), H0 is true (no treatment effect): one population. The memory treatment sample ($\bar{X}_A$) are the same as those in the population of memory patients.
  – Real world (‘truth’), H0 is false (there is a treatment effect): two populations. The memory treatment sample ($\bar{X}_A$) aren’t the same as those in the population of memory patients.

Performing your statistical test
• What are we doing when we test the hypotheses?
  – Computing a test statistic. Generic test: $\text{test statistic} = \frac{\text{observed difference}}{\text{difference expected by chance}}$
    • The observed difference could be between a sample and a population, or between different samples.
    • The difference expected by chance is based on the standard error or an estimate of the standard error.
• One sample z vs. one sample t
  – Test statistic: the observed difference is identical; the difference expected by chance is different.
    • One sample z: $z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
    • One sample t: $t = \frac{\bar{X} - \mu_{\bar{X}}}{s_{\bar{X}}}$
  – Standard error (one sample z): $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. For the t-test we don’t know $\sigma$, so we need to estimate it.
  – Estimated standard error (one sample t): $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
  – Degrees of freedom: df = n - 1

One sample t-test
• The t-statistic distribution (a transformation of the distribution of sample means)
  – Varies in shape according to the degrees of freedom
• New table: the t-table (Table C in text)

One sample t-test
• The t-statistic distribution (a transformation of the distribution of sample means)
  – To reject H0, you want a computed test statistic that is large
    • The alpha level gives us the decision criterion
    • New table: the t-table
  – If the test statistic falls in the tail beyond the critical value, reject H0; if it falls in the body of the distribution, fail to reject H0.

One sample t-test
• New table: the t-table
  – Columns: $\alpha$ levels, one-tailed or two-tailed
  – Rows: degrees of freedom (df)
  – Entries: critical values of t ($t_{crit}$)

One sample t-test
• What is the $t_{crit}$ for a two-tailed hypothesis test with a sample size of n = 6 and an $\alpha$-level of 0.05?
  – $\alpha$ = 0.05, two-tailed, n = 6
  – df = n - 1 = 5
  – $t_{crit}$ = ±2.571

One sample t-test
• What is the $t_{crit}$ for a one-tailed hypothesis test with a sample size of n = 6 and an $\alpha$-level of 0.05?
  – $\alpha$ = 0.05, one-tailed, n = 6
  – df = n - 1 = 5
  – $t_{crit}$ = +2.015 (or -2.015, depending on the predicted direction)
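The table lookups can be reproduced numerically. The sketch below is a pure standard-library approximation, not how the textbook table was built: it integrates the t density with Simpson's rule and finds the critical value by bisection. All function names and the step counts are hypothetical choices.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x), using Simpson's rule on [0, |x|] and the symmetry around 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_crit(alpha, df, tails=2):
    """Positive critical value leaving alpha / tails in the upper tail."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 50.0
    for _ in range(50):  # bisection on the cumulative probability
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_crit(0.05, df=5, tails=2), 3))  # two-tailed, n = 6
print(round(t_crit(0.05, df=5, tails=1), 3))  # one-tailed, n = 6
```

The one-tailed value is smaller because all of alpha sits in a single tail, so the cutoff does not have to be as extreme.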

One sample t-test
An example: One sample t-test

Memory experiment example:
• We give n = 16 memory patients a memory improvement treatment.
• After the treatment they have an average score of $\bar{X}$ = 55, s = 8 memory errors.
• How do they compare to the general population of memory patients, who have a distribution of memory errors that is Normal, $\mu$ = 60?

• Step 1: State your hypotheses
  – H0: The memory treatment sample are the same as (or worse than) those in the population of memory patients: $\mu_{Treatment} \geq \mu_{pop} = 60$
  – HA: Their memory is better than that of the population of memory patients: $\mu_{Treatment} < \mu_{pop} = 60$
• Step 2: Set your decision criteria
  – $\alpha$ = 0.05, one-tailed
• Step 3: Collect your data
• Step 4: Compute your test statistics
  – Estimated standard error: $s_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{8}{\sqrt{16}} = 2$
  – t-statistic: $t = \frac{\bar{X} - \mu_{\bar{X}}}{s_{\bar{X}}} = \frac{55 - 60}{2} = -2.5$
  – Degrees of freedom: df = n - 1 = 15
• Step 5: Make a decision about your null hypothesis
  – $t_{crit}$ = -1.753 (df = 15, $\alpha$ = 0.05, one-tailed)
  – $t_{obs}$ = -2.5 is beyond $t_{crit}$ = -1.753, so reject H0.
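The same example in code, as a minimal sketch. The function name is hypothetical, and the critical value is simply the -1.753 quoted from the t-table on the slide rather than something the code derives.

```python
from math import sqrt

def one_sample_t(sample_mean, mu, s, n):
    """Return the t statistic and its degrees of freedom."""
    estimated_se = s / sqrt(n)  # estimated standard error, using the sample s
    t = (sample_mean - mu) / estimated_se
    df = n - 1
    return t, df

t_obs, df = one_sample_t(sample_mean=55, mu=60, s=8, n=16)

t_crit = -1.753             # from the t-table: df = 15, alpha = 0.05, one-tailed
reject_h0 = t_obs < t_crit  # lower-tail test: the treatment should reduce errors

print(t_obs, df, reject_h0)
```

The arithmetic matches the z-test example because s happens to equal the earlier sigma. What changes is the comparison distribution, and with it the critical value (-1.753 instead of -1.645).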