STAT 250 Dr Kari Lock Morgan Confidence Intervals
- Slides: 45
STAT 250 Dr. Kari Lock Morgan Confidence Intervals: Bootstrap Distribution SECTIONS 3.3, 3.4 • Bootstrap distribution (3.3) • 95% CI using standard error (3.3) • Percentile method (3.4) Statistics: Unlocking the Power of Data Lock5
Confidence Intervals [Diagram: Population → Sample, Sample, ..., Sample → Calculate statistic for each sample → Sampling Distribution → Standard Error (SE): standard deviation of sampling distribution → Margin of Error (ME) (95% CI: ME = 2×SE) → Confidence Interval: statistic ± ME]
Reality: One small problem… WE ONLY HAVE ONE SAMPLE! • How do we know how much sample statistics vary if we only have one sample? BOOTSTRAP!
Your best guess for the population is lots of copies of the sample! Sample repeatedly from this “population”.
Sampling with Replacement • It’s impossible to sample repeatedly from the population… • But we can sample repeatedly from the sample! • To get statistics that vary, sample with replacement (each unit can be selected more than once)
Suppose we have a random sample of 6 people:
Original Sample: A simulated “population” to sample from
Bootstrap Sample: Sample with replacement from the original sample, using the same sample size. Remember: sample size matters! [Figure: Original Sample → Bootstrap Sample]
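Drawing one bootstrap sample can be sketched in Python with NumPy. This is a minimal illustration, not part of the original slides: the six data values and the seed are made up, since the slide shows the six people only as a picture.

```python
import numpy as np

rng = np.random.default_rng(seed=250)  # arbitrary seed, only for reproducibility

# Hypothetical original sample of n = 6 values (not from the slides)
original_sample = np.array([18, 19, 19, 20, 21, 23])

# One bootstrap sample: draw n values WITH replacement from the original sample,
# keeping the sample size the same as the original
bootstrap_sample = rng.choice(original_sample, size=original_sample.size, replace=True)

print("original: ", original_sample)
print("bootstrap:", bootstrap_sample)
```

Because the draw is with replacement, some original values typically appear more than once and others not at all, but no value outside the original sample can ever appear.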
Reese’s Pieces • How would you take a bootstrap sample from your sample of Reese’s Pieces?
Bootstrap Sample: Your original sample has data values 18, 19, 20, 21. Is the following a possible bootstrap sample? 18, 19, 20, 21, 22 a) Yes b) No
Bootstrap Sample: Your original sample has data values 18, 19, 20, 21. Is the following a possible bootstrap sample? 18, 19, 20, 21 a) Yes b) No
[Diagram: Original Sample (Sample Statistic) → Bootstrap Sample, Bootstrap Sample, ..., Bootstrap Sample → Bootstrap Statistic, Bootstrap Statistic, ..., Bootstrap Statistic → Bootstrap Distribution]
Mercury Levels in Fish. Lange, T., Royals, H. and Connor, L. (2004). Mercury accumulation in largemouth bass (Micropterus salmoides) in a Florida Lake. Archives of Environmental Contamination and Toxicology, 27(4), 466-471.
Mercury Levels in Fish
Bootstrap Sample: You have a sample of size n = 53. You sample with replacement 1000 times to get 1000 bootstrap samples. What is the sample size of each bootstrap sample? (a) 53 (b) 1000
Bootstrap Distribution: You have a sample of size n = 53. You sample with replacement 1000 times to get 1000 bootstrap samples. How many bootstrap statistics will you have? (a) 53 (b) 1000
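The setup in these two questions can be sketched as follows. The data are a hypothetical stand-in of size n = 53 generated at random (they are NOT the mercury measurements), and the mean is used as the statistic only as an example.

```python
import numpy as np

rng = np.random.default_rng(seed=250)  # arbitrary seed

# Hypothetical stand-in for a sample of n = 53 measurements (NOT the real fish data)
n = 53
sample = rng.gamma(shape=2.0, scale=0.25, size=n)

num_bootstrap_samples = 1000

# Each row is one bootstrap sample, drawn with replacement from the original sample
bootstrap_samples = rng.choice(sample, size=(num_bootstrap_samples, n), replace=True)

# One statistic (here the mean) is computed per bootstrap sample;
# together these statistics form the bootstrap distribution
bootstrap_means = bootstrap_samples.mean(axis=1)

print("shape of bootstrap samples:   ", bootstrap_samples.shape)
print("number of bootstrap statistics:", bootstrap_means.size)
```

The printed shapes make the distinction between the size of each bootstrap sample and the number of bootstrap statistics concrete.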
Why “bootstrap”? “Pull yourself up by your bootstraps” • Lift yourself in the air simply by pulling up on the laces of your boots • Metaphor for accomplishing an “impossible” task without any outside help
Sampling Distribution: BUT, in practice we don’t see the “tree” or all of the “seeds”; we only have ONE seed. [Figure labels: Population, µ]
Bootstrap Distribution: What can we do with just one seed? [Figure labels: Bootstrap “Population”, µ]
Golden Rule of Bootstrapping: Bootstrap statistics are to the original sample statistic as the original sample statistic is to the population parameter
Center • The sampling distribution is centered around the population parameter • The bootstrap distribution is centered around the a) population parameter b) sample statistic c) bootstrap statistic d) bootstrap parameter • Luckily, we don’t care about the center… we care about the variability!
Standard Error • The variability of the bootstrap statistics is similar to the variability of the sample statistics • The standard error of a statistic can be estimated using the standard deviation of the bootstrap distribution!
Confidence Intervals [Diagram: Sample → Bootstrap Sample, Bootstrap Sample, ..., Bootstrap Sample → Calculate statistic for each bootstrap sample → Bootstrap Distribution → Standard Error (SE): standard deviation of bootstrap distribution → Margin of Error (ME) (95% CI: ME = 2×SE) → Confidence Interval: statistic ± ME]
Mercury Levels in Fish: SE = 0.047, so 0.527 ± 2 × 0.047 = (0.433, 0.621). We are 95% confident that the average mercury level in fish in Florida lakes is between 0.433 and 0.621 ppm.
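The interval on this slide follows from statistic ± 2 × SE. The arithmetic can be checked with a few lines of Python; the sample mean (0.527) and standard error (0.047) are taken from the slide, while the variable names are illustrative.

```python
sample_mean = 0.527      # sample statistic from the slide (ppm)
standard_error = 0.047   # SE estimated from the bootstrap distribution (from the slide)

# For a 95% interval the margin of error is about 2 standard errors
margin_of_error = 2 * standard_error
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

print(f"95% CI: ({ci_lower:.3f}, {ci_upper:.3f})")
```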
Same process for every parameter! Generate samples with replacement → Calculate sample statistic → Repeat...
Hitchhiker Snails • A type of small snail is very widespread in Japan, and colonies of the snails that are genetically very similar have been found very far apart. • How could the snails travel such long distances? • Biologist Shinichiro Wada fed 174 live snails to birds, and found that 26 were excreted live out the other end. (The snails are apparently able to seal their shells shut to keep the digestive fluids from getting in.) • What proportion of these snails ingested by birds survive? Yong, E. “The Scatological Hitchhiker Snail,” Discover, October 2011, 13.
Hitchhiker Snails: Give a 95% confidence interval for the proportion of snails ingested by birds that survive. a) (0.1, 0.2) b) (0.05, 0.25) c) (0.12, 0.18) d) (0.07, 0.18)
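One way to approach this question is to bootstrap the sample proportion directly. The sketch below uses the counts from the slide (26 survivors out of 174); the 0/1 encoding, the seed, and the choice of 1000 bootstrap samples are assumptions made for illustration, and the exact endpoints will vary slightly from run to run with a different seed.

```python
import numpy as np

rng = np.random.default_rng(seed=250)  # arbitrary seed

# Data from the slide: 174 snails eaten, 26 survived (1 = survived, 0 = did not)
n, survived = 174, 26
sample = np.array([1] * survived + [0] * (n - survived))
p_hat = sample.mean()

# Bootstrap distribution of the sample proportion
num_bootstrap_samples = 1000
boot_props = rng.choice(sample, size=(num_bootstrap_samples, n), replace=True).mean(axis=1)

# Standard error = standard deviation of the bootstrap statistics
se = boot_props.std(ddof=1)
ci = (p_hat - 2 * se, p_hat + 2 * se)

print(f"p-hat = {p_hat:.3f}, SE = {se:.3f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```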
Body Temperature: What is the average body temperature of humans? www.lock5stat.com/statkey Shoemaker, What's Normal: Temperature, Gender and Heartrate, Journal of Statistics Education, Vol. 4, No. 2 (1996)
Other Levels of Confidence • What if we want to be more than 95% confident? • How might you produce a 99% confidence interval for the average body temperature?
Percentile Method • For a P% confidence interval, keep the middle P% of bootstrap statistics • For a 99% confidence interval, keep the middle 99%, leaving 0.5% in each tail. • The 99% confidence interval would be (0.5th percentile, 99.5th percentile) where the percentiles refer to the bootstrap distribution.
Bootstrap Distribution • For a P% confidence interval: [Figure: bootstrap distribution with the middle P% of bootstrap statistics marked]
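The percentile method can be sketched as a small helper that cuts off (100 − P)/2 percent in each tail of the bootstrap distribution. The function name `percentile_interval` and the randomly generated sample are hypothetical (the sample is NOT the body temperature data); only `numpy.percentile` is a real library call.

```python
import numpy as np

def percentile_interval(bootstrap_stats, confidence_level):
    """Return the endpoints of the middle `confidence_level` percent of the bootstrap statistics."""
    tail = (100 - confidence_level) / 2   # percent left in each tail
    lower, upper = np.percentile(bootstrap_stats, [tail, 100 - tail])
    return lower, upper

rng = np.random.default_rng(seed=250)  # arbitrary seed

# Hypothetical sample of 50 values (NOT the body temperature data from the slides)
sample = rng.normal(loc=98.3, scale=0.7, size=50)
boot_means = rng.choice(sample, size=(1000, sample.size), replace=True).mean(axis=1)

ci_95 = percentile_interval(boot_means, 95)   # 2.5th and 97.5th percentiles
ci_99 = percentile_interval(boot_means, 99)   # 0.5th and 99.5th percentiles
print("95% CI:", ci_95)
print("99% CI:", ci_99)
```

Since the 99% interval keeps more of the same bootstrap distribution, its endpoints can never lie inside the 95% interval: a higher confidence level gives an interval at least as wide.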
Body Temperature www.lock5stat.com/statkey We are 99% sure that the average body temperature is between 98.00 and 98.58.
Level of Confidence: Which is wider, a 90% confidence interval or a 95% confidence interval? (a) 90% CI (b) 95% CI
Mercury and pH in Lakes • For Florida lakes, what is the correlation between average mercury level (ppm) in fish taken from a lake and acidity (pH) of the lake? Give a 90% CI for ρ. Lange, Royals, and Connor, Transactions of the American Fisheries Society (1993)
Mercury and pH in Lakes www.lock5stat.com/statkey We are 90% confident that the true correlation between average mercury level and pH of Florida lakes is between -0.702 and -0.433.
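For a correlation, each bootstrap sample has to resample whole cases (lakes), so that each pH value stays paired with its own mercury value. The sketch below shows that idea on made-up paired data; the variables `ph` and `mercury` are randomly generated stand-ins, NOT the Florida lakes data, so the printed interval will not match the slide.

```python
import numpy as np

rng = np.random.default_rng(seed=250)  # arbitrary seed

# Hypothetical paired data standing in for (pH, mercury) per lake; NOT the real data
n = 53
ph = rng.uniform(4.0, 9.0, size=n)
mercury = 1.5 - 0.15 * ph + rng.normal(0, 0.25, size=n)

def correlation(x, y):
    """Sample correlation coefficient between x and y."""
    return np.corrcoef(x, y)[0, 1]

num_bootstrap_samples = 1000
boot_corrs = np.empty(num_bootstrap_samples)
for i in range(num_bootstrap_samples):
    # Resample whole lakes: the same row indices are used for both variables
    idx = rng.integers(0, n, size=n)
    boot_corrs[i] = correlation(ph[idx], mercury[idx])

# 90% percentile interval: leave 5% in each tail
ci_90 = np.percentile(boot_corrs, [5, 95])
print("sample r:", correlation(ph, mercury))
print("90% CI:  ", ci_90)
```

Resampling the two variables separately would break the pairing and push the bootstrap correlations toward zero, which is why a single set of indices is drawn per bootstrap sample.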
Bootstrap CI • Option 1: Estimate the standard error of the statistic by computing the standard deviation of the bootstrap distribution, and then generate a 95% confidence interval by statistic ± 2 × SE • Option 2: Generate a P% confidence interval as the range for the middle P% of bootstrap statistics
Mercury Levels in Fish • Standard error method: SE = 0.047, so 0.527 ± 2 × 0.047 = (0.433, 0.621) • Percentile method: middle 95% of bootstrap statistics
Bootstrap Cautions • These methods for creating a confidence interval only work if the bootstrap distribution is smooth and symmetric • ALWAYS look at a plot of the bootstrap distribution! • If the bootstrap distribution is skewed or looks “spiky” with gaps, you will need to go beyond intro stat to create a confidence interval
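A “spiky” bootstrap distribution is easy to produce with the median of a small sample. The sketch below uses seven made-up values (not from the slides) and compares how many distinct values the bootstrap means and bootstrap medians take; in practice a histogram of the bootstrap statistics would show the same thing visually.

```python
import numpy as np

rng = np.random.default_rng(seed=250)  # arbitrary seed

# Hypothetical small sample of 7 values (not from the slides)
sample = np.array([3.1, 4.7, 5.0, 5.2, 6.8, 7.4, 9.9])
boots = rng.choice(sample, size=(1000, sample.size), replace=True)

boot_means = boots.mean(axis=1)
boot_medians = np.median(boots, axis=1)

# With an odd sample size, the median of a bootstrap sample is always one of the
# original data values, so the bootstrap medians pile up on a few spikes with gaps
num_distinct_means = np.unique(boot_means).size
num_distinct_medians = np.unique(boot_medians).size

print("distinct bootstrap means:  ", num_distinct_means)
print("distinct bootstrap medians:", num_distinct_medians)
```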
Number of Bootstrap Samples • When using bootstrapping, you may get a slightly different confidence interval each time. This is fine! • The more bootstrap samples you use, the more precise your answer will be. • For the purposes of this class, 1000 bootstrap samples is fine. In real life, you probably want to take 10,000 or even 100,000 bootstrap samples.
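The effect of the number of bootstrap samples can be seen by repeating the whole bootstrap several times and watching how much the estimated SE moves from run to run. This is a rough sketch on a made-up sample (not from the slides); the helper `bootstrap_se`, the 20 repetitions, and the three settings of B are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=250)  # arbitrary seed

# Hypothetical sample of 53 values (not from the slides)
sample = rng.normal(loc=0.5, scale=0.35, size=53)

def bootstrap_se(data, num_bootstrap_samples, rng):
    """Estimate the SE of the mean as the SD of one bootstrap distribution."""
    boots = rng.choice(data, size=(num_bootstrap_samples, data.size), replace=True)
    return boots.mean(axis=1).std(ddof=1)

# Repeat the whole bootstrap 20 times per setting and record the run-to-run spread
spread = {}
for b in (100, 1000, 10000):
    estimates = np.array([bootstrap_se(sample, b, rng) for _ in range(20)])
    spread[b] = estimates.std(ddof=1)
    print(f"B = {b:>6}: mean SE = {estimates.mean():.4f}, run-to-run SD = {spread[b]:.5f}")
```

The average SE stays about the same for every B, while the run-to-run spread shrinks as B grows: more bootstrap samples make the answer more repeatable, not systematically different.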
Summary • The standard error of a statistic is the standard deviation of the sample statistic, which can be estimated from a bootstrap distribution • Confidence intervals can be created using the standard error or the percentiles of a bootstrap distribution • Increasing the number of bootstrap samples will not change the SE or interval (except for random fluctuation) • Confidence intervals can be created this way for any parameter, as long as the bootstrap distribution is approximately symmetric and continuous
To Do • Read Sections 3.3, 3.4 • Do HW 3.3, 3.4 (due Friday, 2/7)