
STAT 101
Dr. Kari Lock Morgan
10/9/12
Exam 1 Review of Chapters 1-4
Statistics: Unlocking the Power of Data, Lock5

Office Hours This Week
• Tuesday
  • Tracy 5 – 7 pm, Old Chem 211A
• Wednesday
  • Kari 11:30 – 12:30 pm, Old Chem 216
  • Tracy 4:30 – 5:30 pm, Old Chem 211A
  • Heather 8 – 9 pm, Old Chem 211A
• Thursday
  • Kari 1 – 2:30 pm, Old Chem 216

The Big Picture
[Diagram: Sampling leads from the Population to the Sample; Statistical Inference leads from the Sample back to the Population; Descriptive statistics]

Cases and Variables
• We obtain information about cases or units.
• A variable is any characteristic that is recorded for each case.
• Generally, each case makes up a row in a dataset, and each variable makes up a column.

Sampling
[Diagram: Population, Sample]
• GOAL: Select a sample that is similar to the population, only smaller

Observational Studies
• There are almost always confounding variables in observational studies.
• Observational studies can almost never be used to establish causation.

Randomized Experiments
• Because the explanatory variable is randomly assigned, it is not associated with any other variables.
• Confounding variables are eliminated!!!
[Diagram: RANDOMIZED EXPERIMENT; Confounding Variable, Explanatory Variable, Response Variable]

Data Collection
• Was the sample randomly selected?
  • Yes: Possible to generalize to the population
  • No: Should not generalize to the population
• Was the explanatory variable randomly assigned?
  • Yes: Possible to make conclusions about causality
  • No: Cannot make conclusions about causality

Variable(s): Visualization and Summary Statistics
• Categorical
  • Visualization: bar chart, pie chart
  • Summary statistics: frequency table, relative frequency table, proportion
• Quantitative
  • Visualization: dotplot, histogram, boxplot
  • Summary statistics: mean, median, max, min, standard deviation, z-score, range, IQR, five number summary
• Categorical vs Categorical
  • Visualization: side-by-side bar chart, segmented bar chart
  • Summary statistics: two-way table, difference in proportions
• Quantitative vs Categorical
  • Visualization: side-by-side boxplots
  • Summary statistics: statistics by group, difference in means
• Quantitative vs Quantitative
  • Visualization: scatterplot
  • Summary statistics: correlation
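
As a rough illustration of the quantitative summaries listed above, the sketch below uses Python's standard `statistics` module. The data values are made up, and the quartile method is an assumption: textbooks and software packages use slightly different quartile conventions, so Q1 and Q3 may differ a little from a hand calculation.

```python
import statistics

# Hypothetical quantitative data (made up for illustration).
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)
median = statistics.median(data)
sd = statistics.stdev(data)            # sample standard deviation (n - 1 denominator)
data_range = max(data) - min(data)

# Quartiles via the "inclusive" method of statistics.quantiles;
# other conventions can give slightly different values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
five_number_summary = (min(data), q1, median, q3, max(data))

def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

print(five_number_summary, iqr, round(z_score(11), 2))
```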

Statistic vs Parameter
• A sample statistic is a number computed from sample data.
• A population parameter is a number that describes some aspect of a population.

Sampling Distribution
• A sampling distribution is the distribution of statistics computed for different samples of the same size taken from the same population
• The spread of the sampling distribution helps us to assess the uncertainty in the sample statistic
• In real life, we rarely get to see the sampling distribution; we usually have only one sample
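
A sampling distribution can be simulated when the whole population is available, which is only possible in a made-up setting like the one below. This is a minimal sketch: the population values, the sample size of 30, and the helper name `sampling_distribution` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population (made up for illustration).
population = [random.gauss(50, 10) for _ in range(10_000)]

def sampling_distribution(population, n, num_samples):
    """Means of many random samples of size n from the same population."""
    return [statistics.mean(random.sample(population, n))
            for _ in range(num_samples)]

means = sampling_distribution(population, n=30, num_samples=1000)

# The spread of these sample means reflects the uncertainty in one sample mean.
print(round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```

The sample means should center near the population mean and vary much less than the individual population values do.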

Bootstrap
• A bootstrap sample is a random sample taken with replacement from the original sample, of the same size as the original sample
• A bootstrap statistic is the statistic computed on the bootstrap sample
• A bootstrap distribution is the distribution of many bootstrap statistics
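
The three definitions above map directly onto a few lines of code. The sketch below is one possible rendering using the sample mean as the statistic; the sample values and the function names `bootstrap_sample` and `bootstrap_distribution` are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical original sample (made up for illustration).
original_sample = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]

def bootstrap_sample(sample):
    """Random sample WITH replacement, same size as the original sample."""
    return random.choices(sample, k=len(sample))

def bootstrap_distribution(sample, statistic, num_boot=1000):
    """Many bootstrap statistics, one per bootstrap sample."""
    return [statistic(bootstrap_sample(sample)) for _ in range(num_boot)]

boot_means = bootstrap_distribution(original_sample, statistics.mean)
print(len(boot_means), round(statistics.mean(boot_means), 2))
```

Because sampling is with replacement, a bootstrap sample usually repeats some original values and omits others.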

[Diagram: Original Sample (Sample Statistic); Bootstrap Sample, ..., Bootstrap Sample; Bootstrap Statistic, ..., Bootstrap Statistic; Bootstrap Distribution]

Confidence Interval
• A confidence interval for a parameter is an interval computed from sample data by a method that will capture the parameter for a specified proportion of all samples
• A 95% confidence interval will contain the true parameter for 95% of all samples
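
The "95% of all samples" interpretation can be checked by simulation when the true parameter is known. The sketch below draws many samples from a made-up population with a known mean and counts how often the interval captures it. The interval used here (mean ± 2 × s/√n) is an assumption of this sketch rather than something stated in this review; any method that produces 95% intervals could be substituted.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

TRUE_MEAN = 100  # known only because this is a simulation

def interval_from_sample(sample):
    """Rough 95% interval: mean +/- 2 * SE, with SE approximated by s / sqrt(n).
    (The s / sqrt(n) approximation is an assumption of this sketch.)"""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - 2 * se, m + 2 * se

num_samples = 2000
captured = 0
for _ in range(num_samples):
    # Hypothetical sample of size 40 from a made-up normal population.
    sample = [random.gauss(TRUE_MEAN, 15) for _ in range(40)]
    low, high = interval_from_sample(sample)
    if low <= TRUE_MEAN <= high:
        captured += 1

coverage = captured / num_samples  # proportion of intervals that captured the parameter
print(coverage)
```

The proportion of intervals capturing the true mean should land close to 0.95, though it will vary a little from run to run.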

Standard Error
• The standard error (SE) is the standard deviation of the sample statistic
• The SE can be estimated by the standard deviation of the bootstrap distribution
• For symmetric, bell-shaped distributions, a 95% confidence interval is statistic ± 2 × SE
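
A minimal sketch of estimating the SE from a bootstrap distribution, assuming the 95% interval takes the form statistic ± 2 × SE. The sample values are made up, and the sample mean is used as the statistic.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical original sample (made up for illustration).
original_sample = [23, 31, 27, 35, 29, 30, 26, 33, 28, 32, 25, 34]

# Bootstrap distribution of the sample mean.
boot_means = [
    statistics.mean(random.choices(original_sample, k=len(original_sample)))
    for _ in range(2000)
]

# SE estimated by the standard deviation of the bootstrap statistics.
se = statistics.stdev(boot_means)

sample_mean = statistics.mean(original_sample)
ci_95 = (sample_mean - 2 * se, sample_mean + 2 * se)
print(round(se, 2), tuple(round(x, 2) for x in ci_95))
```

This interval is only appropriate when the bootstrap distribution looks roughly symmetric and bell-shaped, so plotting `boot_means` first is a sensible check.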

Percentile Method
• If the bootstrap distribution is approximately symmetric, a P% confidence interval can be obtained by taking the middle P% of the bootstrap distribution
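
One way to take the "middle P%" is to sort the bootstrap statistics and cut an equal share from each tail. The sketch below does this with simple index arithmetic; the function name `percentile_interval` and the sample values are hypothetical, and statistical software may interpolate percentiles slightly differently.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is reproducible

# Hypothetical original sample (made up for illustration).
original_sample = [23, 31, 27, 35, 29, 30, 26, 33, 28, 32, 25, 34]

boot_means = [
    statistics.mean(random.choices(original_sample, k=len(original_sample)))
    for _ in range(1000)
]

def percentile_interval(boot_stats, level=95):
    """Endpoints of the middle `level` percent of the bootstrap statistics."""
    ordered = sorted(boot_stats)
    tail = (100 - level) / 2 / 100           # proportion cut from each tail
    low_index = int(tail * len(ordered))     # number of values dropped per tail
    high_index = len(ordered) - 1 - low_index
    return ordered[low_index], ordered[high_index]

print(percentile_interval(boot_means, 95))
print(percentile_interval(boot_means, 90))
```

A lower confidence level keeps a smaller middle portion, so the 90% interval is never wider than the 95% interval.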

Bootstrap Distribution
[Figure]

Hypothesis Testing
• How unusual would it be to get results as extreme as (or more extreme than) those observed, if the null hypothesis is true?
• If it would be very unusual, then the null hypothesis is probably not true!
• If it would not be very unusual, then there is no evidence against the null hypothesis

p-value
• The p-value is the probability of getting a statistic as extreme as (or more extreme than) that observed, just by random chance, if the null hypothesis is true
• The p-value measures evidence against the null hypothesis

Hypothesis Testing
[Figure]

Randomization Distribution
• A randomization distribution is the distribution of sample statistics we would observe, just by random chance, if the null hypothesis were true
• The p-value is calculated by finding the proportion of statistics in the randomization distribution that fall beyond the observed statistic
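
As one concrete case, the sketch below builds a randomization distribution for a difference in means between two randomly assigned groups, under a null hypothesis of no difference, by repeatedly reshuffling the group labels. The data, the function names, and the choice of a one-sided (upper-tail) p-value are all assumptions made for illustration.

```python
import random
import statistics

random.seed(5)  # fixed seed so the sketch is reproducible

# Hypothetical responses for two randomly assigned groups (made up).
treatment = [28, 31, 35, 30, 33, 36, 29, 34]
control = [25, 27, 30, 24, 29, 26, 28, 31]

def diff_in_means(a, b):
    return statistics.mean(a) - statistics.mean(b)

def randomization_distribution(a, b, num_sims=5000):
    """Differences in means after randomly reallocating values to the two groups."""
    pooled = a + b
    stats = []
    for _ in range(num_sims):
        shuffled = random.sample(pooled, len(pooled))  # random reallocation
        stats.append(diff_in_means(shuffled[:len(a)], shuffled[len(a):]))
    return stats

observed = diff_in_means(treatment, control)
rand_stats = randomization_distribution(treatment, control)

# One-sided p-value: proportion of simulated statistics at least as large as observed.
p_value = sum(s >= observed for s in rand_stats) / len(rand_stats)
print(observed, p_value)
```

For a two-sided alternative, one common approach is to count simulated statistics at least as far from zero as the observed one in either direction.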

Statistical Conclusions
• Strength of evidence against H0:
• Formal decision of hypothesis test, based on α = 0.05:

Formal Decisions
• For a given significance level, α:
  • p-value < α: Reject H0
  • p-value > α: Do not reject H0
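
The decision rule can be written as a small function. This is only a sketch: the name `formal_decision` is hypothetical, the default α = 0.05 is a common choice, and treating a p-value exactly equal to α as "do not reject" is an assumption.

```python
def formal_decision(p_value, alpha=0.05):
    """Formal decision of a hypothesis test at significance level alpha."""
    if p_value < alpha:
        return "Reject H0"
    # Assumption of this sketch: a p-value equal to alpha is not rejected.
    return "Do not reject H0"

print(formal_decision(0.003))
print(formal_decision(0.21))
print(formal_decision(0.03, alpha=0.01))
```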

Formal Decisions
“If the p-value is low, the Ho must go”

Errors
[Table: Truth (H0 true, H0 false) by Decision (Reject H0, Do not reject H0)]
• H0 true, decision "Reject H0": TYPE I ERROR
• H0 false, decision "Do not reject H0": TYPE II ERROR

QUESTIONS???