Lecture 18: Simple Linear Regression (Chapters 18.1-18.5)

Interaction plots in ANOVA
• It is a good idea to always look at the interaction plots when doing a two-way ANOVA, regardless of whether the test for interactions is significant. Interaction plots display the basic results of the study.
• If there really are no interactions, then the interaction plots will consist of approximately parallel lines.
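
As a rough illustration of the parallel-lines idea, the sketch below uses hypothetical cell means (not from the lecture) and checks whether the gap between the two lines of an interaction plot stays constant across the levels of the second factor. With real sample means the gap would only be approximately constant.

```python
import numpy as np

# Hypothetical cell means for a two-way layout (not from the lecture):
# rows = levels of factor A, columns = levels of factor B.
no_interaction = np.array([[10.0, 12.0, 15.0],
                           [14.0, 16.0, 19.0]])
with_interaction = np.array([[10.0, 12.0, 15.0],
                             [14.0, 13.0, 25.0]])

def line_gaps(cell_means):
    """Vertical gap between the two lines of an interaction plot at each level of B."""
    return cell_means[1] - cell_means[0]

print(line_gaps(no_interaction))    # constant gap: the lines are parallel
print(line_gaps(with_interaction))  # varying gap: the lines are not parallel
```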

Regression Analysis
• The goal: Estimate E(Y|X), the conditional mean of Y given X, based on a sample.
• Simple Linear Regression: Assumes E(Y|X) is a straight line in X.
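
A minimal sketch of how the straight line might be estimated from a sample, assuming the usual least-squares criterion (the slide does not name the estimation method). The function name and the data are hypothetical and serve only as illustration.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: forces the line through the point of means (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical illustrative data (not from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_simple_linear_regression(x, y)
print(b0, b1)  # estimated intercept and slope of E(Y|X)
```

The estimated conditional mean at any x is then b0 + b1 * x.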

Uses of Regression Analysis
• Descriptive. Describe the association between y and x in the population observed.
• Passive prediction. Predict y based on x where you do not plan to manipulate x, e.g., predict today’s stock price based on yesterday’s stock price.
• Control. Predict what y will be if you change x, e.g., predict what your earnings will be if you obtain different levels of education.

Lurking Variables
• A lurking variable is a variable that has an important effect on the relationship among the variables in a study but is not included among the variables studied.
• Examples:
  – Y = Salaries of Presbyterian ministers over time, X = Price of rum in Havana over time, Lurking Variable = Inflation rate over time.
  – Y = Pellagra rate in village, X = Number of flies in village, Lurking Variable = Amount of corn in diet.

Pitfalls in Regression Analysis
• (1) Descriptive: If using simple linear regression, need to make sure E(Y|X) is actually approximately a straight line.
• (2) Passive Prediction: Need to beware of the pitfall for (1), plus extrapolation and lurking variables.
• (3) Control: Need to beware of the pitfalls for (1) and (2), plus extra caution about lurking variables. Requires a cause-and-effect relationship, which is best established through a controlled experiment.

Example of Pitfall
• A researcher measures the number of television sets per person X and the average life expectancy Y for the world’s nations. The regression line has a positive slope: nations with many TV sets have higher life expectancies. Could we lengthen the lives of people in Rwanda by shipping them TV sets?
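
The pitfall can be reproduced in a small simulation. The sketch below is purely illustrative: it assumes a hypothetical lurking variable (labeled `wealth` here, which the slide does not name) that drives both TV ownership and life expectancy, while TV sets themselves have no effect. All numbers are simulated, not real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

def slope(x, y):
    """Least-squares slope of y on x."""
    xc = x - x.mean()
    return np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)

# Assumed lurking variable (hypothetical): national wealth, standardized.
wealth = rng.normal(size=n)

# Observational data: wealth drives both TV ownership and life expectancy;
# TV sets have no effect on life expectancy in this simulation.
tv_observed = wealth + rng.normal(scale=0.5, size=n)
life_observed = 2.0 * wealth + rng.normal(scale=0.5, size=n)
observed_slope = slope(tv_observed, life_observed)

# Controlled experiment: TV sets are assigned at random, independent of wealth.
tv_assigned = rng.normal(size=n)
life_experiment = 2.0 * wealth + rng.normal(scale=0.5, size=n)
experiment_slope = slope(tv_assigned, life_experiment)

print(observed_slope, experiment_slope)
```

Under these assumptions the observational slope is clearly positive, while the slope under random assignment is close to zero, which is why the regression supports passive prediction here but not control.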

Residual Plots Against Time
• Many lurking variables change systematically over time.
• Useful method for detecting lurking variables: Plot the residuals against the time order of the observations, if it is available. If a systematic pattern is found, an understanding of the background of the data might allow you to guess what the lurking variables are.
• Another useful residual plot: Plot residuals vs. location of observations.
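
A minimal sketch of the residuals-versus-time check, using hypothetical data (not from the lecture) in which the response shifts upward partway through the series. Instead of drawing the plot, it lists the residuals in time order and compares their averages before and after the shift; the same residuals could be passed to any plotting library.

```python
import numpy as np

# Hypothetical data in time order (not from the lecture). The response shifts
# upward by 3 units from time index 5 onward, mimicking a lurking variable
# that changes over time.
time = np.arange(10)
x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0])
y = 1.0 + 2.0 * x + 3.0 * (time >= 5)

# Least-squares fit of y on x, ignoring time.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Residuals listed in time order; these are the values one would plot against time.
for t, r in zip(time, residuals):
    print(t, round(float(r), 2))

# A systematic pattern: residuals are negative on average early, positive later.
early_mean = residuals[time < 5].mean()
late_mean = residuals[time >= 5].mean()
print(early_mean, late_mean)
```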

Residual Plot vs. Time Example
• Goal: Predict elementary mathematics enrollment (Y) at a college based on the number of freshman students (X).
• Linear Fit: Math enrollment = -283.8905 + 0.6511345 * Freshman students
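
To show how the fitted line reported above would be used for prediction, here is a small sketch. The coefficients come from the slide; the function name and the input of 4000 freshmen are hypothetical, and the prediction is only meaningful if that value lies within the range of the observed data.

```python
# Coefficients of the linear fit reported on the slide.
INTERCEPT = -283.8905
SLOPE = 0.6511345

def predict_math_enrollment(freshman_students):
    """Predicted elementary mathematics enrollment for a given number of freshmen."""
    return INTERCEPT + SLOPE * freshman_students

# Hypothetical input (not from the lecture): a class of 4000 freshmen.
predicted = predict_math_enrollment(4000)
print(round(predicted, 4))
```

The slope says that each additional freshman is associated with about 0.65 additional elementary mathematics enrollments.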

Residual Plot vs. Time
• The residual plot suggests that a change took place between 1994 and 1995 that caused a higher proportion of students to take math courses.
• In fact, one of the schools in the university changed its program in 1995 to require entering students to take another math course.
• Conclusion: The math dept. shouldn’t use data from before 1995 for predicting future enrollment.