Chapter 5: Regression
- Slides: 20
Chapter 5 Regression 6/19/2021 Chapter 5 1
Regression
• Like correlation, regression addresses the relationship between a quantitative explanatory variable (X) and a quantitative response variable (Y)
• The objective of regression is to describe the best-fitting line through the data
• As with correlation, start by looking at the data with a scatterplot
Same data as last week

Country        Per Capita GDP (X)   Life Expectancy (Y)
Austria        21.4                 77.48
Belgium        23.2                 77.53
Finland        20.0                 77.32
France         22.7                 78.63
Germany        20.8                 77.17
Ireland        18.6                 76.39
Italy          21.5                 78.51
Netherlands    22.0                 78.15
Switzerland    23.8                 78.99
UK             21.2                 77.37
Inspect scatterplot for linearity
The Regression Line
The regression line predicts values of Y with this equation (the “regression model”):
ŷ = a + b∙X
where:
ŷ ≡ predicted value of Y at a given X
a ≡ intercept
b ≡ slope
a and b are called regression coefficients
Calculation of slope & intercept
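The formulas themselves are not reproduced in this transcript. As a sketch, the standard least-squares formulas for the line ŷ = a + b∙X can be written as follows. The symbols r (correlation), s_X and s_Y (standard deviations), and x̄ and ȳ (means) are assumed names and may differ from the ones used on the original slide.

```latex
b = r \, \frac{s_Y}{s_X}
  = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b \, \bar{x}
```

The second form of b is algebraically the same as the first; it is often more convenient when r has not yet been computed.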
Example: calculation of regression coefficients
Using the summary statistics calculated last week:
ŷ = a + b∙X = 68.716 + 0.420∙X
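The coefficients can be checked directly from the ten data pairs. The following is a minimal sketch in plain Python, assuming the deviation form of the least-squares formulas; the function name `fit_line` and the variable names are illustrative and do not come from the slides.

```python
# Per capita GDP (X) and life expectancy (Y) for the ten countries in the data table.
gdp = [21.4, 23.2, 20.0, 22.7, 20.8, 18.6, 21.5, 22.0, 23.8, 21.2]
life_exp = [77.48, 77.53, 77.32, 78.63, 77.17, 76.39, 78.51, 78.15, 78.99, 77.37]


def fit_line(x, y):
    """Return (a, b): intercept and slope of the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of the deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b


a, b = fit_line(gdp, life_exp)
print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
```

The result should agree with the coefficients above to within rounding; small differences in the last digit of the intercept depend on whether the slope is rounded before the intercept is computed.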
Regression Coefficients by Calculator
This course supports the TI-30XIIS. Other calculators are acceptable but are not supported by the instructor.
BEWARE! The TI-30XIIS labels the coefficients the opposite way from the notation used here: it reports the slope as a and the intercept as b. In this chapter, a is the intercept and b is the slope.
Interpretation of Slope b
• The slope is the predicted change in Y per unit increase in X.
• Example: ŷ = 68.7 + 0.42∙X
• The slope = 0.42
• Each unit increase in X (GDP) is associated with a 0.42 increase in predicted Y (life expectancy)
Interpretation: Intercept a
• The intercept is where the line would pass through the Y-axis (when X = 0).
• Example: ŷ = 68.7 + 0.42∙X
• The intercept = 68.7.
• We do NOT normally interpret the intercept, because X = 0 lies outside the range of the observed data
Regression Line for Prediction
• Use the regression equation to predict Y given X
• Example: ŷ = 68.716 + (0.420)X
• What is the predicted life expectancy in a country with a GDP of 20.0?
ŷ = a + b∙X = 68.716 + (0.420)(20.0) = 77.12
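The prediction step, and the meaning of the slope, can be sketched in a few lines of Python. The coefficients are the ones from the worked example; the function name `predict` is an illustrative choice, not something defined in the slides.

```python
A = 68.716  # intercept from the worked example
B = 0.420   # slope from the worked example


def predict(x, a=A, b=B):
    """Predicted Y (life expectancy) at a given X (per capita GDP)."""
    return a + b * x


y_20 = predict(20.0)
y_21 = predict(21.0)

print(round(y_20, 2))         # predicted life expectancy at GDP = 20.0
print(round(y_21 - y_20, 2))  # change in prediction per one-unit increase in X
```

The difference between two predictions one unit of X apart is the slope, which is the interpretation given for b.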
Coefficient of Determination
Denoted r² (the square of r)
Interpretation: fraction of the variation in Y “explained” by X
Illustration: Our example showed r = .809. Therefore, r² ≈ 0.66 (squaring the rounded value gives .809² = 0.654; carrying more decimals of r gives 0.655).
Interpretation: about 66% of the variation in Y (life expectancy) is mathematically “explained” by X (GDP)
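As a check on the illustration, r and r² can be computed from the same ten data pairs. This is a minimal sketch using the deviation formula for the correlation coefficient; the function name `correlation` is an assumption made for illustration.

```python
import math

# Per capita GDP (X) and life expectancy (Y) for the ten countries in the data table.
gdp = [21.4, 23.2, 20.0, 22.7, 20.8, 18.6, 21.5, 22.0, 23.8, 21.2]
life_exp = [77.48, 77.53, 77.32, 78.63, 77.17, 76.39, 78.51, 78.15, 78.99, 77.37]


def correlation(x, y):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = correlation(gdp, life_exp)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Squaring the unrounded r avoids the small rounding discrepancy that appears when .809 is squared by hand.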
Cautions about regression
1. Describes linear relationships only (see prior lecture)
2. Influenced by outliers
3. Should not be extrapolated beyond the observed range of X
4. Association is not equal to causation! (Beware of lurking variables.)
Outliers and Influential Points
• An outlier is an observation that lies far from the regression line
• Outliers in the Y direction have large residuals
• Outliers in the X direction are often influential: removing them can change the fitted line noticeably
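The effect of an influential point can be illustrated by fitting the line with and without it. The sketch below uses made-up numbers chosen only for illustration (they are not the course data): five points on an exact line, plus one point far out in the X direction.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: the first five points lie exactly on y = 2x;
# the last point (x = 20) is an outlier in the X direction.
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 6, 8, 10, 10]

slope_all = slope(x, y)                # all six points
slope_trimmed = slope(x[:-1], y[:-1])  # outlier removed

print(f"slope with outlier:    {slope_all:.2f}")
print(f"slope without outlier: {slope_trimmed:.2f}")
```

A single point far from the others in X pulls the line toward itself, so the slope with the point included is much flatter than the slope of the remaining five points.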
Example: Influential Outlier
[Figure: Gesell Adaptive Score and “First Word”, showing the regression line for all data and the line after removing child 18]
Extrapolation
• Extrapolation is the use of the regression equation for predictions outside the observed range of the explanatory variable X
• Do NOT extrapolate!
• See next slide
Example: extrapolation (Sarah’s height)
• Figure: Sarah’s height from age 36 to 60 months (3 to 5 years)
• Regression model: ŷ = 72 + 0.4∙X
• To predict Sarah’s height at 42 months: ŷ = 72 + 0.4(42) = 88.8 cm ≈ 35” (~3’)
Example: Extrapolation
• Do NOT use the regression model to predict Sarah’s height at age 360 months (30 years)!
• ŷ = 72 + 0.4∙X = 72 + 0.4(360) = 216 cm, more than 7’ tall (clearly ridiculous)
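One way to make the "do not extrapolate" rule concrete is to have the prediction function refuse ages outside the range the model was fitted on. This is a sketch under the assumption that the model was fitted on ages 36 to 60 months, as in the figure; the function name `predict_height` and the range constants are illustrative.

```python
AGE_MIN = 36  # months, lower end of the fitted range
AGE_MAX = 60  # months, upper end of the fitted range


def predict_height(age_months):
    """Predicted height in cm from y-hat = 72 + 0.4*X, inside the fitted range only."""
    if not AGE_MIN <= age_months <= AGE_MAX:
        raise ValueError(
            f"age {age_months} months is outside the fitted range "
            f"{AGE_MIN}-{AGE_MAX}; refusing to extrapolate"
        )
    return 72 + 0.4 * age_months


print(predict_height(42))  # inside the range: a sensible prediction

try:
    predict_height(360)    # 30 years: far outside the range
except ValueError as err:
    print(err)
```

The guard does not make the model any better; it only prevents the equation from being applied where there is no data to support it.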
Association does not imply causation
Even strong correlations may be non-causal
See pp. 144–145 for examples!
Association does not imply causation
Criteria to establish causation (pp. 144–146):
• Strength of relationship
• Experimentation
• Consistency
• Dose-response
• Temporality
• Plausibility