Least Squares Regression Line - Lesson 3.2 (skewthescript)
- Slides: 135
Least Squares Regression Line - Lesson 3.2 skewthescript.org
Do schools in America effectively equalize opportunity for low-income students?
Today’s Key Analysis: Will raising attendance also raise scores for low-income students?
Lesson 3.2 Guided Notes Handout: skewthescript.org/3-2
Topics
1. Least squares regression line (LSRL)
2. Slope and y-intercept
3. Predictions using the LSRL
4. Dangers of prediction
The Income Achievement Gap: Nationally, low-income students perform worse on math exams than middle- and upper-class students. Source: NAEP (National Assessment of Educational Progress) Long-Term Trend Mathematics scores. Image scale based on the NAEP online chart tool. According to NAEP, the differences are statistically significant.
The Income Attendance Gap: Nationally, low-income areas have more chronically absent students. NYC: High-poverty areas have lower attendance rates. Graphic from Nauer et al., “A Better Picture of Poverty,” Center for New York City Affairs, Nov. 2014: https://www.attendanceworks.org/wp-content/uploads/2017/06/Better.Pictureof.Poverty_PA_FINAL_001.pdf
Correlation?!
Is attendance the solution? Diagram labels: Poverty; Low Attendance (raised to High); Low Scores (raised to High?). Will raising attendance also raise scores?
Linear Regression: Random sample of students who took the Texas STAAR state Algebra 1 assessment and their attendance rate during the school year.
Explanatory (x, "explains"): Attendance. Response (y, "responds"): Test Scores.
Percent Attendance (x)    STAAR Algebra 1 Raw Score (y)
95                        45
89                        42
67                        31
98                        51
99                        49
76                        38
92                        46
91                        41
76                        35
85                        39
82                        37
D: Positive; O: None; F: Linear; S: Strong
Superintendent: “If a student meets our minimum attendance goal (87%), what would their predicted test score be?”
How do we find the least squares regression line (LSRL)?
Least Squares Regression: Model A vs. Model B. Which model fits better?
Residual [figure: residuals drawn at x = 7 with labeled points (7, 5), (7, 3), (7, 4), and at x = 9 with labeled points (9, 4), (9, 3), (9, 7)]
First thought: The line of best fit is the line that minimizes the sum of the residuals.
Example with residuals 0 and 0: Sum of residuals: 0 + 0 = 0. Good fit.
Example with residuals +2, 0, and -2: Sum of residuals: 2 + 0 + (-2) = 0. Good fit?!
Sum of squared residuals: 4 + 0 + 4 = 8. Not a great fit.
Sum of squared residuals: 0 + 0 = 0. Great fit!
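The comparison above can be checked numerically. What follows is a minimal Python sketch, not part of the original slides. The function names are illustrative, and the residual lists are simply the values shown in the two examples (0 and 0; +2, 0, and -2).

```python
def sum_of_residuals(residuals):
    """Plain sum: positive and negative misses cancel each other out."""
    return sum(residuals)


def sum_of_squared_residuals(residuals):
    """Sum of squares: every miss adds to the total, whatever its sign."""
    return sum(r ** 2 for r in residuals)


# Residual values taken from the two slide examples.
first_example = [0, 0]
second_example = [2, 0, -2]

for name, residuals in [("first", first_example), ("second", second_example)]:
    print(name,
          "sum:", sum_of_residuals(residuals),
          "sum of squares:", sum_of_squared_residuals(residuals))
```

Both examples have a plain sum of 0, so the plain sum cannot tell them apart. Only the sum of squares (0 versus 8) separates the good fit from the poor one.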
Least Squares Regression Line (LSRL): a linear model that minimizes the sum of the squared residuals between the data and the model.
• Also called “line of best fit”
How do we find the least squares regression line (LSRL)?
Calculator steps in later lesson!
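Although the lesson defers the calculator steps, the LSRL for the attendance sample can be sketched in Python. This sketch assumes the standard closed-form formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x), which the slides do not show. The helper name fit_lsrl is hypothetical. The data are the 11 students from the STAAR sample earlier in the lesson.

```python
# Percent attendance (x) and STAAR Algebra 1 raw score (y) for the 11 students.
attendance = [95, 89, 67, 98, 99, 76, 92, 91, 76, 85, 82]
scores = [45, 42, 31, 51, 49, 38, 46, 41, 35, 39, 37]


def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least squares regression line.

    Assumes the standard closed form: slope = Sxy / Sxx and
    intercept = y_bar - slope * x_bar.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_lsrl(attendance, scores)
print(f"predicted score = {intercept:.3f} + {slope:.4f} * attendance")
```

Under these assumptions the fitted values round to the slope (0.5669) and y-intercept (-7.689) that the lesson interprets later.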
Topic 2: Slope and y-intercept
Every mathematician’s favorite one-liner: y = mx + b
Example: y = 2x + 1
Slope? What does it mean? Slope: 2. For every increase in x by 1 unit, the y-values increase by 2.
Y-intercept? What does it mean? Y-intercept: 1. When x = 0, the y-value is 1.
LSRL equation, with labeled parts: predicted y-value = y-intercept + slope × x
For the attendance data: predicted test score = -7.689 + 0.5669 × (percent attendance)
1) Interpret the slope value…
Template: For every 1-unit increase in [explanatory variable], our model predicts an average increase/decrease of [slope] in [response variable].
Example: For every 1 percentage point increase in attendance, our model predicts an average increase of 0.5669 points in students’ test scores.
2) Interpret the y-intercept…
Template: When the [explanatory variable] is zero units, our model predicts that the [response variable] would be [y-intercept].
Example: When attendance is zero percent, our model predicts that the students’ test scores would be -7.689.
This y-intercept value is not meaningful, since anyone with 0% attendance doesn’t really go to the school or take the exam.
Topic 3: Predictions using the LSRL
Superintendent: “If a student meets our minimum attendance goal (87%), what would their predicted test score be?”
Warm-up with y = mx + b: for y = 2x + 1, find the y-value when x = 2: y = 2(2) + 1 = 5.
For the superintendent’s question, substitute x = 87 into the LSRL.
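The superintendent's prediction can be worked out directly from the slope and y-intercept interpreted earlier in the lesson (0.5669 and -7.689). The Python sketch below is an illustration rather than the slides' own calculator procedure. The function name predict_score is hypothetical, and the coefficients are the rounded values from the lesson.

```python
# Rounded LSRL coefficients quoted in the lesson.
INTERCEPT = -7.689  # predicted score at 0% attendance
SLOPE = 0.5669      # predicted change in score per 1 percentage point of attendance


def predict_score(attendance_percent):
    """Predicted STAAR raw score for a given percent attendance."""
    return INTERCEPT + SLOPE * attendance_percent


# Minimum attendance goal from the superintendent's question.
print(round(predict_score(87), 2))  # -7.689 + 0.5669 * 87, about 41.63
```

With these rounded coefficients, a student at the 87% attendance goal has a predicted raw score of roughly 41.63.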
Topic 4: Dangers of prediction
Is attendance the solution? Diagram labels: Poverty; Low Attendance (raised to High); Low Scores (raised to High?). Will raising attendance also raise scores?
Of course attending school helps!
Fixing attendance: In the past several years, superintendents have piloted large-scale (and sometimes quite expensive) initiatives to improve student attendance. These included:
1. Call programs for chronically absent students
2. Hiring attendance case managers and coordinators
3. Using Uber/Lyft for students with transportation issues
The result (charts: Attendance; Test Scores):
1. Test scores overall did not rise significantly
2. The achievement gaps remained
Source: Pyne, Grodsky, et al. (2018). What Happens When Children Miss School? Unpacking Elementary School Absences in MMSD. Madison, WI: Madison Education Partnership.
Lesson 3.2 Discussion
Discussion Question: Why didn’t raising attendance work? Diagram labels: Poverty; High/Low Attendance; High/Low Scores.
Correlation ≠ Causation
Is attendance the solution? Diagram labels: Poverty; Low Attendance (raised to High); Low Scores. Labeled "Cause": Hunger; Worse Schools; Less Study Time (job, taking care of family).
Lesson 3.2 Practice
1) Dr. Youfa Wang at the University of North Carolina published a study* on obesity in America. Using linear regression, the study concluded that by 2048, if trends continue, 100% of Americans would be overweight. Using the graphs** below, do you believe this conclusion is correct? Why or why not?
[Graphs: "Data and Model"; "Model Projection"]
Example inspired by Ellenberg, J., How Not to Be Wrong, pp. 50-61.
*Wang, Beydoun, et al., “Will all Americans become overweight or obese? estimating the progression and cost of the US obesity epidemic.” Obesity (Silver Spring). 2008; 16(10): 2323-2330. doi: 10.1038/oby.2008.351
**Graphs provided are representative approximations of analyses from the paper.
Extrapolation: using your model to make predictions outside the range of your data.
Extrapolation is dangerous because the trend may not continue.
This logarithmic model is more reasonable: it’s improbable that every single person in the population would become overweight.
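The definition of extrapolation can be illustrated with the attendance model from earlier in the lesson. The Python sketch below is an assumption-laden illustration, not something shown on the slides. The helper name predict_with_range_check is hypothetical, the coefficients are the lesson's rounded values, and the range limits (67 and 99) are the smallest and largest attendance values in the STAAR sample.

```python
# Rounded LSRL coefficients quoted in the lesson.
INTERCEPT = -7.689
SLOPE = 0.5669

# Smallest and largest percent attendance observed in the sample.
X_MIN, X_MAX = 67, 99


def predict_with_range_check(x):
    """Return (prediction, is_extrapolation) for percent attendance x.

    is_extrapolation is True when x lies outside the observed data range,
    which means the linear trend is not supported by any data at that x.
    """
    prediction = INTERCEPT + SLOPE * x
    is_extrapolation = not (X_MIN <= x <= X_MAX)
    return prediction, is_extrapolation


for x in (87, 0):
    prediction, is_extrapolation = predict_with_range_check(x)
    label = "extrapolation" if is_extrapolation else "within data range"
    print(f"attendance {x}%: predicted score {prediction:.2f} ({label})")
```

At 0% attendance the model returns a negative score, the same y-intercept value the lesson calls not meaningful. That is the kind of result extrapolation can produce.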
Side note: A logarithmic fit may be better for the grocery store data from last lesson. Presumably, the grocery chain has an upper limit on the organic options it can provide from its supply chains.