[Seating chart, Harvill 150 (renumbered): screen, lecturer’s desk, Rows A through P, left-handed desks, table, projection booth]

Introduction to Statistics for the Social Sciences
SBS 200 - Lecture Section 001, Fall 2018
Room 150 Harvill Building
10:00 - 10:50 Mondays, Wednesdays & Fridays
11/28/18

A note on doodling

All remaining homework assignments are available on class website

Study Guide for Exam 4 is available on class website

Schedule of readings
Before our fourth and final exam (December 3rd)
OpenStax Chapters 1 – 13 (Chapter 12 is emphasized)
Plous
Chapter 17: Social Influences
Chapter 18: Group Judgments and Decisions

Over next couple of lectures (11/26/18)
• Logic of hypothesis testing with correlations
• Interpreting correlations and scatterplots
• Simple and multiple regression

Using correlation for predictions: r versus r²
• Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)
• Coefficient of correlation is the name for “r”
• Coefficient of determination is the name for “r²” (remember it is never negative, so it carries no direction info)
• Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line, like a standard deviation)
• Coefficient of regression is “b” for each variable (like slope)
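To make these four quantities concrete, here is a minimal sketch that computes each of them by hand. The five data points are hypothetical (they are not from the lecture), and the standard error uses n - 2 in the denominator, which is a common convention but an assumption here.

```python
import math

# Hypothetical paired observations (not from the lecture).
x = [1, 2, 3, 4, 5]   # predictor (independent) variable
y = [2, 4, 5, 4, 5]   # predicted (dependent) variable

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means.
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sp_xy / math.sqrt(ss_x * ss_y)   # coefficient of correlation
r_squared = r ** 2                   # coefficient of determination
b = sp_xy / ss_x                     # coefficient of regression (slope)
a = mean_y - b * mean_x              # Y-intercept

# Standard error of the estimate: typical distance of the dots from the line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
std_error = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(round(r, 3), round(r_squared, 3), round(b, 3), round(a, 3), round(std_error, 3))
```

Note that r keeps its sign (direction of the relationship) while r² does not.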

Pick up past assignments
Optional Labs this week

Correlation: Independent and dependent variables
• When used for prediction we refer to the predicted variable as the dependent variable and the predictor variable as the independent variable
What are we predicting? (Label each variable in the examples as the Dependent Variable or the Independent Variable.)

Correlation - What do we need to define a line?
• Y-intercept = “a” (also “b0”): where the line crosses the Y axis
• Slope = “b” (also “b1”): how steep the line is
[Graph: Yearly Income (Y axis) against Expenses per year (X axis). If you spend this much, you probably make this much.]
• The predicted variable goes on the “Y” axis and is called the dependent variable
• The predictor variable goes on the “X” axis and is called the independent variable

Angelina Jolie Buys Brad Pitt a $24 million Heart-Shaped Island for his 50th Birthday
Dustin spends $12 for his Birthday
[Graph: Yearly Income (Y axis) against Expenses per year (X axis). Angelina spent this much, so Angelina probably makes this much; Dustin spent this much, so Dustin probably makes this much.]
Revisit this slide

Assumptions Underlying Linear Regression
• For each value of X, there is a group of Y values
• These Y values are normally distributed
• The means of these normal distributions of Y values all lie on the straight line of regression
• The standard deviations of these normal distributions are equal
Revisit this slide

Correlation - the prediction line - what is it good for?
Prediction line
• makes the relationship easier to see (even if specific observations - dots - are removed)
• identifies the center of the cluster of (paired) observations
• identifies the central tendency of the relationship (kind of like a mean)
• can be used for prediction
• should be drawn to provide a “best fit” for the data
• should be drawn to provide maximum predictive power for the data
• should be drawn to provide minimum predictive error
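One common way to make “minimum predictive error” precise is least squares: pick the line whose squared prediction errors add up to the smallest total. The sketch below assumes that criterion and uses hypothetical data (not from the lecture); the function names are illustrative only.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def sum_squared_error(intercept, slope, xs, ys):
    """Total squared distance between each observed Y and the line's prediction."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical paired observations (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = sum_squared_error(a, b, xs, ys)
shifted = sum_squared_error(a + 0.5, b, xs, ys)   # same slope, line moved up
steeper = sum_squared_error(a, b + 0.2, xs, ys)   # same intercept, steeper line

print(best, shifted, steeper)
```

Any line other than the fitted one, whether shifted or tilted, gives a larger total squared error on the same dots.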

Predicting Restaurant Bill
Prediction line: Y’ = a + b1X1 (a = Y-intercept, b1 = slope)
Cost = 15.22 + 19.96 Persons
[Graph: Cost (Y axis) against People (X axis), with the prediction line]

If “Persons” = 4, what is the prediction for “Cost”?
Cost = 15.22 + 19.96 Persons
Cost = 15.22 + 19.96 (4)
Cost = 15.22 + 79.84 = 95.06
The expected cost for dinner for two couples (4 people) would be $95.06.

If “Persons” = 1, what is the prediction for “Cost”?
Cost = 15.22 + 19.96 Persons
Cost = 15.22 + 19.96 (1)
Cost = 15.22 + 19.96 = 35.18
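The same substitution can be written as a tiny function. This is only a sketch: the intercept and slope are the values from the slide, while the function and parameter names are made up for illustration.

```python
def predict_cost(persons, intercept=15.22, slope=19.96):
    """Predicted restaurant bill: Y' = a + b1 * X1."""
    return intercept + slope * persons

for persons in (4, 1):
    print(persons, round(predict_cost(persons), 2))
```

Each extra person adds the slope (19.96) to the predicted bill, and the intercept (15.22) is the predicted cost when “Persons” is 0.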

Predicting Rent
Prediction line: Y’ = a + b1X1 (a = Y-intercept, b1 = slope)
Rent = 150 + 1.05 Sq.Ft
[Graph: Cost (Y axis) against Square Feet (X axis), with the prediction line]

If “Sq.Ft” = 800, what is the prediction for “Rent”?
Rent = 150 + 1.05 Sq.Ft
Rent = 150 + 1.05 (800)
Rent = 150 + 840 = 990
The expected cost for rent on an 800 square foot apartment is $990.

If “Sq.Ft” = 2500, what is the prediction for “Rent”?
Rent = 150 + 1.05 Sq.Ft
Rent = 150 + 1.05 (2500)
Rent = 150 + 2625 = 2,775

We refer to the predicted variable as the dependent variable (Y) and the predictor variable (X) as the independent variable.
Why are we finding the regression line? How would we use it?
[Graph labels: regression coefficient (slope); correlation coefficient (“r”)]

What variable are we predicting?
a. Height of Boys in 1990 (cm) b. Age of boys in 1990 c. Both height and age of boys in 1990
[Graph labels: regression coefficient (“b”) or (slope); correlation coefficient (“r”); r² = coefficient of determination (“amount of variance accounted for”)]

If a boy is 8 years old how tall would we predict he would be?
Complete prediction “by eye” looking at graph?
a. 40 cm b. 80 cm c. 120 cm d. 160 cm
Just for fun let’s do the math: Y’ = 7.1521 (8) + 64.476 ≈ 122
[Graph labels: regression coefficient (“b”) or (slope); correlation coefficient (“r”); coefficient of determination (“r²”) (“amount of variance accounted for”)]

If a boy is 2 years old how tall would we predict he would be?
Complete prediction “by eye” looking at graph?
a. 40 cm b. 80 cm c. 120 cm d. 160 cm
Just for fun let’s do the math: Y’ = 7.1521 (2) + 64.476 ≈ 78.8
[Graph labels: regression coefficient (“b”) or (slope); correlation coefficient (“r”); coefficient of determination (“r²”) (“amount of variance accounted for”)]
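A quick way to check the “by eye” answers is to compute the prediction and see which answer choice it lands nearest. The sketch below assumes the regression equation on the slide reads Y’ = 7.1521 X + 64.476 (the slide text is hard to read, so treat both coefficients as assumptions); the helper names are illustrative.

```python
# Coefficients as read from the slide; treat both as assumptions.
SLOPE_CM_PER_YEAR = 7.1521
INTERCEPT_CM = 64.476

def predict_height(age_years):
    """Predicted height in cm for a boy of the given age: Y' = bX + a."""
    return INTERCEPT_CM + SLOPE_CM_PER_YEAR * age_years

def closest_option(value, options):
    """Return the answer choice nearest to the computed prediction."""
    return min(options, key=lambda option: abs(option - value))

options_cm = [40, 80, 120, 160]
for age in (8, 2):
    height = predict_height(age)
    print(age, round(height, 1), closest_option(height, options_cm))
```

Under these assumed coefficients, the computed predictions fall nearest to 120 cm for an 8-year-old and 80 cm for a 2-year-old.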

What variable are we predicting?
a. Size of state (square miles) b. Number of letters in name of state c. Both size of state and number of letters
[Graph labels: regression coefficient (“b”) or (slope); correlation coefficient (“r”); coefficient of determination (“r²”) (“amount of variance accounted for”)]

If a state has 7 letters in the name (like Arizona) how large would we predict the state to be?
Complete prediction “by eye” looking at graph?
a. 20,000 square miles b. 30,000 square miles c. 40,000 square miles d. 50,000 square miles
Just for fun let’s do the math: Y’ = -2,561.5 (7) + 67,884 ≈ 49,953
Residual is the measure of error for any one data point: y - y’
114,000 - 50,000 = 64,000
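The residual calculation can be sketched the same way. The coefficients below are as read from the slide (Y’ = -2,561.5 X + 67,884, an assumption given the slide’s legibility), and 114,000 square miles is the actual size used on the slide for Arizona; variable names are illustrative.

```python
def predict_size(letters, intercept=67_884.0, slope=-2_561.5):
    """Predicted state size in square miles from letters in its name."""
    return intercept + slope * letters

actual_size = 114_000                 # observed value y, from the slide
predicted = predict_size(7)           # y' from the regression equation
by_eye_prediction = 50_000            # y' read off the graph (option d)

residual_from_equation = actual_size - predicted          # y - y'
residual_by_eye = actual_size - by_eye_prediction         # y - y'

print(predicted, residual_from_equation, residual_by_eye)
```

A positive residual means the actual value sits above the regression line, so the line under-predicted that data point.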

What variable are we predicting?
a. Size of TV (inches) b. Sales price of TV ($) c. Both sales price and size of TV
[Graph labels: regression coefficient (“b”) or (slope); correlation coefficient (“r”); coefficient of determination (“r²”) (“amount of variance accounted for”)]

If a TV is 55 inches what would we predict it to cost?
Complete prediction “by eye” looking at graph?
a. $1,500 b. $1,725 c. $2,000 d. $2,225
Just for fun let’s do the math: Y’ = 55.044 (55) – 762.25 ≈ $2,265

If a TV is 40 inches what would we predict it to cost?
Complete prediction “by eye” looking at graph?
a. $1,500 b. $1,725 c. $2,000 d. $2,225
Just for fun let’s do the math: Y’ = 55.044 (40) – 762.25 ≈ $1,439.51

What variable are we predicting? a. Amount of Wine Consumed b. Death Rate in the Country c. Both Amount of Wine Consumed and Death Rate

If a country consumes an average of 8 liters (per capita) what would we predict the death rate from heart disease to be?
Complete prediction “by eye” looking at graph?
a. 50 b. 75 c. 100 d. 125
Just for fun let’s do the math: Y’ = -23.878 (8) + 266.63 ≈ 75.6