S1 Chapter 6 Correlation www.drfrostmaths.com

S1 :: Chapter 6 Correlation. www.drfrostmaths.com. Dr J Frost (jfrost@tiffin.kingston.sch.uk). Last modified: 20th January 2016

Recap of correlation: Correlation gives the strength of the relationship (and the type of relationship) between two variables. Type of correlation: in "weak positive correlation", "weak" is the strength and "positive" is the type. [Four scatter plots: Maths Score against English Score (weak positive correlation); Weekly time on internet (hours) against Age (weak negative correlation); Cost of train fare against Distance travelled (km) (strong positive correlation); Crime Rate against Number of people in city called 'Dave' (no correlation).]

Formula based on definition: ? Simplified formula: ? ? Bro Exam Tip: Given in formula booklet, but useful to memorise.
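As a hedged sketch, assuming the slide refers to the standard summary statistics written S_xx, S_yy and S_xy (this notation is an assumption based on the usual S1 formula booklet convention, not something recoverable from the slide text), the definition-based form and the simplified form of each would be:

```latex
\begin{align*}
S_{xx} &= \sum (x - \bar{x})^2 = \sum x^2 - \frac{\left(\sum x\right)^2}{n} \\
S_{yy} &= \sum (y - \bar{y})^2 = \sum y^2 - \frac{\left(\sum y\right)^2}{n} \\
S_{xy} &= \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{\sum x \sum y}{n}
\end{align*}
```

The simplified forms need only the totals of x, y, x², y² and xy, which is why they are the ones normally used in calculations.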

(this won’t be tested in an exam but is intended to provide background) Covariance: We understand variance as ‘how much a variable varies’. We can extend variance to two variables: we might be interested in how one variable varies with another. [Scatter plot: Cost of train fare against Distance travelled (km).]
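The idea of extending variance to two variables can be sketched in Python. This is a minimal illustration, assuming the population form of covariance (dividing by n); the function name `covariance` and the data values are hypothetical and do not come from the slides.

```python
def covariance(xs, ys):
    """Population covariance: the mean product of deviations from each mean."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical illustrative data (not taken from the slides).
distance = [1, 2, 3, 4, 5]
fare = [2, 4, 5, 4, 5]

cov_xy = covariance(distance, fare)     # positive: fare tends to rise with distance
var_x = covariance(distance, distance)  # covariance of a variable with itself is its variance

print(cov_xy)
print(var_x)
```

A positive result suggests the two variables tend to increase together, a negative result suggests one tends to decrease as the other increases, and passing the same list twice recovers the ordinary variance.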

Covariance: Comment on the covariance between the variables. ? ?

! Simplified formula ? ? ?

Product Moment Correlation Coefficient (PMCC) ! Have an intelligent guess based on the discussion above. ? We’ll interpret what that means in a second.
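For reference, a sketch of the usual definition of the PMCC, assuming it is denoted r and built from the summary statistics S_xx, S_yy and S_xy (the symbols are assumptions, since the slide's own formula is not present in the text):

```latex
r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}
```

Dividing by the square root of S_xx S_yy rescales the co-variation of x and y by the spread of each variable, so the result has no units and always lies between -1 and 1.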

Interpreting the PMCC: We’ve seen the PMCC varies between -1 and 1. r = 1 means perfect positive correlation. r = 0 means no correlation. r = -1 means perfect negative correlation.
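The three benchmark values of the PMCC (1, 0 and -1) can be checked numerically. The sketch below assumes the usual summary-statistic definition of the PMCC; the helper name `pmcc` and the small data sets are hypothetical, chosen only so that the points lie exactly on a rising line, exactly on a falling line, or show no linear trend.

```python
from math import sqrt


def pmcc(xs, ys):
    """PMCC from summary statistics: S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4]

r_up = pmcc(xs, [3, 5, 7, 9])      # points on y = 2x + 1: perfect positive
r_down = pmcc(xs, [9, 7, 5, 3])    # points on y = 11 - 2x: perfect negative
r_none = pmcc(xs, [1, -1, -1, 1])  # symmetric pattern with no linear trend

print(r_up, r_down, r_none)
```

Note that a value of 0 only rules out a linear relationship; the third data set has a clear pattern, just not a straight-line one.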

Interpreting the PMCC [Four scatter plots: Maths Score against English Score; Weekly time on internet (hours) against Age; Cost of train fare against Distance travelled (km); Crime Rate against Number of people in city called 'Dave'.]

Example

Baby   A     B     C     D     E     F
?      31.1  33.3  30.0  31.5  35.0  30.2
?      36    37    38    38    40    40
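The calculation for this example can be sketched in Python using the six pairs of values above. Because the row labels are not present in the text, the first set of values is called `x` and the second `y` here; those names, and the use of the simplified summary-statistic formulas, are assumptions.

```python
from math import sqrt

# Values for babies A to F; the variable names x and y are assumed labels.
x = [31.1, 33.3, 30.0, 31.5, 35.0, 30.2]
y = [36, 37, 38, 38, 40, 40]
n = len(x)

# Totals needed by the simplified formulas.
sum_x = sum(x)
sum_y = sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_yy = sum(yi * yi for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Summary statistics.
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.196
```

A value of about 0.196 indicates weak positive correlation between the two measurements.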

Let’s do it on our calculators!

Baby   A     B     C     D     E     F
?      31.1  33.3  30.0  31.5  35.0  30.2
?      36    37    38    38    40    40

Test Your Understanding: June 2013 Q1 ? ? ?

Further Practice: Quite often the values are given to you in an exam. ? ?

Interpreting the PMCC: “Interpret” vs “State”. In general in Statistics exams, the word ‘interpret’ means “explain in context using non-statistical language”. A bad answer (that may or may not be accepted): “Strong negative correlation” (this is stating the correlation, not interpreting it). A good answer: “As the waiting time increases, the customer satisfaction tends to decrease”.

Exam Questions (on provided sheet) Q1 ? ? ?

(Before you go on to Q2) Effects of coding: ? ? Unaffected!

Example

Original data:
1020  1032  1028  1034  1023  1038
320   335   345   355   360   380

Coded data:
0     12    8     14    3     18
4     7     9     11    12    16

We can now just find the PMCC of this new data set, and no further adjustment is needed.
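The claim that coding leaves the PMCC unchanged can be checked with a short sketch. The coding rules used here, subtracting 1020 from the first variable and taking (value - 300) / 5 for the second, are inferred from the numbers in the example rather than stated in the text; the names `x`, `y`, `p`, `q` and `pmcc` are hypothetical.

```python
from math import sqrt


def pmcc(xs, ys):
    """PMCC from summary statistics: S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    s_xx = sum(a * a for a in xs) - sum(xs) ** 2 / n
    s_yy = sum(b * b for b in ys) - sum(ys) ** 2 / n
    s_xy = sum(a * b for a, b in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / sqrt(s_xx * s_yy)


# Original data from the example (variable names are assumed).
x = [1020, 1032, 1028, 1034, 1023, 1038]
y = [320, 335, 345, 355, 360, 380]

# Inferred coding: p = x - 1020 and q = (y - 300) / 5.
p = [xi - 1020 for xi in x]
q = [(yi - 300) / 5 for yi in y]

r_original = pmcc(x, y)
r_coded = pmcc(p, q)

# The two values agree up to floating-point rounding.
print(r_original, r_coded)
```

Subtracting a constant shifts every point by the same amount and dividing by a positive constant rescales an axis; neither changes how closely the points follow a straight line, so the smaller coded numbers can be used directly.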

Exam Questions (on provided sheet) Q2 ? ? ?

Exam Questions (on provided sheet) Q3 ? ?

Exam Questions (on provided sheet) Q4 ? ? ?

Exam Questions (on provided sheet) Q5 ? ? ?

Exam Questions (on provided sheet) Q6 ? ? ?

Exam Questions (on provided sheet) Q7 ? ? ?

Exam Questions (on provided sheet) Q8 ? ? ?

Exam Questions (on provided sheet) Q9 ? ? ?

Limitations of correlation: Often there’s a 3rd variable that explains two others, but the two variables themselves are not connected.
Q1: The number of cars on the road has increased, and the number of DVD recorders bought has decreased. Is there a correlation between the two variables? Buying a car does not necessarily mean that you will not buy a DVD recorder, so we cannot say there is a correlation between the two.
Q2: Over the past 10 years the memory capacity of personal computers has increased, and so has the average life expectancy of people in the western world. Is there a correlation between these two variables? The two are not connected, but both are due to scientific development over time (i.e. a third variable!)