SCATTER DIAGRAMS CCEA GCSE Statistics
Investigating Correlation
Scatter diagrams are used to investigate whether two variables are correlated:
- Positive Correlation: an increase in one variable is matched by an increase in the other variable
- Negative Correlation: an increase in one variable is matched by a decrease in the other variable
- Zero Correlation: no relationship between the two variables
The strength of the correlation is indicated by how closely the points conform to a straight line.
Measuring Correlation
Correlation ranges from perfect positive to perfect negative. Its strength can be represented by a statistic known as the Product Moment Correlation Coefficient (PMCC), which ranges from +1 (perfect positive correlation) to -1 (perfect negative correlation). The PMCC (denoted by the letter r) is explained in more detail elsewhere in the course notes.
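As a rough illustration of how r behaves at its two extremes, the sketch below computes the PMCC using the standard formula r = Sxy / sqrt(Sxx × Syy). The function name `pmcc` and the data values are made up for this example and do not come from the slides; the course notes may present the calculation differently.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product Moment Correlation Coefficient r for paired data (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / sqrt(s_xx * s_yy)

# Made-up data, not taken from the slides
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # every point lies on a rising straight line
falling = [10, 8, 6, 4, 2]   # every point lies on a falling straight line

print(pmcc(x, rising))
print(pmcc(x, falling))
```

With points lying exactly on a straight line, r sits at one end of its range; real data scattered around a line gives a value strictly between -1 and +1.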
Positive Correlation
[Figure: examples of positive correlation]
Zero/No Correlation
[Figure: example of zero/no correlation]
Negative Correlation
[Figure: examples of negative correlation]
Outliers
Even when strong correlation exists, there may be a data point which does not fit the overall trend. This data point is called an outlier. Outliers can sometimes occur randomly but are usually due to an error in the measurement process for collecting the data.
Outliers
In the diagram we see strong positive correlation (r = +0.92), but point A is clearly an outlier. Point B is perfectly in line with the general trend in the dataset and so is not an outlier.
Outliers
Outliers need to be identified during the modelling process and removed from the dataset. If they are not removed, they could adversely affect the fitting process. The line of best fit in the previous diagram ignored the outlier and hence was an excellent fit for all the other points. If the outlier were included, the resulting line of best fit would be flatter and not such a good fit overall.
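The flattening effect can be sketched numerically. The example below assumes the line of best fit is a least-squares regression line, which the slides do not state, and uses made-up points rather than the data behind the diagram. The helper name `best_fit_line` is hypothetical.

```python
def best_fit_line(points):
    """Least-squares line y = a + b*x for a list of (x, y) pairs; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    b = s_xy / s_xx          # gradient
    a = mean_y - b * mean_x  # intercept: the line passes through the mean point
    return a, b

# Made-up data: five points on a clear upward trend plus one hypothetical outlier
trend = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
outlier = (6, 2)

_, slope_without = best_fit_line(trend)
_, slope_with = best_fit_line(trend + [outlier])

print("gradient ignoring the outlier: ", slope_without)
print("gradient including the outlier:", slope_with)
```

In this made-up dataset the single low point drags the gradient down, so the fitted line is flatter and sits further from the five points that follow the trend.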