Exploring relationships between variables Ch. 10 Scatterplots, Associations, and Correlations
Scatterplots • Show change over time • Show patterns • Show trends, relationships, and outlier values
Scatterplots • The association can be positive or negative • Show the relationship between 2 variables • Can be examined in more depth through the z-scores of both variables ($z_x$, $z_y$)
Z-scores • $z_x = (x - \bar{x}) / s_x$, where $\bar{x}$ is the mean of x and $s_x$ is its standard deviation • $z_y = (y - \bar{y}) / s_y$ • The standard deviations are calculated in the same way as before.
Ratio • Correlation coefficient (r) • $r = \sum z_x z_y / (n - 1)$: the sum of the products of the paired z-scores, divided by n − 1 • Correlation measures the strength of the linear association between 2 variables
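As a minimal sketch of the z-score and correlation calculations, the Python below standardizes two short lists and divides the sum of the paired z-score products by n − 1. The data values and the helper names (`z_scores`, `correlation`) are made up for illustration and do not come from the slides.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_std(values):
    # Sample standard deviation: divide by n - 1.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    # z = (value - mean) / standard deviation
    m = mean(values)
    s = sample_std(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    # r = sum of paired z-score products, divided by n - 1
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))
```

A value of r near +1 or −1 indicates a strong linear association; a value near 0 indicates a weak one.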
Variables • Explanatory variable – x • Response variable – y
Least-Squares Line • $y = a + bx$ • a = y-intercept • b = slope • $a = \bar{y} - b\bar{x}$ • $b = SS_{xy} / SS_x$, where $SS_x$ is the sum of squares of x
SSx • $SS_x = \sum x^2 - (\sum x)^2 / n$ • Square each x and add up the squares • Then subtract the square of the sum of x, divided by n • On the calculator, you can get SSx by squaring the standard deviation and multiplying it by (n − 1): $SS_x = s_x^2 (n - 1)$
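The routes to SSx can be checked against each other with a short sketch. The data are made up for illustration, and the definitional form, the sum of each (x − x̄) squared, is included as a cross-check even though the slide does not list it.

```python
import statistics

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
n = len(x)
x_bar = sum(x) / n

# Definition: sum of squared deviations from the mean.
ss_x_definition = sum((v - x_bar) ** 2 for v in x)

# Computational form: sum of squares minus (sum of x) squared over n.
ss_x_shortcut = sum(v ** 2 for v in x) - sum(x) ** 2 / n

# Calculator route: sample standard deviation squared, times (n - 1).
ss_x_from_sd = statistics.stdev(x) ** 2 * (n - 1)

print(ss_x_definition, ss_x_shortcut, round(ss_x_from_sd, 6))
```

All three values should agree up to floating-point rounding.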
SSxy • Sum of cross-products of x and y • $SS_{xy} = \sum xy - (\sum x)(\sum y) / n$ • Add up each x value times its paired y value • Then subtract from that total (sum of x) × (sum of y) / n
SSxy • The computational formula for SSxy is a more efficient way of computing $\sum (x - \bar{x})(y - \bar{y})$, the sum of each (x − x̄) times its paired (y − ȳ)
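A small sketch, assuming made-up data, shows that the two ways of computing SSxy agree and how SSxy and SSx then give the slope b and intercept a of the least-squares line. The variable names are illustrative.

```python
# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Definition: sum of (x - xbar) * (y - ybar) over the paired values.
ss_xy_definition = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Computational form: sum of x*y minus (sum of x)(sum of y) / n.
ss_xy_shortcut = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

ss_x = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

# Slope and intercept of the least-squares line y = a + b*x.
b = ss_xy_shortcut / ss_x
a = y_bar - b * x_bar

print(round(a, 4), round(b, 4))
```

The computational form needs only the running totals of x, y, x squared, and xy, which is why it is quicker by hand or on a calculator.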
Complete Guided Ex. #3 page 566
Standard Error of Estimate • $S_e = \sqrt{\sum (y - y_p)^2 / (n - 2)}$ • How to calculate: $S_e = \sqrt{(SS_y - b \, SS_{xy}) / (n - 2)}$, where $SS_y = \sum y^2 - (\sum y)^2 / n$
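The sketch below computes the standard error of estimate two ways on made-up data: from the definition, using the squared gaps between each observed y and its predicted value yp, and from a shortcut. The shortcut is read here as (SSy − b·SSxy) / (n − 2) under the square root, which is an interpretation of the slide's second formula, with SSy defined by analogy with SSx.

```python
import math

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares slope and intercept.
b = ss_xy / ss_x
a = y_bar - b * x_bar

# Definition: squared differences between observed y and predicted yp.
predicted = [a + b * xi for xi in x]
sum_sq_resid = sum((yi - yp) ** 2 for yi, yp in zip(y, predicted))
se_definition = math.sqrt(sum_sq_resid / (n - 2))

# Shortcut (assumed reading of the slide): SSy - b * SSxy in the numerator.
se_shortcut = math.sqrt((ss_y - b * ss_xy) / (n - 2))

print(round(se_definition, 4), round(se_shortcut, 4))
```

The divisor is n − 2 rather than n − 1 because two quantities, a and b, are estimated from the data.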
Residuals • You can plot the residuals to see whether the regression line fits the data well • A residual is the difference between the observed value and the predicted value • Residual = observed − predicted
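As a rough illustration, the sketch below fits a least-squares line to made-up data and lists each residual as observed minus predicted. The helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    # Returns (a, b) for the least-squares line y = a + b*x.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in xs)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    return a, b

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)

# Residual = observed - predicted, one per data point.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

for xi, res in zip(x, residuals):
    print(xi, round(res, 2))
```

For a least-squares line the residuals sum to zero up to rounding, so the thing to look for in a residual plot is a pattern, such as a curve or a fan shape, rather than a nonzero average.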
Confidence Intervals • $y_p - E < y < y_p + E$ • $y_p$ = predicted value of y • E = margin of error
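The slide does not say how E is obtained. The sketch below assumes the common textbook form for predicting an individual y at a chosen x0, namely E = t_c · Se · sqrt(1 + 1/n + (x0 − x̄)² / SSx), where t_c is a critical value from a t-table with n − 2 degrees of freedom. That formula, the function name `prediction_interval`, the data, and the value t_c = 3.182 (95% confidence, 3 degrees of freedom) are all assumptions made for illustration.

```python
import math

def prediction_interval(xs, ys, x0, t_c):
    # Returns (lower, upper, E) for predicting y at x0.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in xs)
    ss_y = sum((yi - y_bar) ** 2 for yi in ys)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))

    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    y_p = a + b * x0  # predicted value of y at x0

    # Standard error of estimate, with n - 2 degrees of freedom.
    s_e = math.sqrt((ss_y - b * ss_xy) / (n - 2))

    # Assumed margin-of-error formula for an individual prediction.
    E = t_c * s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_p - E, y_p + E, E

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# t_c = 3.182 is the t-table value for 95% confidence with n - 2 = 3 df.
lower, upper, E = prediction_interval(x, y, x0=3, t_c=3.182)
print(round(lower, 3), round(upper, 3))
```

Under this formula the interval is narrowest when x0 equals the mean of x and widens as x0 moves away from it.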
What does this mean (better understanding)
Types of data • Outlier • Leverage • Influential Point • Lurking Variable
Outlier • Any data point that stands away from the others
Leverage • Data points with x-values that are far from the mean of x • Can alter the least-squares regression line
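One simple way to spot candidates for high leverage is to measure how far each x-value sits from the mean of x in standard deviations. The sketch below does this on made-up data; it is a rough screening aid, not a formal leverage statistic.

```python
import statistics

# Illustrative data only (not from the slides); x = 15 sits far from the rest.
x = [1, 2, 3, 4, 5, 15]

x_bar = statistics.mean(x)
s_x = statistics.stdev(x)  # sample standard deviation

# Distance of each x from the mean, in standard deviations.
distance_in_sd = [(xi - x_bar) / s_x for xi in x]

# Index of the x-value farthest from the mean.
most_extreme = max(range(len(x)), key=lambda i: abs(distance_in_sd[i]))

print(x[most_extreme], round(distance_in_sd[most_extreme], 2))
```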
Influential Point • Omitting this point can drastically alter the regression model
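To see what "drastically alter" can look like, the sketch below fits the least-squares line twice on made-up data, once with a single far-out point included and once with it omitted, and compares the slopes. The helper name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    # Returns (a, b) for the least-squares line y = a + b*x.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in xs)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    return a, b

# Illustrative data only (not from the slides).
# The last point (15, 3) has an x-value far from the others.
x = [1, 2, 3, 4, 5, 15]
y = [2, 4, 5, 4, 5, 3]

a_with, b_with = fit_line(x, y)

# Refit with the last point omitted.
a_without, b_without = fit_line(x[:-1], y[:-1])

print("slope with point:   ", round(b_with, 4))
print("slope without point:", round(b_without, 4))
```

In this made-up example the slope changes sign when the point is dropped, which is the kind of shift that marks a point as influential.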
Lurking Variable • A variable that is hidden from the model • It is not explicitly part of the model but affects how the variables in the model appear to be related