Chapter 12 REGRESSION DIAGNOSTICS AND CANONICAL CORRELATION

  • Slides: 33

REGRESSION DIAGNOSTICS

Testing Regression Assumptions
• Prior to Analysis
  - Normal distribution
  - Outliers
  - Linear relationships
  - Multicollinearity

Multicollinearity
• Interrelatedness of independent variables
• Indications
  - High correlations between variables (.85 or higher)
  - Substantial R squared, but statistically insignificant coefficients
  - Unstable regression coefficients
  - Unexpected size of coefficients
  - Unexpected signs (+/-)

Measures of Collinearity
• Tolerance
• Variance Inflation Factor (VIF)
• Eigenvalues
• Condition Index
• Variance Proportions

Tolerance
• Measure of collinearity
• Proportion of variance in a variable that is not accounted for by the other independent variables
• Each independent variable is regressed on the other independent variables
• High multiple correlation indicates the variable is highly related to the other independent variables

Tolerance
• Tolerance equals 1 - R squared, where R squared comes from regressing the variable on the other independent variables
• Tolerance of 0 (1 - 1) would indicate perfect collinearity
• Tolerance of 0 indicates the independent variable is a perfect linear combination of the other variables
• Small tolerances (< 0.1) are indicative of a problem with multicollinearity
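The regress-each-variable-on-the-others procedure can be sketched in Python with NumPy. This is a minimal illustration, not the routine any statistics package uses; the function name `tolerances` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def tolerances(X):
    """Return 1 - R^2 for each column of X regressed on the other columns."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    tol = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # Design matrix: intercept plus all the other independent variables.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        # Unexplained variance divided by total variance equals 1 - R^2.
        tol[j] = (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return tol

# Hypothetical data: x3 is almost a linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1, x2, noise = rng.normal(size=(3, 200))
x3 = x1 + x2 + 0.05 * noise
collinear_tol = tolerances(np.column_stack([x1, x2, x3]))
independent_tol = tolerances(np.column_stack([x1, x2]))
```

In the collinear set all three tolerances fall well below the 0.1 cutoff, while the two unrelated variables keep tolerances close to 1.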

Variance Inflation Factor (VIF)
• Reciprocal of tolerance
• High tolerance associated with low VIF
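Because VIF is just the reciprocal of tolerance, the conversion fits in a few lines. The helper name `vif` is an assumption for this sketch, and returning infinity for a tolerance of 0 is a choice made here to represent perfect collinearity.

```python
import math

def vif(tolerance):
    """Variance inflation factor: the reciprocal of tolerance."""
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must lie between 0 and 1")
    # A tolerance of 0 means perfect collinearity, so VIF is unbounded.
    return math.inf if tolerance == 0.0 else 1.0 / tolerance

# The tolerance cutoff of 0.1 corresponds to a VIF of 10.
table = {t: vif(t) for t in (1.0, 0.5, 0.1, 0.0)}
```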

Eigenvalues
• Computed from the cross-product matrix of the independent variables
• Finding some eigenvalues that are much larger than others indicates an ill-conditioned data matrix
• An ill-conditioned matrix leads to large changes in the solution with only small changes in the independent and/or dependent variables

Condition Index
• Square root of the ratio of the largest eigenvalue to each successive eigenvalue
• > 15 indicates possible problem
• > 30 indicates serious problem
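A hedged NumPy sketch of the condition index calculation follows. It assumes an intercept column is included and every column is scaled to unit length before the cross-product matrix is formed, a common convention that may not match a particular package's output exactly. The function name and data are made up for illustration.

```python
import numpy as np

def condition_indices(X):
    """Condition indices for the design matrix (intercept plus columns of X)."""
    X = np.asarray(X, dtype=float)
    Z = np.column_stack([np.ones(len(X)), X])
    # Scale each column to unit length so units of measurement do not matter.
    Z = Z / np.linalg.norm(Z, axis=0)
    # Eigenvalues of the scaled cross-product matrix, largest first.
    # Assumes the matrix is not exactly singular.
    eig = np.linalg.eigvalsh(Z.T @ Z)[::-1]
    return np.sqrt(eig[0] / eig)

# Hypothetical data: x3 is almost x1 + x2.
rng = np.random.default_rng(0)
x1, x2, noise = rng.normal(size=(3, 200))
x3 = x1 + x2 + 0.01 * noise
ci_collinear = condition_indices(np.column_stack([x1, x2, x3]))
ci_independent = condition_indices(np.column_stack([x1, x2]))
```

The largest index for the near-collinear set lands far above 30, while the unrelated predictors stay close to 1.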

Variance Proportions
• Proportions of the variance accounted for by each principal component associated with each of the eigenvalues
• Collinearity is a problem when a component associated with a high condition index contributes substantially to the variance of two or more variables
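One way to compute such a table is the singular value decomposition of the scaled design matrix, in the style usually attributed to Belsley, Kuh, and Welsch. The sketch below assumes that approach, with an intercept column and unit-length scaling; the proportions it reports refer to the variance of each regression coefficient. Names and data are hypothetical.

```python
import numpy as np

def variance_proportions(X):
    """Return (condition indices, proportions).

    Rows of proportions are components, largest eigenvalue first.
    Columns are the intercept followed by the columns of X.
    """
    X = np.asarray(X, dtype=float)
    Z = np.column_stack([np.ones(len(X)), X])
    Z = Z / np.linalg.norm(Z, axis=0)
    # Singular values s are the square roots of the eigenvalues of Z'Z.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # phi[k, j]: contribution of component k to the variance of coefficient j.
    phi = (Vt ** 2) / (s ** 2)[:, None]
    props = phi / phi.sum(axis=0)
    return s[0] / s, props

# Hypothetical data: x3 is almost x1 + x2.
rng = np.random.default_rng(0)
x1, x2, noise = rng.normal(size=(3, 200))
x3 = x1 + x2 + 0.01 * noise
cond_index, props = variance_proportions(np.column_stack([x1, x2, x3]))
```

Here the last row of `props` belongs to the component with the highest condition index, and it carries most of the variance for x1, x2, and x3 at once, which is the pattern the slide describes.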

RESIDUAL
• The difference between the actual and the predicted score (Y - Y')
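As a small worked illustration of Y - Y', the sketch below fits a one-predictor least-squares line in plain Python and subtracts the predicted scores from the actual ones. The scores are made up and the function names are assumptions for the example.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(x, y):
    """Actual minus predicted score (Y - Y') for each case."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Made-up scores for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
res = residuals(x, y)
```

With an intercept in the model, the residuals sum to zero apart from rounding error.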

Residual Analysis
• Normal Distribution
• Homoscedasticity

Residual Analysis
• Normal distribution of residuals indicates:
  - linear relationships
  - normal distribution of dependent variable for each value of the independent variable
• Assessment
  - histogram of standardized residuals
  - probability plot
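The two assessments can be approximated without a plotting library by computing the numbers behind them. The sketch below uses only the Python standard library. It standardizes residuals by their sample standard deviation, which is a simplification of what a statistics package does, and it uses the (i - 0.5) / n plotting position, one convention among several. All names and the residual values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def standardize(residuals):
    """Residuals divided by their sample standard deviation (a simplification)."""
    m, s = mean(residuals), stdev(residuals)
    return [(r - m) / s for r in residuals]

def normal_probability_points(residuals):
    """(expected normal quantile, observed standardized residual) pairs."""
    z = sorted(standardize(residuals))
    n = len(z)
    # Plotting position (i - 0.5) / n is one common convention.
    expected = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, z))

def histogram_counts(values, edges=(-3, -2, -1, 0, 1, 2, 3)):
    """Counts of values falling in each [edge, next edge) bin."""
    return [sum(lo <= v < hi for v in values) for lo, hi in zip(edges, edges[1:])]

# Made-up residuals for illustration only.
res = [-0.8, 0.6, 1.0, -0.6, -0.2]
points = normal_probability_points(res)
counts = histogram_counts(standardize(res))
```

If the residuals are roughly normal, the pairs in `points` fall close to a straight line and `counts` looks roughly bell shaped; five cases are of course far too few to judge either.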

Residual Analysis
• Homoscedasticity
  - Plot residuals against predicted values and against independent variables

Computer Exercise
• What is the multiple correlation of three sets of predictors and overall state of health?
• First set = age and years of education
• Second set = confidence and life satisfaction
• Third set = smoking history and satisfaction with current weight

SPSS - Multiple Regression/Residuals
• Statistics
  - Confidence intervals
  - R squared change
  - Descriptives
  - Part & Partial correlations
  - Collinearity diagnostics
  - Residuals
    - Durbin-Watson
    - Casewise diagnostics
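The Durbin-Watson statistic named among the residual options is simple enough to compute by hand: the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, and values near 2 suggest no first-order autocorrelation. The sketch below is a plain-Python illustration with made-up residuals, not a reproduction of SPSS output.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Alternating signs: strong negative autocorrelation, statistic well above 2.
dw_alternating = durbin_watson([1, -1, 1, -1])
# Identical residuals: extreme positive autocorrelation, statistic of 0.
dw_constant = durbin_watson([1, 1, 1, 1])
```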

SPSS - Residual Analysis (cont.)
• Options
  - Exclude cases pairwise
• Plots
  - Histogram
  - Normal probability plot
  - Produce all partial plots

Example from the Literature

CANONICAL CORRELATION

CANONICAL CORRELATION
• Measures the relationship between a set of independent variables and a set of dependent variables
• Method of least squares
  - Two composites
    - independent variables, "on the left"
    - dependent variables, "on the right"

Canonical Correlation
• Type of Data Required
  - Data at all levels may be entered
  - Categorical variables must be coded
  - Continuous variables should meet assumptions

Assumptions
• Sample must be representative of population
• Variables must have normal distribution
• Homoscedasticity
• Linear relationships

CANONICAL CORRELATION
• Canonical correlation coefficients
• Maximum number equals the number of variables in the smaller set
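A compact way to obtain the canonical correlation coefficients numerically is to orthonormalize each centered set with a QR decomposition and take the singular values of the cross-product of the two bases. The NumPy sketch below assumes both sets have full column rank; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y."""
    Xc = np.asarray(X, dtype=float)
    Yc = np.asarray(Y, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    Yc = Yc - Yc.mean(axis=0)
    # Orthonormal bases for the two centered sets (assumes full column rank).
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx'Qy are the canonical correlations, largest first.
    r = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(r, 0.0, 1.0)

# Hypothetical data: three predictors, two outcomes, one strong link.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
Y = np.column_stack([X[:, 0] + 0.3 * rng.normal(size=300), rng.normal(size=300)])
r = canonical_correlations(X, Y)
r_single = canonical_correlations(X, Y[:, :1])
```

With three variables in one set and two in the other, `r` holds two coefficients, matching the number of variables in the smaller set. With a single outcome there is one coefficient, which equals the multiple correlation from an ordinary regression.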

CANONICAL CORRELATION
• Canonical variate
• A weighted composite of the variables in a set
• "New" variable

CANONICAL CORRELATION
• Coefficients
  - Raw
  - Standardized
  - Structure

CANONICAL CORRELATION
• Raw Coefficients
  - Like b-weights in regression
  - Can be used to calculate predicted scores, based on actual scores

CANONICAL CORRELATION
• Canonical weights
  - Standard score form
  - Similar to standardized regression coefficients (Betas)
  - Indicate the relative importance of the associated variable
  - Unstable

CANONICAL CORRELATION
• Structure Coefficients
  - Correlation between the canonical variates and the original variables
  - Loadings of .30 or higher are treated as meaningful
  - Interpreted like loadings in factor analysis
  - Square of the loading is the proportion of variance accounted for
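Structure coefficients can be sketched as plain correlations between each original variable and the canonical variates of its own set. The NumPy example below builds the variates with a QR and SVD approach, assumes full column rank, and uses hypothetical names and synthetic data throughout.

```python
import numpy as np

def structure_coefficients(X, Y):
    """Correlations of each X variable with the canonical variates of the X set."""
    Xc = np.asarray(X, dtype=float)
    Yc = np.asarray(Y, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    Yc = Yc - Yc.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    U, _, _ = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Canonical variates of the X set: centered, unit length, uncorrelated.
    variates = Qx @ U
    # Correlation = inner product of centered columns scaled to unit length.
    return (Xc / np.linalg.norm(Xc, axis=0)).T @ variates

# Hypothetical data: two predictors, three outcomes.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
Y = np.column_stack([
    X[:, 0] + 0.5 * rng.normal(size=300),
    X[:, 1] + rng.normal(size=300),
    rng.normal(size=300),
])
loadings = structure_coefficients(X, Y)
meaningful = np.abs(loadings) >= 0.30   # the .30 rule of thumb
variance_shared = loadings ** 2         # proportion of variance per variate
```

Each row of `loadings` is one original variable and each column one canonical variate. Because the X set is the smaller set here, the squared loadings in each row sum to 1.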

WILKS’ LAMBDA
• Varies from 0 to 1
• Error variance
• Equal to 1 - R squared
• The smaller the value, the greater the variance explained
• Tested for significance with Bartlett's test, a chi-square statistic
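For a single canonical correlation, lambda is 1 minus its square. For the overall test across several canonical correlations, lambda is the product of those terms. The sketch below pairs that with the usual textbook form of Bartlett's chi-square approximation, with degrees of freedom equal to the product of the two set sizes. The function names and the example values are assumptions, and the p-value lookup is left out because it needs a chi-square table or a statistics library.

```python
import math

def wilks_lambda(canonical_rs):
    """Product of (1 - r^2) over the canonical correlations."""
    lam = 1.0
    for r in canonical_rs:
        lam *= 1.0 - r ** 2
    return lam

def bartlett_chi_square(canonical_rs, n, p, q):
    """Bartlett's approximation: (chi-square, degrees of freedom).

    n is the sample size; p and q are the number of variables in each set.
    """
    lam = wilks_lambda(canonical_rs)
    chi2 = -(n - 1 - (p + q + 1) / 2) * math.log(lam)
    return chi2, p * q

# Made-up canonical correlations for illustration only.
lam = wilks_lambda([0.6, 0.3])
chi2, df = bartlett_chi_square([0.6, 0.3], n=100, p=3, q=2)
```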

CANONICAL CORRELATION
• Redundancy
• The higher the redundancy or correlation among a group of variables, the better the ability to predict from one group to another

Example from the Literature

CANONICAL CORRELATION
• Exercise
• What is the canonical correlation between the following two sets of variables?
  - The predictor set includes: age, education, smoking history, depressed state of mind, exercise, and current quality of life.
  - The outcome set includes: positive psychological attitudes and overall state of health.