D/RS 1013 Discriminant Analysis

  • Slides: 17

Discriminant Analysis Overview
• multivariate extension of the one-way ANOVA
• looks at differences between 2 or more groups
• goal is to discriminate between groups
• considers several predictor variables simultaneously

Discriminant - Overview
• provides a way to describe differences between groups in simple terms
• removes the redundancy among large numbers of variables by combining them into a smaller number of discriminant functions
• can classify cases to groups when their group membership is unknown

Overview (cont.)
• tests the significance of differences between two or more groups
• examines several predictor variables simultaneously
• constructs a linear combination of these variables, forming a single composite variable called a discriminant function
• basically MANOVA flipped upside down

Discriminant parallels with MANOVA and Regression
• Discriminant works the other way from MANOVA: it predicts group membership from scores on the predictor variables
• The discriminant function takes the form: $D = d_1 z_1 + d_2 z_2 + \dots + d_p z_p$

Discriminant functions
• $D_i = d_1 z_1 + d_2 z_2 + \dots + d_p z_p$, where
  – $D$ = scores on the discriminant function
  – $d_1, \dots, d_p$ = discriminant function weighting coefficients for each of the $p$ predictor variables
  – $z_1, \dots, z_p$ = standardized scores on the original $p$ variables
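As a minimal sketch of the formula above, the snippet below standardizes two predictors and forms the weighted sum for each case. The data and the weights are hypothetical (they are not estimated from anything), and standardizing with the overall sample mean and SD is an assumption; some software standardizes with the pooled within-group SD instead.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores to z-scores using the sample mean and SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def discriminant_scores(columns, weights):
    """Return D = d_1*z_1 + ... + d_p*z_p for every case.

    columns: one list of raw scores per predictor variable.
    weights: one weighting coefficient per predictor variable.
    """
    z_columns = [standardize(col) for col in columns]
    n_cases = len(columns[0])
    return [
        sum(d * z[i] for d, z in zip(weights, z_columns))
        for i in range(n_cases)
    ]

# Hypothetical data: two predictors measured on five cases.
x1 = [2.0, 4.0, 6.0, 8.0, 10.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0]
weights = [0.7, 0.4]  # illustrative weights only

scores = discriminant_scores([x1, x2], weights)
print(scores)
```

Because every z-score column has mean zero, the resulting scores on D also average to zero across cases.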

Unstandardized Functions
• even more like a regression equation
• $D_i = a + d_1 x_1 + d_2 x_2 + \dots + d_p x_p$
  – $a$ = the discriminant function constant
  – $d_1, \dots, d_p$ = discriminant function weighting coefficients for each of the $p$ predictor variables
  – $x_1, \dots, x_p$ = raw scores on the original $p$ variables
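One way to see how the two forms relate: if each z-score is (x - mean) / SD, then dividing each standardized weight by that variable's SD gives a raw-score weight, and the constant absorbs the means. The sketch below assumes exactly that standardization; the data, weights, and function name are hypothetical.

```python
from statistics import mean, stdev

def to_unstandardized(columns, std_weights):
    """Return (a, raw_weights) so that a + sum(raw_w * x) equals sum(std_w * z).

    Assumes z = (x - mean) / SD with the sample mean and SD of each column.
    """
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    raw_weights = [d / s for d, s in zip(std_weights, sds)]
    constant = -sum(w * m for w, m in zip(raw_weights, means))
    return constant, raw_weights

# Hypothetical data and weights.
x1 = [2.0, 4.0, 6.0, 8.0, 10.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0]
a, raw_w = to_unstandardized([x1, x2], [0.7, 0.4])

# Score for the first case, computed directly from raw values.
d_first = a + raw_w[0] * x1[0] + raw_w[1] * x2[0]
print(a, raw_w, d_first)
```

The raw-score form is convenient for scoring new cases, since it does not require standardizing them first.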

Forming discriminant functions
• the discriminant function is formed to maximize the F value associated with D
• F = between-groups (bg) variance on D / within-groups (wg) variance on D
• this provides the function with the greatest discriminating power
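The F ratio above can be sketched for any set of scores on D, computed the same way as in a one-way ANOVA. The two score sets below are hypothetical and only illustrate that a composite separating the groups more cleanly yields a larger F; the snippet does not search for the maximizing weights.

```python
from statistics import mean

def f_ratio(groups):
    """F = between-groups variance / within-groups variance on D.

    groups: one list of discriminant scores per group.
    """
    all_scores = [s for g in groups for s in g]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    ms_between = ss_between / (k - 1)  # between-groups variance
    ms_within = ss_within / (n - k)    # within-groups variance
    return ms_between / ms_within

# Hypothetical scores on two candidate composites for the same two groups.
weak = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
strong = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]

print(f_ratio(weak))    # 1.5
print(f_ratio(strong))  # 54.0
```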

Functions beyond the first
• the first function is one of many possible combinations of the p original predictor variables
• the number of useful functions is p (number of original variables) or k - 1 (k = number of groups being considered), whichever is smaller
• later functions maximize the remaining separation between groups and are orthogonal to the preceding functions
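The counting rule is simple enough to state as a one-line helper. The function name and the example values of p and k are hypothetical.

```python
def n_useful_functions(p, k):
    """Number of useful discriminant functions: the smaller of p and k - 1."""
    if p < 1 or k < 2:
        raise ValueError("need at least one predictor and two groups")
    return min(p, k - 1)

print(n_useful_functions(p=5, k=3))  # 2
print(n_useful_functions(p=2, k=6))  # 2
print(n_useful_functions(p=4, k=2))  # 1
```

With only two groups there is always a single function, however many predictors are used.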

First discriminant function (3 groups)
Separates group 1 from groups 2 & 3

Second function (3 groups)
Separates group 3 from groups 1 & 2

Both functions together
Orthogonal = uncorrelated

Confusion Matrix
• cases are assigned to groups based on their discriminant function scores
• assignments are compared with actual group memberships
• the confusion matrix gives both the overall accuracy of classification and the relative frequencies of the various types of misclassification
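A minimal sketch of how such a matrix can be tallied, assuming we already have each case's actual and assigned group. The six cases below are hypothetical.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual group, assigned group) pair occurs."""
    return Counter(zip(actual, predicted))

# Hypothetical actual memberships and assignments for six cases.
actual = ["A", "A", "A", "B", "B", "B"]
predicted = ["A", "A", "B", "B", "B", "A"]

counts = confusion_matrix(actual, predicted)
groups = ("A", "B")
for a in groups:
    # One row per actual group, one column per assigned group.
    print(a, [counts[(a, p)] for p in groups])
```

The diagonal cells hold the correct assignments; each off-diagonal cell is one type of misclassification.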

Confusion matrix: example
• our proportion correct is (43 + 39)/100 = .82
• by chance alone we would end up with .50 correct if we evenly divided our group assignments between A & B: half in each group correct by chance
• can consider prior probabilities, if known
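The arithmetic in this example can be reproduced as a short sketch. Only the diagonal counts (43 and 39 correct out of 100) come from the slide; the off-diagonal counts are hypothetical, chosen so that each group has 50 cases. The chance calculation assumes cases are assigned at random in fixed proportions, and the unequal priors at the end are illustrative.

```python
# Rows = actual group, columns = assigned group.
# Off-diagonal counts (7 and 11) are hypothetical.
confusion = {
    "A": {"A": 43, "B": 7},
    "B": {"A": 11, "B": 39},
}

def proportion_correct(matrix):
    """Overall accuracy: correct assignments divided by all cases."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[g][g] for g in matrix)
    return correct / total

def chance_accuracy(priors, assign_rates):
    """Expected accuracy under random assignment:
    sum over groups of P(actual = g) * P(assigned = g)."""
    return sum(priors[g] * assign_rates[g] for g in priors)

accuracy = proportion_correct(confusion)
print(accuracy)  # 0.82

# Even split between A and B, as in the example.
even = {"A": 0.5, "B": 0.5}
print(chance_accuracy(even, even))  # 0.5

# Hypothetical unequal priors, with assignments made in the same proportions.
priors = {"A": 0.7, "B": 0.3}
chance_with_priors = chance_accuracy(priors, priors)
print(round(chance_with_priors, 2))  # 0.58
```

With unequal priors the chance baseline rises above .50, so the same .82 would represent a smaller improvement over chance.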

Cross Validation
• hold back some of the data to test the model that emerges
• gives a good idea of the kind of predictive accuracy we can expect for another sample
• results from small samples with several variables are unlikely to replicate across samples
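The hold-back idea can be sketched as follows. To keep the example short, the "model" is a stand-in rather than a full discriminant analysis: each case has a single score, the training step just computes group means, and a held-back case is assigned to the group with the nearest mean. The simulated data, the 30% hold-out fraction, and all function names are assumptions.

```python
import random

def fit_group_means(cases):
    """'Train' on (score, group) pairs: return the mean score per group."""
    sums, counts = {}, {}
    for score, group in cases:
        sums[group] = sums.get(group, 0.0) + score
        counts[group] = counts.get(group, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

def classify(score, group_means):
    """Assign a case to the group whose mean score is closest."""
    return min(group_means, key=lambda g: abs(score - group_means[g]))

def holdout_accuracy(cases, holdout_fraction=0.3, seed=0):
    """Fit on part of the data and report accuracy on the held-back part."""
    rng = random.Random(seed)
    shuffled = cases[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * holdout_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    means = fit_group_means(train)
    hits = sum(classify(score, means) == group for score, group in test)
    return hits / n_test, n_test

# Simulated scores: group A centred on 0, group B centred on 4.
rng = random.Random(42)
cases = [(rng.gauss(0.0, 1.0), "A") for _ in range(50)]
cases += [(rng.gauss(4.0, 1.0), "B") for _ in range(50)]

accuracy, n_test = holdout_accuracy(cases)
print(n_test, accuracy)
```

Accuracy on the held-back cases is the figure to report, since accuracy on the cases used to build the model tends to be optimistic.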

Classification Functions
• weights and constants are used to calculate scores for each case
• each case gets as many scores as there are groups
• assign each case to the group for which it has the highest classification function score
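A minimal sketch of this assignment rule, assuming the weights and constants have already been estimated. The coefficients, the three groups, and the cases below are all hypothetical.

```python
# Hypothetical classification functions: group -> (constant, weights).
classification_functions = {
    "A": (-10.0, [1.5, 0.5]),
    "B": (-14.0, [1.0, 1.5]),
    "C": (-20.0, [2.5, 1.0]),
}

def classification_scores(x, functions):
    """One score per group: constant + sum of weight * raw score."""
    return {
        group: constant + sum(w * v for w, v in zip(weights, x))
        for group, (constant, weights) in functions.items()
    }

def assign_group(x, functions):
    """Assign the case to the group with the highest classification score."""
    scores = classification_scores(x, functions)
    return max(scores, key=scores.get)

print(classification_scores([4.0, 2.0], classification_functions))
# {'A': -3.0, 'B': -7.0, 'C': -8.0}
print(assign_group([4.0, 2.0], classification_functions))   # A
print(assign_group([3.0, 8.0], classification_functions))   # B
print(assign_group([10.0, 2.0], classification_functions))  # C
```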

Assumptions
• assumes that all predictors follow a multivariate normal distribution
• the test is robust with respect to normality
• in practice, lack of normality doesn't make much of a difference, especially with large n and a moderate number of predictors