Cluster Analysis

1. Single Link Cluster Analysis
2. Ward’s Minimum Sum of Squares
3. k-Means Cluster Analysis
4. SPSS TwoStep Cluster Analysis

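Single-Link Clustering (most popular method)

Single link: join an item to the cluster that has the single closest member.

[Figure: scatter plot of Cost (Importance) against Complete Pain Relief (Importance) showing a Left cluster, a Right cluster, and an unassigned item (the star). A, B, and C mark distances from the star to members of the Left cluster; q marks the distance from the star to the Right cluster.]

Since B < q, join the star to the Left cluster, even though A > q and C > q.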

Cluster Analysis: Single Chain Agglomerative Procedure (most popular method)

Part-worth coefficients of “Complete Pain Relief”:
  Therapy A   2
  Therapy B   5
  Therapy C   9
  Therapy D   10
  Therapy E   15

Single link: join an item to the cluster that has the single closest member.

First stage (each therapy is its own cluster): A=2, B=5, C=9, D=10, E=15
Second stage (Euclidean distance between each pair): AB=3, AC=7, AD=8, AE=13, BC=4, BD=5, BE=10, CD=1, CE=6, DE=5. C and D are closest, so they join at 1.
Third stage: CDA=7, CDB=4, CDE=5, AB=3, AE=13, BE=10. A and B join at 3.
Fourth stage: ABCD=4, ABE=10, CDE=5. AB and CD join at 4.
Fifth stage: ABCDE=5. E joins the rest at 5.
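The staged procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the presenter's own code: the function names `single_link` and `agglomerate` are made up for this example, and the only data used are the five part-worth coefficients from the slide.

```python
from itertools import combinations

# Part-worth coefficients from the slide.
points = {"A": 2, "B": 5, "C": 9, "D": 10, "E": 15}


def single_link(c1, c2):
    """Distance between two clusters: the single closest pair of members."""
    return min(abs(points[a] - points[b]) for a in c1 for b in c2)


def agglomerate(labels):
    """Repeatedly join the two closest clusters; return the join history."""
    clusters = [(label,) for label in labels]
    history = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest single-link distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: single_link(*p))
        dist = single_link(c1, c2)
        clusters = [c for c in clusters if c not in (c1, c2)]
        joined = tuple(sorted(c1 + c2))
        clusters.append(joined)
        history.append(("".join(joined), dist))
    return history


merges = agglomerate(sorted(points))
for name, dist in merges:
    print(f"{name} formed at distance {dist}")
```

Each entry in `merges` pairs the newly formed cluster with the distance at which it was formed, which is the height a dendrogram would draw for that join.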

Single Chain Agglomerative Clustering Output: Dendrogram

[Figure: dendrogram over A, B, C, D, E with joins at heights 1, 3, 4, and 5.]

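Ward’s Clustering

Ward’s cluster: join an item to the cluster that gives the smallest error sum of squares (ESS).

[Figure: scatter plot of Strength (Importance) against Water Resistance (Importance) showing a Left cluster, a Right cluster, and an unassigned item (the star). A dot marks the mean location of the points in the proposed cluster; A, B, C, and D are the distances from each point to that mean.]

In this case, if the star is joined to the Left cluster, $\mathrm{ESS} = A^2 + B^2 + C^2 + D^2$.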
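As a rough illustration of the ESS rule, the sketch below computes the ESS of a proposed cluster as the sum of squared distances from each point to the cluster mean, then compares the total ESS for the two possible joins. The coordinates are hypothetical (the slide gives none), and the names `ess`, `left`, `right`, and `star` are chosen only for this example.

```python
def ess(cluster):
    """Error sum of squares: squared distance from each point to the cluster mean."""
    n = len(cluster)
    dims = len(cluster[0])
    mean = [sum(p[d] for p in cluster) / n for d in range(dims)]
    return sum((p[d] - mean[d]) ** 2 for p in cluster for d in range(dims))


# Hypothetical (water resistance, strength) importance scores, not from the slide.
left = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
right = [(6.0, 5.0), (7.0, 6.0)]
star = (3.0, 2.0)

# Total ESS over both clusters for each proposed join.
total_if_left = ess(left + [star]) + ess(right)
total_if_right = ess(left) + ess(right + [star])

target = "Left" if total_if_left < total_if_right else "Right"
print(f"join the star to the {target} cluster")
```

With these made-up coordinates the star sits much nearer the Left cluster, so joining it there adds far less to the total ESS.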

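Ward’s Minimum Variance Agglomerative Clustering Procedure

Each value is the total ESS over all clusters if the named join is made.

First stage (each therapy is its own cluster): A=2, B=5, C=9, D=10, E=15
Second stage: AB=4.5, AC=24.5, AD=32.0, AE=84.5, BC=8.0, BD=12.5, BE=50.0, CD=0.5, CE=18.0, DE=12.5. C and D join at 0.5.
Third stage: CDA=38.0, CDB=14.0, CDE=20.67, AB=5.0, AE=85.0, BE=50.5. A and B join at 5.0.
Fourth stage: ABCD=41.0, ABE=93.17, CDE=25.17. CD and E join at 25.17.
Fifth stage: ABCDE=98.8. AB and CDE join at 98.8.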
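The stage-by-stage ESS values can be checked with a short sketch. It assumes, as the slide's figures suggest, that each candidate join is scored by the total ESS across all clusters after the join. The helper names (`ess`, `total_ess_after_join`, `ward`) are illustrative, not taken from any library.

```python
from itertools import combinations

# Part-worth coefficients from the slide.
points = {"A": 2, "B": 5, "C": 9, "D": 10, "E": 15}


def ess(cluster):
    """Sum of squared deviations of a cluster's values from their mean."""
    values = [points[label] for label in cluster]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def total_ess_after_join(clusters, c1, c2):
    """Total ESS over all clusters if c1 and c2 were joined."""
    others = [c for c in clusters if c not in (c1, c2)]
    return ess(c1 + c2) + sum(ess(c) for c in others)


def ward(labels):
    """At each stage, make the join that gives the smallest total ESS."""
    clusters = [(label,) for label in labels]
    history = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda p: total_ess_after_join(clusters, *p))
        total = total_ess_after_join(clusters, c1, c2)
        clusters = [c for c in clusters if c not in (c1, c2)]
        joined = tuple(sorted(c1 + c2))
        clusters.append(joined)
        history.append(("".join(joined), round(total, 2)))
    return history


ward_merges = ward(sorted(points))
for name, total in ward_merges:
    print(f"{name} formed, total ESS = {total}")
```

Unlike the single-link run on the same data, the join order here is CD, AB, CDE, ABCDE: Ward's criterion attaches E to CD before the two pairs are combined.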

Ward’s Minimum Variance Agglomerative Clustering Output

[Figure: dendrogram over A, B, C, D, E with joins at ESS levels 0.5, 5, 25.17, and 98.8.]

k-Means Clustering

1. Begin with two starting center points and allocate each item to the nearest cluster center.
2. Recalculate the center of each cluster. Stop if the centers have not changed.
3. Allocate items to the nearest cluster center. Go to step 2.
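The three steps translate almost directly into code. The sketch below is a one-dimensional version with two assumed starting centers. It reuses the values 2, 5, 9, 10, 15 purely as sample data, and the function name `k_means` is illustrative.

```python
def k_means(values, centers):
    """1-D k-means following the three steps listed above."""
    centers = list(centers)
    while True:
        # Steps 1 and 3: allocate each item to the nearest cluster center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Step 2: recalculate each center (keep the old one if a group is empty).
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            return centers, groups  # centers unchanged, so stop
        centers = new_centers


# Sample data with two assumed starting centers.
final_centers, final_groups = k_means([2, 5, 9, 10, 15], centers=[2, 5])
print(final_centers, final_groups)
```

The result depends on the starting centers, so different starting points can lead to a different final grouping.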

k-Means Clustering

[Figure: five numbered panels (1 to 5) showing items labeled A or B.]

SPSS TwoStep Cluster Method

- Scalable cluster analysis algorithm designed to handle very large data sets.
- Can handle both continuous and categorical variables or attributes.
- Automatically selects the number of clusters.

Step 1: Pre-cluster the cases (or records) into many small sub-clusters.
Step 2: Cluster the sub-clusters resulting from the pre-cluster step into the desired number of clusters.
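The two-stage idea can be illustrated with a toy sketch. It is not SPSS's procedure: the threshold rule in step 1, the closest-means rule in step 2, the function names, and the data are all assumptions made for this example, and it handles only one continuous variable with a fixed number of clusters.

```python
def pre_cluster(values, threshold):
    """Step 1: one pass over the records, building many small sub-clusters."""
    subs = []  # each sub-cluster is stored as [count, total]
    for v in values:
        best = min(subs, key=lambda s: abs(v - s[1] / s[0]), default=None)
        if best is not None and abs(v - best[1] / best[0]) <= threshold:
            best[0] += 1
            best[1] += v
        else:
            subs.append([1, v])  # too far from every sub-cluster: start a new one
    return subs


def merge_sub_clusters(subs, k):
    """Step 2: join the two sub-clusters with the closest means until k remain."""
    subs = [list(s) for s in subs]

    def mean(s):
        return s[1] / s[0]

    while len(subs) > k:
        pairs = [(i, j) for i in range(len(subs)) for j in range(i + 1, len(subs))]
        i, j = min(pairs, key=lambda p: abs(mean(subs[p[0]]) - mean(subs[p[1]])))
        subs[i] = [subs[i][0] + subs[j][0], subs[i][1] + subs[j][1]]
        del subs[j]
    return sorted(mean(s) for s in subs)


records = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.1, 10.4]  # made-up data
subs = pre_cluster(records, threshold=0.25)
centers = merge_sub_clusters(subs, k=3)
print(len(subs), "sub-clusters ->", centers)
```

Step 2 works only on sub-cluster summaries (a count and a total), not on the individual records, which is what lets a two-stage approach scale to large data sets.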