Applied Multivariate Quantitative Methods: Cluster Analysis
By Jen-pei Liu, Ph.D., Division of Biometry, Department of Agronomy, National Taiwan University
and Wei-Chu Chie, MD, Ph.D., Department of Public Health, National Taiwan University
9/3/2021. Copyright by Jen-pei Liu, Ph.D. and Wei-Chu Chie, MD, Ph.D.

Cluster Analysis
- Introduction
- Measures of Similarity
- Hierarchical Clustering
- K-means Clustering
- Summary

Introduction
- A sample of n objects, each with measurements on p variables
- Goal: use the measurements on the p variables to devise a scheme for grouping the n objects into classes
- Similar objects are placed in the same class
- In general, the number of clusters is not known in advance: cluster analysis is an unsupervised analysis
- In discriminant analysis, the number of classes is pre-specified and classification is based on a prediction function: a supervised analysis

Introduction
- Examples
  - Clusters of depressed patients
  - Data reduction
  - Marketing
    - Test markets: a large number of cities
    - A small number of groups of similar cities
    - One member from each group is selected for testing
  - Microarray
    - Clusters of genes
    - Clusters of subjects

Introduction
- Types of Clustering Methods
  - Hierarchical Clustering
    - Finds a series of partitions
    - A bottom-up clustering
  - Partitional Method
    - Produces a single partition of the objects
    - A top-down clustering

Introduction
- Example

  Student   Chinese (X1)   Math (X2)
  1         85             82
  2         25             32
  3         65             55
  4         90             95
  5         40             30
  6         60             70

Measures of Similarity
- Euclidean distance matrix for the 6 students

       1       2       3       4       5       6
  1    0
  2   78.10    0
  3   33.60   46.14    0
  4   13.93   90.52   47.17    0
  5   68.77   15.13   35.36   82.01    0
  6   27.73   51.66   15.81   39.05   44.72    0

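The entries of the Euclidean distance matrix can be recomputed from the score table. The following is a minimal Python sketch, not part of the original slides; the names `scores`, `euclidean` and `dist` are illustrative assumptions.

```python
import math

# Scores (Chinese, Math) for students 1..6, taken from the example table.
scores = {
    1: (85, 82), 2: (25, 32), 3: (65, 55),
    4: (90, 95), 5: (40, 30), 6: (60, 70),
}

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Lower triangle of the distance matrix, rounded to two decimals.
dist = {(i, j): round(euclidean(scores[i], scores[j]), 2)
        for i in scores for j in scores if j < i}

# Print one row per student, lower triangle only.
for i in scores:
    row = [f"{dist[(i, j)]:6.2f}" for j in scores if j < i]
    print(i, " ".join(row))
```

For instance, `dist[(2, 1)]` is the distance between students 2 and 1, the square root of 60^2 + 50^2.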
Measures of Similarity
- The Manhattan (city block) distance between objects i and j is the sum of the absolute differences over the p variables:
  $d_{ij} = \sum_{k=1}^{p} |x_{ik} - x_{jk}|$

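As a small illustration, the city-block distance adds up absolute coordinate differences instead of squaring them. This is a hedged sketch using students 1 and 2 from the example table; the function name `manhattan` is an assumption made here, not something defined in the slides.

```python
def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

student_1 = (85, 82)  # (Chinese, Math) scores of student 1
student_2 = (25, 32)  # (Chinese, Math) scores of student 2

# |85 - 25| + |82 - 32| = 60 + 50
print(manhattan(student_1, student_2))
```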
Measures of Similarity
- Correlation coefficient
  - A measure of association
  - Not a measure of similarity (or agreement)
- Euclidean distance
  - A measure of agreement
  - Not a measure of association

Measures of Similarity
- Example

  Case   X1           X2              r    d^2
  I      1, 2, 3, 4   1, 2, 3, 4      1      0
  II     1, 2, 3, 4   2, 4, 6, 8      1     30
  III    1, 2, 3, 4   4, 8, 12, 16    1    270

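The three cases can be checked numerically: the correlation stays at 1 while the squared Euclidean distance between X1 and X2 grows. The sketch below is illustrative only; `pearson_r`, `squared_distance` and `results` are names assumed here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

x1 = [1, 2, 3, 4]
cases = {"I": [1, 2, 3, 4], "II": [2, 4, 6, 8], "III": [4, 8, 12, 16]}

# Map each case to (r, d^2).
results = {name: (pearson_r(x1, x2), squared_distance(x1, x2))
           for name, x2 in cases.items()}

for name, (r, d2) in results.items():
    print(f"Case {name}: r = {r:.2f}, d^2 = {d2}")
```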
Hierarchical Clustering
- General steps for n objects
  - Step 1: Start with n clusters, each object being its own cluster. Compute the pairwise distances among all clusters
  - Step 2: Find the minimum distance and merge the corresponding two clusters into one cluster
  - Step 3: For the resulting n-1 clusters, compute the pairwise distances among all n-1 clusters
  - Step 4: Find the minimum distance and merge the corresponding two clusters into one cluster
  - Step 5: Repeat Steps 2 to 4 until all n objects are merged into one big cluster

Hierarchical Clustering
- Single Linkage (Nearest-neighbor) Method
  - Use the minimal distance
- Distance matrix for 5 objects

       1    2    3    4    5
  1    0
  2    9    0
  3    3    7    0
  4    6    5    9    0
  5   11   10    2    8    0

Hierarchical Clustering
- Single Linkage Method
  - Step 1: 5 clusters: {1}, {2}, {3}, {4}, {5}
  - Step 2: min{d_ij} = d_35 = 2; merge objects 3 and 5 into one cluster {35}
  - Step 3: Find the minimal distances among {35}, {1}, {2}, {4}
    - d({35},1) = min[d_31, d_51] = min[3, 11] = 3
    - d({35},2) = min[d_32, d_52] = min[7, 10] = 7
    - d({35},4) = min[d_34, d_54] = min[9, 8] = 8

Hierarchical Clustering
- Single Linkage Method
  - Update the distance matrix

         {35}   1    2    4
  {35}    0
  1       3     0
  2       7     9    0
  4       8     6    5    0

Hierarchical Clustering
- Single Linkage Method
  - Step 4: The minimal distance is 3, between {35} and {1}; merge {35} and {1} into {135}
  - Step 5: Find the distances between {135} and each of {2} and {4}
    - d({135},2) = min[d({35},2), d_12] = min[7, 9] = 7
    - d({135},4) = min[d({35},4), d_14] = min[8, 6] = 6

Hierarchical Clustering
- Single Linkage Method
  - Update the distance matrix

          {135}   2    4
  {135}    0
  2        7      0
  4        6      5    0

  - The minimal distance is 5, between {2} and {4}
  - Merge {2} and {4} into {24}

Hierarchical Clustering
- Single Linkage Method
  - Find the minimum distance between {135} and {24}
    - d({135},{24}) = min[d({135},2), d({135},4)] = min[7, 6] = 6
  - Update the distance matrix

          {135}   {24}
  {135}    0
  {24}     6       0

Hierarchical Clustering
- Single Linkage Method

  Distance   Clusters
  2          {35}, {1}, {2}, {4}
  3          {135}, {2}, {4}
  5          {135}, {24}
  6          {12345}

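The worked single-linkage steps can be replayed programmatically. The following is a minimal Python sketch under stated assumptions, not the authors' code: `pairs` and `single_linkage` are illustrative names, and the update rule is the one used in the slides (the distance to a merged cluster is the smaller of the distances to its two parts).

```python
from itertools import combinations

# Distances between objects 1..5 from the worked example, keyed by (i, j), i < j.
pairs = {
    (1, 2): 9, (1, 3): 3, (2, 3): 7, (1, 4): 6, (2, 4): 5,
    (3, 4): 9, (1, 5): 11, (2, 5): 10, (3, 5): 2, (4, 5): 8,
}

def single_linkage(pairs, n):
    """Return the merge history as [(merge distance, clusters after merge), ...]."""
    clusters = [frozenset({i}) for i in range(1, n + 1)]
    # Current cluster-to-cluster distances, keyed by an unordered pair of clusters.
    dist = {frozenset({frozenset({i}), frozenset({j})}): v
            for (i, j), v in pairs.items()}
    history = []
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda ab: dist[frozenset(ab)])
        height = dist[frozenset({a, b})]
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)]
        # Single-linkage update: keep the smaller of the two old distances.
        for c in clusters:
            dist[frozenset({merged, c})] = min(dist[frozenset({a, c})],
                                               dist[frozenset({b, c})])
        clusters.append(merged)
        history.append((height, sorted(sorted(c) for c in clusters)))
    return history

for height, clusters in single_linkage(pairs, 5):
    print(height, clusters)
```

Each printed line gives a merge distance and the clusters that exist after that merge, which corresponds to one row of a distance/clusters table.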
Hierarchical Clustering
- Dendrograms
  - A 2-dimensional tree structure rooted at the top
  - One dimension is the distance measure
  - The other dimension is the clustering results
  - The height of a vertical (horizontal) line represents the distance between the two clusters it merges
  - Greater height represents greater distance

Hierarchical Clustering
- Complete Linkage (Farthest-neighbor) Method
  - Use the maximal distance
- Distance matrix for 5 objects

       1    2    3    4    5
  1    0
  2    9    0
  3    3    7    0
  4    6    5    9    0
  5   11   10    2    8    0

Hierarchical Clustering
- Complete Linkage Method
  - Step 1: 5 clusters: {1}, {2}, {3}, {4}, {5}
  - Step 2: min{d_ij} = d_35 = 2; merge objects 3 and 5 into one cluster {35}
  - Step 3: Find the maximal distances among {35}, {1}, {2}, {4}
    - d({35},1) = max[d_31, d_51] = max[3, 11] = 11
    - d({35},2) = max[d_32, d_52] = max[7, 10] = 10
    - d({35},4) = max[d_34, d_54] = max[9, 8] = 9

Hierarchical Clustering
- Complete Linkage Method
  - Update the distance matrix

         {35}   1    2    4
  {35}    0
  1      11     0
  2      10     9    0
  4       9     6    5    0

Hierarchical Clustering
- Complete Linkage Method
  - Step 4: The minimal distance is 5, between {2} and {4}; merge {2} and {4} into {24}
  - Step 5: Find the maximal distances
    - d({24},{35}) = max[d(2,{35}), d(4,{35})] = max[10, 9] = 10
    - d({24},1) = max[d_21, d_41] = max[9, 6] = 9

Hierarchical Clustering
- Complete Linkage Method
  - Update the distance matrix

         {35}   {24}   1
  {35}    0
  {24}   10      0
  1      11      9     0

  - The minimal distance is 9, between {1} and {24}
  - Merge {1} and {24} into {124}

Hierarchical Clustering
- Complete Linkage Method
  - Find the maximal distance between {124} and {35}
    - d({124},{35}) = max[d(1,{35}), d({24},{35})] = max[11, 10] = 11
  - Update the distance matrix

          {35}   {124}
  {35}     0
  {124}   11      0

Hierarchical Clustering
- Complete Linkage Method

  Distance   Clusters
  2          {35}, {1}, {2}, {4}
  5          {35}, {1}, {24}
  9          {35}, {124}
  11         {12345}

Hierarchical Clustering
- Average Linkage Method
  - Use the average distance
- Distance matrix for 5 objects

       1    2    3    4    5
  1    0
  2    9    0
  3    3    7    0
  4    6    5    9    0
  5   11   10    2    8    0

Hierarchical Clustering
- Average Linkage Method
  - Step 1: 5 clusters: {1}, {2}, {3}, {4}, {5}
  - Step 2: min{d_ij} = d_35 = 2; merge objects 3 and 5 into one cluster {35}
  - Step 3: Find the average distances among {35}, {1}, {2}, {4}
    - d({35},1) = (d_31 + d_51)/(2 x 1) = (3 + 11)/2 = 7
    - d({35},2) = (d_32 + d_52)/(2 x 1) = (7 + 10)/2 = 8.5
    - d({35},4) = (d_34 + d_54)/(2 x 1) = (9 + 8)/2 = 8.5

Hierarchical Clustering
- Average Linkage Method
  - Update the distance matrix

         {35}   1    2    4
  {35}    0
  1       7     0
  2       8.5   9    0
  4       8.5   6    5    0

Hierarchical Clustering
- Average Linkage Method
  - Step 4: The minimal distance is 5, between {2} and {4}; merge {2} and {4} into {24}
  - Step 5: Find the average distances
    - d({24},{35}) = (d_23 + d_25 + d_43 + d_45)/(2 x 2) = (7 + 10 + 9 + 8)/4 = 8.5
    - d({24},1) = (d_21 + d_41)/(2 x 1) = (9 + 6)/2 = 7.5

Hierarchical Clustering
- Average Linkage Method
  - Update the distance matrix

         {35}   {24}   1
  {35}    0
  {24}    8.5    0
  1       7      7.5   0

  - The minimal distance is 7, between {1} and {35}
  - Merge {1} and {35} into {135}

Hierarchical Clustering
- Average Linkage Method
  - Find the average distance between {24} and {135}
    - d({24},{135}) = (d_12 + d_14 + d_32 + d_34 + d_52 + d_54)/(3 x 2) = (9 + 6 + 7 + 9 + 10 + 8)/6 = 8.17
  - Update the distance matrix

          {135}   {24}
  {135}    0
  {24}     8.17    0

Hierarchical Clustering
- Average Linkage Method

  Distance   Clusters
  2          {35}, {1}, {2}, {4}
  5          {35}, {1}, {24}
  7          {135}, {24}
  8.17       {12345}

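The three linkage rules differ only in how the object-to-object distances between two clusters are summarized: minimum, maximum, or mean. The sketch below makes that explicit by passing the summary function as an argument. It is a hedged illustration; `agglomerate`, `d` and `pairs` are assumed names, and recomputing from the original distances at every step is chosen for clarity rather than speed.

```python
from itertools import combinations
from statistics import mean

# Distances between objects 1..5 from the worked example, keyed by (i, j), i < j.
pairs = {
    (1, 2): 9, (1, 3): 3, (2, 3): 7, (1, 4): 6, (2, 4): 5,
    (3, 4): 9, (1, 5): 11, (2, 5): 10, (3, 5): 2, (4, 5): 8,
}

def d(i, j):
    """Distance between two original objects."""
    return pairs[(min(i, j), max(i, j))]

def agglomerate(n, linkage):
    """Merge distances for n objects; linkage is min, max or mean."""
    def between(ab):
        # Summarize all object-to-object distances between clusters a and b.
        a, b = ab
        return linkage([d(i, j) for i in a for j in b])

    clusters = [(i,) for i in range(1, n + 1)]
    heights = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=between)
        heights.append(round(between((a, b)), 2))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return heights

print("single:  ", agglomerate(5, min))
print("complete:", agglomerate(5, max))
print("average: ", agglomerate(5, mean))
```

With the 5-object matrix, the average-linkage run ends with the merge of {135} and {24} at 49/6, about 8.17, matching the hand calculation.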
Hierarchical Clustering
- Example (Manly, 2005): distance matrix of 5 objects

       1    2    3    4    5
  1    0
  2    2    0
  3    6    5    0
  4   10    9    4    0
  5    9    8    5    3    0

Hierarchical Clustering
- Single Linkage Method

  Distance   Clusters
  2          {12}, {3}, {4}, {5}
  3          {12}, {3}, {45}
  4          {12}, {345}
  5          {12345}

- The same sequence of clusters is obtained from the complete and average linkage methods, although the merge distances differ

Hierarchical Clustering
Example: Canine groups by single linkage clustering

  Distance   Clusters                              Number of clusters
  0.72       {MD, PD}, GJ, CW, IW, CU, DI          6
  1.38       {MD, PD, CU}, GJ, CW, IW, DI          5
  1.63       {MD, PD, CU}, GJ, CW, IW, DI          5
  1.68       {MD, PD, CU, DI}, GJ, CW, IW          4
  2.07       {MD, PD, CU, DI, GJ}, CW, IW          3
  2.31       {MD, PD, CU, DI, GJ}, {CW, IW}        2
  2.37       {MD, PD, CU, DI, GJ, CW, IW}          1

Figure: Results of the single linkage method for European employment data

Hierarchical Clustering
- Centroid (Center or Average) Method
  - Start with each object being a cluster
  - Merge the two clusters with the shortest distance
  - Compute the centroid of the new cluster as the average of each variable over its members, and update the distance matrix using the centroids
  - Repeat the merge and update steps until all objects form one cluster

Introduction
- Example

  Student   Chinese (X1)   Math (X2)
  1         85             82
  2         25             32
  3         65             55
  4         90             95
  5         40             30
  6         60             70

Hierarchical Clustering
- Centroid Method
  - Euclidean distance matrix of the 6 students

       1       2       3       4       5       6
  1    0
  2   78.10    0
  3   33.60   46.14    0
  4   13.93   90.52   47.17    0
  5   68.77   15.13   35.36   82.01    0
  6   27.73   51.66   15.81   39.05   44.72    0

Hierarchical Clustering
- Centroid Method
  - The shortest distance is between student {1} and student {4}
  - Merge {1} and {4} into {14}
  - Compute the averages for Chinese and math
    - Average of Chinese = (85 + 90)/2 = 87.5
    - Average of math = (82 + 95)/2 = 88.5

Hierarchical Clustering
- Centroid Method
  - Update the Euclidean distance matrix

         {14}     2       3       5       6
  {14}    0
  2      84.25    0
  3      40.35   46.14    0
  5      75.36   15.13   35.36    0
  6      33.14   51.66   15.81   44.72    0

Hierarchical Clustering
- Centroid Method
  - The shortest distance is between {2} and {5}
  - Merge {2} and {5} into {25}
    - The average of Chinese of {25} is 32.5
    - The average of math of {25} is 31.0

Hierarchical Clustering
- Centroid Method
  - Update the Euclidean distance matrix

         {14}    {25}     3       6
  {14}    0
  {25}   79.57    0
  3      40.35   40.40    0
  6      33.14   47.72   15.81    0

Hierarchical Clustering
- Centroid Method
  - The shortest distance is between {3} and {6}
  - Merge {3} and {6} into {36}
    - The average of Chinese of {36} is 62.5
    - The average of math of {36} is 62.5

Hierarchical Clustering
- Centroid Method
  - Update the Euclidean distance matrix

         {14}    {25}    {36}
  {14}    0
  {25}   79.57    0
  {36}   36.07   43.50    0

Hierarchical Clustering
- Centroid Method
  - The shortest distance is between {14} and {36}
  - Merge {14} and {36} into {1346}
  - Cluster means

  Cluster   Chinese   Math
  {25}      32.5      31.0
  {1346}    75.0      75.5

Hierarchical Clustering
- Centroid Method
  - The distance between {25} and {1346} is 61.53

  Distance   Clusters
  13.93      {14}, {2}, {3}, {5}, {6}
  15.13      {14}, {25}, {3}, {6}
  15.81      {14}, {25}, {36}
  36.07      {1346}, {25}
  61.53      {123456}

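The centroid method can be replayed on the student scores by recomputing each cluster's mean before every merge. This is a minimal sketch under stated assumptions, not the authors' implementation; `centroid` and `centroid_clustering` are illustrative names, and `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations

# Scores (Chinese, Math) for students 1..6, taken from the example table.
scores = {
    1: (85, 82), 2: (25, 32), 3: (65, 55),
    4: (90, 95), 5: (40, 30), 6: (60, 70),
}

def centroid(members):
    """Mean of each variable over the students in a cluster."""
    pts = [scores[m] for m in members]
    return tuple(sum(col) / len(pts) for col in zip(*pts))

def centroid_clustering():
    """Return [(merge distance, clusters after merge), ...]."""
    clusters = [(s,) for s in scores]
    history = []
    while len(clusters) > 1:
        # Closest pair of clusters by Euclidean distance between centroids.
        a, b = min(combinations(clusters, 2),
                   key=lambda ab: math.dist(centroid(ab[0]), centroid(ab[1])))
        height = math.dist(centroid(a), centroid(b))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        history.append((round(height, 2), sorted(clusters)))
    return history

for height, clusters in centroid_clustering():
    print(height, clusters)
```

The second merge joins students 2 and 5 at 15.13, and the last merge joins {2, 5} with {1, 3, 4, 6} at 61.53.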
Hierarchical Clustering
- Application to gene expression data from microarray experiments
  - Number of genes >>> number of subjects
  - Clustering in two directions
    - Clusters of subjects (patients)
    - Clusters of genes

Hierarchical Clustering
- The complexity of a bottom-up method varies between n^2 and n^3, depending on the linkage chosen
- The complexity of a top-down method varies between n log n and n^2, depending on the linkage chosen

Hierarchical Clustering
- Determination of the number of clusters
  - Criteria
    - Root-mean-square total-sample standard deviation (RMSSTD)
    - Semipartial R-square (SPRSQ)
    - R-square (RSQ)
    - Minimum distance (MD)

Hierarchical Clustering
- Determination of the number of clusters
  - Example: test scores of 6 students

  # of clusters   RMSSTD   SPRSQ    RSQ    MD
  5                6.96    0.0145   0.98   0.30
  4                7.57    0.0171   0.97   0.33
  3                7.91    0.0187   0.95   0.34
  2               15.93    0.1946   0.76   0.60
  1               25.86    0.7551   0.00   0.77

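Two of the criteria can be recomputed from the student scores. The slides do not state the formulas, so this sketch assumes the usual definitions: RSQ = 1 - SSW/SST, where SSW is the pooled within-cluster sum of squares and SST the total sum of squares, and SPRSQ is the drop in RSQ caused by the merge that produced the given number of clusters. The partitions are those of the centroid method on the same data; all names are illustrative.

```python
# Scores (Chinese, Math) for students 1..6, taken from the example table.
scores = {
    1: (85, 82), 2: (25, 32), 3: (65, 55),
    4: (90, 95), 5: (40, 30), 6: (60, 70),
}

def sum_sq(members):
    """Sum of squared deviations from the cluster mean, over both variables."""
    pts = [scores[m] for m in members]
    total = 0.0
    for column in zip(*pts):
        m = sum(column) / len(column)
        total += sum((v - m) ** 2 for v in column)
    return total

# Partitions produced by the centroid method, from 5 clusters down to 1.
partitions = {
    5: [(1, 4), (2,), (3,), (5,), (6,)],
    4: [(1, 4), (2, 5), (3,), (6,)],
    3: [(1, 4), (2, 5), (3, 6)],
    2: [(1, 3, 4, 6), (2, 5)],
    1: [(1, 2, 3, 4, 5, 6)],
}

sst = sum_sq(tuple(scores))  # total sum of squares
ssw = {k: sum(sum_sq(c) for c in parts) for k, parts in partitions.items()}
rsq = {k: 1 - ssw[k] / sst for k in partitions}
# With 6 singleton clusters SSW is 0, hence the default of 0.0 for k + 1 = 6.
sprsq = {k: (ssw[k] - ssw.get(k + 1, 0.0)) / sst for k in partitions}

for k in sorted(partitions, reverse=True):
    print(k, f"RSQ = {rsq[k]:.4f}", f"SPRSQ = {sprsq[k]:.4f}")
```

A sharp drop in RSQ (equivalently, a large SPRSQ) when going from 3 to 2 clusters suggests that the merge at that step joins clusters that are not very similar.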
K-means Clustering
- Step 1: Select the number of clusters, say K, and choose the distance measure, such as the Euclidean distance or 1 minus the Pearson correlation coefficient
- Step 2: Divide the n objects into K clusters, either randomly or based on a preliminary hierarchical clustering
- Step 3: Compute the centroid of each cluster and calculate the distance from each object to the centroids of all clusters
- Step 4: For each object, find the minimal distance and reallocate the object to the corresponding cluster
- Step 5: Update the clusters and their centroids
- Step 6: Repeat Step 3 and Step 4 until no reallocation of objects among clusters occurs

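The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it assumes Euclidean distance, takes the initial partition as an argument, and simply raises an error if a cluster becomes empty. The function name `kmeans` and the starting partition are assumptions made for the example.

```python
import math

def kmeans(points, assignment, max_iter=100):
    """points: list of tuples; assignment: initial cluster index for each point."""
    assignment = list(assignment)
    k = max(assignment) + 1
    for _ in range(max_iter):
        # Step 3: centroid of each current cluster.
        centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if not members:
                raise ValueError("a cluster became empty")
            centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
        # Step 4: reallocate every object to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 6: stop when no object changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment  # Step 5: update the clusters
    return assignment, centroids

# Students 1..6 (Chinese, Math); arbitrary start: {1, 3, 5} versus {2, 4, 6}.
points = [(85, 82), (25, 32), (65, 55), (90, 95), (40, 30), (60, 70)]
labels, centers = kmeans(points, [0, 1, 0, 1, 0, 1])
print(labels)
print(centers)
```

From this particular start the procedure settles on {2, 3, 5} and {1, 4, 6}, which differs from the two-cluster solution of the centroid method; the result of K-means depends on the initial partition.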
K-means Clustering
- The number of computations to be performed can be written as c*p, where c is a value that depends on the number of iterations and p is the number of variables (e.g., the number of genes)

K-means Clustering
- The number of clusters is selected to maximize the between-cluster sum of squares (variation) and to minimize the within-cluster sum of squares (variation)
- The best-of-10 partition: apply the K-means method 10 times, using 10 different randomly chosen sets of initial clusters, and choose the result that minimizes the within-cluster sum of squares

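The best-of-10 idea can be sketched as a loop over random starting partitions that keeps the run with the smallest within-cluster sum of squares. This is a hedged sketch: `kmeans`, `within_ss` and `best_of` are illustrative names, the seed is fixed only to make the run repeatable, and an emptied cluster keeps its previous centroid, which is one of several possible conventions.

```python
import math
import random

def kmeans(points, labels, k, max_iter=100):
    """Plain K-means with Euclidean distance, started from the given labels."""
    centroids = [None] * k
    for _ in range(max_iter):
        for c in range(k):
            members = [p for p, a in zip(points, labels) if a == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        new = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
               for p in points]
        if new == labels:
            break
        labels = new
    return labels

def within_ss(points, labels, k):
    """Within-cluster sum of squares of a partition."""
    total = 0.0
    for c in range(k):
        members = [p for p, a in zip(points, labels) if a == c]
        if not members:
            continue
        center = tuple(sum(col) / len(members) for col in zip(*members))
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total

def best_of(points, k, restarts=10, seed=0):
    """Run K-means from several random starts; return (best WSS, labels)."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        # Random start with every cluster non-empty: shuffle, then deal out labels.
        order = list(range(len(points)))
        rng.shuffle(order)
        start = [0] * len(points)
        for rank, idx in enumerate(order):
            start[idx] = rank % k
        labels = kmeans(points, start, k)
        wss = within_ss(points, labels, k)
        if best is None or wss < best[0]:
            best = (wss, labels)
    return best

points = [(85, 82), (25, 32), (65, 55), (90, 95), (40, 30), (60, 70)]
wss, labels = best_of(points, k=2)
print(round(wss, 1), labels)
```

Restarting helps because a single K-means run can stop at a partition that is stable but not the best one available.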
Issues and Limitations
- With considerable overlap between the initial groups, cluster analysis may produce a result that is quite different from the true situation
- Different approaches yield different results
- The dendrogram itself is almost never the answer to the research question
- Hierarchical diagrams convey information only in their topology

Issues and Limitations
- The shape of the clusters can create difficulty in cluster analysis
  - (a) and (b): can be found by any reasonable algorithm
  - (c): some methods will fail because of overlapping points
  - (d), (e) and (f): great challenges for most clustering algorithms

Issues and Limitations
- Anything can be clustered
- The clustering algorithm applied to the same data may produce different results
- The magnitudes of the distance measures in the dendrogram are ignored
- The position of the patterns within the clusters does not reflect their relationship in the input space

Summary
- Goals
- Methods
  - Hierarchical Methods
    - Single
    - Complete
    - Average
    - Centroid
  - K-means Clustering
- Limitations