Spectral Clustering
Jianping Fan, Dept of CS, UNC-Charlotte
http://webpages.uncc.edu/jfan/itcs4122.html

Key issues for Data Clustering
  • Similarity or distance function
  • Inter-cluster similarity or distance
  • Intra-cluster similarity or distance
  • Number of clusters K
  • Decision for data clustering
  • Objective function: intra-cluster distances are minimized; inter-cluster distances are maximized

SUMMARY OF K-MEANS
  • Centers: random selection or density scan
  • K: start from a small K and separate iteratively, or start from a large K and merge sequentially
  • Outliers
  • Problems of K-means: locations of centers; number of clusters K; sensitivity to outliers; data manifolds (shapes of data distributions)
  • Experiences

Problems of K-MEANs
  • Objective: intra-cluster distances are minimized; inter-cluster distances are maximized
  • Distance function: geometry distance
  • Two alternating steps: assignment step and optimization step
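
The assignment and optimization steps named on this slide can be sketched in Python as follows. This is a minimal illustration of standard Lloyd-style K-means with Euclidean (geometry) distance, not code from the lecture; the function name `kmeans` and the choice to pass the initial centers in explicitly are assumptions made for the example.

```python
import numpy as np

def kmeans(X, centers, n_iter=100):
    """Plain K-means with Euclidean distance (illustrative sketch).

    X: (n, d) data array; centers: (k, d) array of initial centers.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Optimization step: each center moves to the mean of its points.
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Because both steps only look at distances to the centers, the result depends heavily on where the centers start and on the clusters being roughly compact, which is the weakness the following slides discuss.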

Problems of K-MEANs
  • The similarity function cannot handle special data manifolds effectively!
  • Intra-cluster similarity and inter-cluster similarity are not optimized jointly or simultaneously!
  • Pre-selected locations of cluster centers may not be acceptable!

K-Means Clustering: why does K-means fail? (Figure: expected vs. achieved clustering)

Why K-Means Clustering Fails? (Figure: expected vs. achieved clustering)
  • Similarity or distance function
  • Inter-cluster similarity or distance
  • Intra-cluster similarity or distance
  • Number of clusters K
  • Decision for data clustering
  • Objective function

Why K-Means Clustering Fails? (Figure: expected vs. achieved clustering)
  • The number of clusters K may not be an issue here
  • Objective function?

Why K-Means Clustering Fails? (Figure: expected vs. achieved clustering)
  • Data manifold: relationship rather than distance
  • Distance function & decision for data clustering

Key issues for Data Clustering
  • Inter-cluster similarity or distance
  • Intra-cluster similarity or distance
  • Number of clusters K
  • Decision for data clustering
  • Similarity or distance function

Lecture Outline
  • Motivation
  • Graph overview and construction
  • Spectral Clustering
  • Cool implementations

Spectral Clustering Example: 2 Spirals
  • The dataset exhibits complex cluster shapes ⇒ K-means performs very poorly in this space due to its bias toward dense spherical clusters.
  • Relationship vs. geometry distance
  • In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.

Spectral Clustering
  • Similarity representation: relationship
  • Objective function: inter-cluster similarity and intra-cluster similarity
  • Number of clusters K
  • Decision for clustering

Graph-Based Similarity Representation, considering the data manifold: relationship vs. geometry distance

Spectral Clustering Example: why does K-means fail? Geometry vs. manifold

Graph-Based Similarity Representation: distance vs. relationship

Graph-Based Similarity Representation: the number of clusters matters

Graph-based Representation of Data Similarity (Relationship)

Graph-based Representation of Data Relationship

Manifold (Shape of Data Distribution)

Graph-based Representation of Data Relationships: Manifold

Graph-based Representation of Data Relationships: how do we generate such a graph for representing data relationships?

Data Graph Construction
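
Two common ways to build such a graph are sketched below: a fully connected graph with Gaussian similarity weights, and a k-nearest-neighbor sparsification of it. This is an illustrative sketch only; the function names `gaussian_affinity` and `knn_graph`, the scale `sigma`, and the rule that an edge is kept when either endpoint selects it are assumptions, not details taken from the slides.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Fully connected graph: w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    X = np.asarray(X, dtype=float)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def knn_graph(W, k):
    """Keep, for every node, only the edges to its k most similar nodes.

    The result is symmetrized: an edge survives if either endpoint selects it.
    """
    n = W.shape[0]
    keep = np.zeros(W.shape, dtype=bool)
    for i in range(n):
        neighbors = np.argsort(W[i])[::-1][:k]  # indices of the k largest weights
        keep[i, neighbors] = True
    keep = keep | keep.T
    return np.where(keep, W, 0.0)
```

The k-nearest-neighbor version connects each point only to its local neighborhood, so paths through the graph follow the shape of the data rather than straight-line distance.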

Graph Cut

Lecture Outline
  • Motivation
  • Graph overview and construction
  • Spectral Clustering: considering intra-cluster similarity and inter-cluster similarity jointly!
  • Cool implementations

Key issues for Spectral Clustering
  • Relationship function for graph construction
  • Inter-cluster similarity or distance
  • Intra-cluster similarity or distance
  • Objective function
  • Number of clusters K
  • Decision for data clustering

How to Do Graph Partitioning?
  • Citation group identification
  • Social group identification
  • Hot topic detection

Intra-cluster similarity

Spectral Clustering (cut): intra-cluster similarity and inter-cluster similarity

Spectral Clustering: Graph Cut
Objective function for spectral clustering:
  1. Maximize intra-cluster similarity
  2. Minimize inter-cluster similarity

Spectral Clustering: Graph Cut
Clustering via graph cut at weak connection points: minimize inter-cluster similarity
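
To make "cutting at weak connection points" concrete, the sketch below scores a partition by its inter-cluster similarity (the total weight of the edges it cuts) and its intra-cluster similarity (the total weight of the edges it keeps). The function name `cut_and_within` and the toy graph are hypothetical and chosen only for illustration; each undirected edge is counted once.

```python
import numpy as np

def cut_and_within(W, labels):
    """Return (inter-cluster similarity, intra-cluster similarity).

    W: symmetric (n, n) weight matrix with zero diagonal.
    labels: one cluster label per node.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    total = W.sum() / 2.0          # every undirected edge counted once
    within = W[same].sum() / 2.0   # edges whose endpoints share a label
    return total - within, within

# Toy graph: two triangles joined by one weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

good = cut_and_within(W, [0, 0, 0, 1, 1, 1])  # cuts only the weak edge
bad = cut_and_within(W, [0, 0, 1, 1, 1, 1])   # cuts two strong edges
```

The partition that cuts only the weak edge has both the smaller inter-cluster similarity and the larger intra-cluster similarity, which is the behavior the objective asks for.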

Inter-cluster similarity

Eigenvectors & Eigenvalues

Normalized Cut
  • A graph G(V, E) can be partitioned into two disjoint sets A and B
  • The cut is defined as the total weight of the edges between the two sets: cut(A, B) = sum of w(u, v) over u in A and v in B
  • The optimal partition of the graph G is achieved by minimizing the cut: min cut(A, B)

Normalized Cut: association between a partition set and the whole graph

Normalized Cut: the normalized cut can be solved by an eigenvalue equation.
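
As a hedged illustration, assume the standard two-way formulation of Shi and Malik: assoc(A, V) is the total weight of the edges from A to all nodes, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), and the relaxed problem is the generalized eigenvalue equation (D - W) y = lambda D y, where D is the diagonal degree matrix and the eigenvector of the second-smallest eigenvalue is thresholded to split the graph. The sketch below follows that formulation; the function names and the threshold at zero are assumptions for the example.

```python
import numpy as np

def ncut_bipartition(W):
    """Two-way normalized cut via the relaxed problem (D - W) y = lambda D y.

    Assumes W is symmetric and every node has strictly positive degree.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Substituting z = D^(1/2) y turns the generalized problem into an
    # ordinary symmetric eigenproblem for D^(-1/2) (D - W) D^(-1/2).
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues come back ascending
    y = d_inv_sqrt * eigvecs[:, 1]       # second-smallest eigenvector
    return (y > 0).astype(int)           # split at zero

def ncut_value(W, labels):
    """Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)."""
    W = np.asarray(W, dtype=float)
    a = np.asarray(labels) == 0
    cut = W[a][:, ~a].sum()
    return cut / W[a].sum() + cut / W[~a].sum()
```

Dividing by the association terms penalizes partitions that slice off a tiny set of nodes, which a plain minimum cut tends to favor.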

Extending Binary Normalized Cut to Multi-Class

K-way Min-Max Cut
  • Intra-cluster similarity
  • Inter-cluster similarity
  • Decision function for spectral clustering: minimize inter-cluster similarity while maximizing intra-cluster similarity

Mathematical Description of Spectral Clustering: the decision function for spectral clustering is refined, further quantities are defined, and the refined decision function is then solved.

Spectral Clustering Algorithm (Ng, Jordan, and Weiss)
  • Motivation
  • Given a set of points
  • We would like to cluster them into k subsets

Algorithm
  • Form the affinity matrix A for the points s_1, ..., s_n
  • Define A_ij = exp(-||s_i - s_j||^2 / (2 sigma^2)) if i ≠ j, and A_ii = 0
  • The scaling parameter sigma is chosen by the user
  • Define D, a diagonal matrix whose (i, i) element is the sum of A's row i
  • Form the matrix L = D^(-1/2) A D^(-1/2)
  • Find x_1, x_2, ..., x_k, the k largest eigenvectors of L (those belonging to the k largest eigenvalues)
  • These form the columns of the new matrix X
  • Note: the dimension has been reduced from n×n to n×k
  • Form the matrix Y by renormalizing each of X's rows to have unit length
  • Treat each row of Y as a point in R^k
  • Cluster the rows into k clusters via K-means
  • Final cluster assignment: assign point s_i to cluster j iff row i of Y was assigned to cluster j
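
The steps above can be sketched end to end in NumPy. This is a minimal sketch under stated assumptions, not a reference implementation: it uses the Ng-Jordan-Weiss definitions A_ij = exp(-||s_i - s_j||^2 / (2 sigma^2)) with A_ii = 0 and L = D^(-1/2) A D^(-1/2), the function name `njw_spectral_clustering` is hypothetical, and the deterministic farthest-first choice of initial K-means centers is a choice made here so that the example is reproducible.

```python
import numpy as np

def njw_spectral_clustering(S, k, sigma=1.0, n_iter=100):
    """Cluster the rows of S into k groups; returns one label per point."""
    S = np.asarray(S, dtype=float)
    # Affinity matrix A with zero diagonal.
    sq = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=2)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # L = D^(-1/2) A D^(-1/2), where D holds the row sums of A.
    # (Assumes every row sum is positive, i.e. sigma is not too small.)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # Eigenvectors of the k largest eigenvalues become the columns of X.
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back ascending
    X = vecs[:, -k:]
    # Renormalize each row of X to unit length to obtain Y.
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)
    # K-means on the rows of Y, with farthest-first initial centers
    # (an assumption of this sketch).
    centers = [Y[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(Y - c, axis=1) for c in centers], axis=0)
        centers.append(Y[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Point i receives the cluster that row i of Y was assigned to.
    return labels

# Usage: two concentric rings, a non-convex case.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
ring = np.c_[np.cos(t), np.sin(t)]
S = np.vstack([ring, 4.0 * ring])
labels = njw_spectral_clustering(S, k=2, sigma=0.5)
```

In this toy case the affinities between the rings are negligible, so each ring maps to nearly a single point in the embedded space and K-means separates them easily.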

Why?
  • If we eventually use K-means, why not just apply K-means to the original data?
  • This method allows us to cluster non-convex regions

Some Examples

User's Prerogative
  • Affinity matrix construction
  • Choice of scaling factor: realistically, search over the scaling factor and pick the value that gives the tightest clusters
  • Choice of k, the number of clusters
  • Choice of clustering method

How to select k?
  • Eigengap: the difference between two consecutive eigenvalues.
  • The most stable clustering is generally given by the value of k that maximises the eigengap.
  • Example: largest eigenvalues (λ1, λ2) of the Cisi/Medline data ⇒ choose k = 2
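
The eigengap rule can be sketched in a few lines. The indexing convention is an assumption of this sketch: eigenvalues are sorted in descending order and the gap for k is taken as lambda_k - lambda_(k+1), so a large drop after the second eigenvalue gives k = 2. The function name `choose_k_by_eigengap` and the sample eigenvalues are invented for illustration; they are not the Cisi/Medline values.

```python
import numpy as np

def choose_k_by_eigengap(eigenvalues, k_max=None):
    """Return the k (counted from 1) with the largest gap lambda_k - lambda_(k+1)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending
    if k_max is None:
        k_max = len(lam) - 1
    gaps = lam[:k_max] - lam[1:k_max + 1]
    return int(np.argmax(gaps)) + 1

# Hypothetical spectrum: two eigenvalues near 1, then a clear drop.
k = choose_k_by_eigengap([1.0, 0.98, 0.40, 0.35, 0.30])
```

Restricting the search with `k_max` is useful in practice because gaps far down the spectrum are usually noise.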

Recap: the bottom line

Summary
  • Spectral clustering can help us in hard clustering problems
  • The technique is simple to understand
  • The solution comes from solving a simple algebra problem which is not hard to implement
  • Great care should be taken in choosing the “starting conditions”

Problems for Spectral Clustering
  • Number of clusters K
  • Objective function optimization
  • Better similarity (relationship) functions

What's Visual Analytics? Initial Clustering Result & Visualization
  • Similarity-preserving data projection: from the high-dimensional space used for data representation to 2D space for visualization
  • Data layout
  • Mistakes induced by data projection

What's Visual Analytics? Human Advising via HCI

What's Visual Analytics? Computer Interpretation of Human Advice
  • Must-link vs. not-link
  • Data clustering with constraints
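
One simple way to feed must-link and not-link advice back into graph-based clustering is to overwrite the corresponding entries of the affinity matrix before the spectral step. The slides do not say how the constraints are interpreted, so this is only a sketch of one possible option; the function name `apply_link_constraints` is hypothetical and affinities are assumed to lie in [0, 1].

```python
import numpy as np

def apply_link_constraints(A, must_link=(), not_link=()):
    """Return a copy of affinity matrix A with pairwise constraints written in.

    must_link / not_link: iterables of (i, j) index pairs.
    """
    A = np.array(A, dtype=float)  # np.array copies, so the input is untouched
    for i, j in must_link:
        A[i, j] = A[j, i] = 1.0   # force maximal similarity
    for i, j in not_link:
        A[i, j] = A[j, i] = 0.0   # remove the connection
    return A
```

The modified matrix can then be clustered again, so the human advice changes the graph rather than the clustering algorithm itself.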