Clustering: Unsupervised learning introduction (Machine Learning)

Supervised learning
Training set: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$
Andrew Ng

Unsupervised learning
Training set: $\{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$ (no labels $y$)

Applications of clustering
  • Market segmentation
  • Social network analysis
  • Organize computing clusters
  • Astronomical data analysis
Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)

Clustering: K-means algorithm (Machine Learning)

K-means algorithm
Input:
  • $K$ (number of clusters)
  • Training set $\{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$
$x^{(i)} \in \mathbb{R}^n$ (drop the $x_0 = 1$ convention)

K-means algorithm
Randomly initialize $K$ cluster centroids $\mu_1, \mu_2, \ldots, \mu_K \in \mathbb{R}^n$
Repeat {
    for $i$ = 1 to $m$:
        $c^{(i)}$ := index (from 1 to $K$) of cluster centroid closest to $x^{(i)}$
    for $k$ = 1 to $K$:
        $\mu_k$ := average (mean) of points assigned to cluster $k$
}
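The loop above can be sketched in NumPy as follows. This is a minimal illustration, not the lecture's own code: the function name `kmeans`, the `n_iters` and `seed` parameters, the stopping test, and the handling of empty clusters are all assumptions. Cluster indices run from 0 to K-1 here rather than 1 to K.

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimal K-means sketch. X has shape (m, n); returns (c, mu)."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Assumption: initialize the centroids to K distinct training examples.
    mu = X[rng.choice(m, size=K, replace=False)].astype(float)
    c = np.zeros(m, dtype=int)
    for _ in range(n_iters):
        # First inner loop: c[i] = index of the centroid closest to X[i].
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        c = dists.argmin(axis=1)
        # Second inner loop: mu[k] = mean of the points assigned to cluster k.
        new_mu = mu.copy()
        for k in range(K):
            members = X[c == k]
            if len(members) > 0:  # assumption: an empty cluster keeps its centroid
                new_mu[k] = members.mean(axis=0)
        converged = np.allclose(new_mu, mu)
        mu = new_mu
        if converged:  # assumption: stop once the centroids no longer move
            break
    return c, mu
```

Calling `c, mu = kmeans(X, K=2)` on an `(m, n)` array returns one cluster index per example and one centroid per cluster.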

K-means for non-separated clusters
[Figure: T-shirt sizing; axes: Height and Weight]

Clustering: Optimization objective (Machine Learning)

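K-means optimization objective
$c^{(i)}$ = index of cluster (1, 2, …, $K$) to which example $x^{(i)}$ is currently assigned
$\mu_k$ = cluster centroid $k$ ($\mu_k \in \mathbb{R}^n$)
$\mu_{c^{(i)}}$ = cluster centroid of cluster to which example $x^{(i)}$ has been assigned
Optimization objective:
$$J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K) = \frac{1}{m} \sum_{i=1}^{m} \lVert x^{(i)} - \mu_{c^{(i)}} \rVert^2$$
$$\min_{c^{(1)}, \ldots, c^{(m)},\; \mu_1, \ldots, \mu_K} J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$$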
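As a small worked example, the cost is the average squared distance from each example to the centroid of its assigned cluster. The function name `distortion` and the 0-based cluster indices are assumptions made for this sketch.

```python
import numpy as np

def distortion(X, c, mu):
    """Average squared distance from each example to its assigned centroid."""
    diffs = X - mu[c]  # mu[c] picks the assigned centroid for every example
    return float((diffs ** 2).sum(axis=1).mean())

X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
c = np.array([0, 0, 1])                   # assignments, 0-based
mu = np.array([[1.0, 0.0], [10.0, 0.0]])  # centroids
print(distortion(X, c, mu))  # squared distances are 1, 1, 0, so the cost is 2/3
```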

Clustering: Random initialization (Machine Learning)

Random initialization
Should have $K < m$.
Randomly pick $K$ training examples.
Set $\mu_1, \ldots, \mu_K$ equal to these $K$ examples.
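A minimal sketch of this initialization, assuming NumPy. The helper name `init_centroids` is hypothetical, and sampling without replacement (so the K chosen examples are distinct rows) is a choice made for this sketch.

```python
import numpy as np

def init_centroids(X, K, rng):
    """Pick K distinct training examples and use them as initial centroids."""
    m = X.shape[0]
    if K >= m:
        raise ValueError("K must be smaller than the number of examples m")
    idx = rng.choice(m, size=K, replace=False)  # K distinct row indices
    return X[idx].astype(float)

X = np.arange(10, dtype=float).reshape(5, 2)
mu = init_centroids(X, K=3, rng=np.random.default_rng(0))
```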

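Local optima
Depending on the random initialization, K-means can converge to a local optimum of the cost function rather than the best clustering.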

Random initialization
For i = 1 to 100 {
    Randomly initialize K-means.
    Run K-means. Get $c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K$.
    Compute cost function (distortion) $J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$.
}
Pick clustering that gave lowest cost $J$.
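The restart procedure can be sketched as below. The names `run_kmeans` and `best_of_n` are hypothetical, and `run_kmeans` is a compact stand-in that runs a fixed number of iterations instead of testing for convergence.

```python
import numpy as np

def run_kmeans(X, K, rng, n_iters=50):
    """One K-means run from a random start; returns (c, mu, cost)."""
    mu = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    for _ in range(n_iters):
        c = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        for k in range(K):
            if np.any(c == k):  # assumption: an empty cluster keeps its centroid
                mu[k] = X[c == k].mean(axis=0)
    # Final assignments and distortion for the centroids we ended with.
    c = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    cost = float(((X - mu[c]) ** 2).sum(axis=1).mean())
    return c, mu, cost

def best_of_n(X, K, n_runs=100, seed=0):
    """Run K-means n_runs times and keep the clustering with the lowest cost."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_runs):
        c, mu, cost = run_kmeans(X, K, rng)
        if best is None or cost < best[2]:
            best = (c, mu, cost)
    return best
```

Because every run draws from the same generator, the runs start from different random centroids while the whole procedure stays reproducible for a given `seed`.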

Clustering: Choosing the number of clusters (Machine Learning)

What is the right value of K?

Choosing the value of K
Elbow method:
[Figure: two plots of the cost function $J$ against $K$ (no. of clusters), for $K$ = 1 to 8]
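To produce the numbers behind such a plot, one can compute the cost for each candidate K and look for the point where the curve flattens. The sketch below assumes NumPy. The names `kmeans_cost` and `elbow_curve` are hypothetical, and taking the best of a few random restarts per K is a choice made here, not something the slide specifies.

```python
import numpy as np

def kmeans_cost(X, K, rng, n_iters=50):
    """One K-means run from a random start; returns the final cost."""
    mu = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    for _ in range(n_iters):
        c = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        for k in range(K):
            if np.any(c == k):  # assumption: an empty cluster keeps its centroid
                mu[k] = X[c == k].mean(axis=0)
    c = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return float(((X - mu[c]) ** 2).sum(axis=1).mean())

def elbow_curve(X, max_K=8, n_runs=10, seed=0):
    """Map each K in 1..max_K to the lowest cost found over n_runs restarts."""
    rng = np.random.default_rng(seed)
    return {K: min(kmeans_cost(X, K, rng) for _ in range(n_runs))
            for K in range(1, max_K + 1)}
```

Plotting the returned values against K gives the curve; the data must contain more than `max_K` examples.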

Choosing the value of K
Sometimes, you’re running K-means to get clusters to use for some later/downstream purpose. Evaluate K-means based on a metric for how well it performs for that later purpose.
E.g. T-shirt sizing
[Figure: T-shirt sizing; axes: Height and Weight]