Clustering Methods: Part 9
Self-organizing map
Pasi Fränti
Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, FINLAND
SOM main principles
• Self-organizing map (SOM) is a clustering method especially suitable for visualization.
• The clustering is represented by centroids organized in a 1-d or 2-d network.
• Dimensionality reduction and the possibility of visualization are achieved as a side product.
• Clustering is performed by the competitive learning principle.
Self-organizing map: initial configuration
• M nodes, one for each cluster.
• Nodes are connected by the network topology (1-d or 2-d).
• Initial locations are not important.
Self-organizing map: final configuration
• Node locations adapt during the learning stage.
• The network keeps neighbor vectors close to each other.
• The network limits the movement of vectors during learning.
SOM pseudo code (1/2): Learning stage
SOM pseudo code (2/2): Update centroids
Competitive learning
• Each data vector is processed once.
• The nearest centroid is found for the data vector.
• The centroid is updated by moving it towards the data vector.
• The learning stage is similar to k-means, but the centroid update follows a different principle.
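The two steps above can be sketched in a few lines of Python. This is a minimal sketch, not the slides' own formula: it assumes squared Euclidean distance for the nearest-centroid search and a step of the form c + alpha·(x − c), where `alpha` is a learning rate between 0 and 1. The function names are hypothetical.

```python
def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x (squared Euclidean distance, assumed)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[j])),
    )


def move_towards(c, x, alpha):
    """Move centroid c a fraction alpha of the way towards data vector x (assumed update form)."""
    return [ci + alpha * (xi - ci) for ci, xi in zip(c, x)]


centroids = [[0.0, 0.0], [10.0, 10.0]]
x = [8.0, 9.0]

j = nearest_centroid(x, centroids)            # the winner for this data vector
centroids[j] = move_towards(centroids[j], x, alpha=0.5)
```

Unlike k-means, which recomputes each centroid as the mean of its whole partition, this update nudges one centroid per data vector as the vectors arrive.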
Learning rate
• Decreases with time, so movement is large in the beginning but eventually stabilizes.
• Two alternatives: linear decrease of weighting and exponential decrease of weighting.
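A decreasing learning rate can be sketched as below. The exact formulas are not preserved in the text, so both forms are assumptions: a linear schedule A·(1 − t/T) and an exponential schedule A·exp(−t/T), where A is the maximum learning rate, T the number of iterations and t the current iteration. The function names are hypothetical.

```python
import math


def linear_rate(t, T, A):
    """Learning rate that falls linearly from A at t=0 to 0 at t=T (assumed form)."""
    return A * (1.0 - t / T)


def exponential_rate(t, T, A):
    """Learning rate that decays exponentially from A at t=0 to A/e at t=T (assumed form)."""
    return A * math.exp(-t / T)


# Both schedules start at A; the linear one reaches zero, the exponential one does not.
rates = [(t, linear_rate(t, 100, 0.5), exponential_rate(t, 100, 0.5)) for t in (0, 50, 100)]
```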
Neighborhood (d)
• Neighboring centroids are also updated.
• The effect is stronger for nearby centroids.
Weighting of the neighborhood
• Weighting decreases exponentially.
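The neighborhood update can be illustrated for a 1-d network (a chain of centroids). This is a sketch under stated assumptions: the weight for a centroid d steps away from the winner is taken as exp(−d), which is one possible exponentially decreasing weighting, and only centroids within D steps are touched. `neighborhood_weight` and `update_with_neighbors` are hypothetical names.

```python
import math


def neighborhood_weight(d):
    """Weight for a centroid d steps from the winner; exp(-d) is an assumed form."""
    return math.exp(-d)


def update_with_neighbors(centroids, winner, x, alpha, D):
    """Update the winner and its neighbors within D steps on a 1-d chain, in place."""
    lo = max(0, winner - D)
    hi = min(len(centroids), winner + D + 1)
    for j in range(lo, hi):
        w = alpha * neighborhood_weight(abs(j - winner))
        centroids[j] = [c + w * (xi - c) for c, xi in zip(centroids[j], x)]


centroids = [[0.0], [1.0], [2.0], [3.0]]
update_with_neighbors(centroids, winner=1, x=[1.5], alpha=0.5, D=1)
# The winner (index 1) moves the most; indices 0 and 2 move less; index 3 is untouched.
```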
Parameter setup
• Number of iterations T
  – Convergence of SOM is rather slow, so T should be set as high as possible.
  – Roughly 100-1000 iterations at minimum.
• Size of the initial neighborhood Dmax
  – Small enough to allow local adaptation.
  – Value D=0 indicates no neighbor structure.
• Maximum learning rate A
  – Higher values have mostly random effect.
  – Most critical are the final stages (D ≤ 2).
  – Optimal choices of A and Dmax are highly correlated.
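To show how T, Dmax and A might interact, here is a minimal 1-d SOM training loop. It is a sketch under several assumptions that the text does not confirm: the neighborhood size D shrinks from Dmax down to 0, T iterations are run for each D, the learning rate restarts at A and falls linearly within each stage, and the neighbor weight is exp(−d). The function name `train_som_1d` and its `seed` parameter are hypothetical.

```python
import math
import random


def train_som_1d(data, M, T, D_max, A, seed=0):
    """Train a 1-d SOM with M centroids (illustrative sketch, not the original pseudo code)."""
    rng = random.Random(seed)
    # Initial locations are not important: start from randomly picked data vectors.
    centroids = [list(rng.choice(data)) for _ in range(M)]
    for D in range(D_max, -1, -1):            # assumed: neighborhood shrinks to D=0
        for t in range(T):                    # assumed: T iterations per neighborhood size
            alpha = A * (1.0 - t / T)         # assumed: linear decrease from A
            for x in data:
                winner = min(
                    range(M),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])),
                )
                # Update the winner and its chain neighbors within D steps.
                for j in range(max(0, winner - D), min(M, winner + D + 1)):
                    w = alpha * math.exp(-abs(j - winner))
                    centroids[j] = [c + w * (a - c) for c, a in zip(centroids[j], x)]
    return centroids


data = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
result = train_som_1d(data, M=2, T=20, D_max=1, A=0.5)
```

With D=0 the inner loop touches only the winner, so the last stage reduces to plain competitive learning without neighbor structure. The cost grows with both T and Dmax, which is why the two parameters have to be balanced against each other.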
Difficulty of parameter setup
• Total number of iterations (T·Dmax) fixed to 20, 40 and 80.
• The optimal parameter combination is non-trivial.
Adaptive number of iterations
• To reduce the effect of parameter setup, the number of iterations should be as high as possible.
• This gives enough time to adapt, at the cost of high time complexity.
• Adaptive number of iterations Ti, for example with Dmax=10 and Tmax=100: Ti = {1, 1, 2, 3, 6, 13, 25, 50, 100}
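The rule that produces the adaptive iteration counts is not preserved in the text. As a guess only, the listed values are consistent with repeatedly halving Tmax, rounding half up and never going below 1. The sketch below reproduces the nine listed values under that assumption; it does not explain how nine values map onto Dmax=10, and `halving_schedule` and `n_levels` are hypothetical names.

```python
import math


def halving_schedule(t_max, n_levels):
    """Iteration counts obtained by halving t_max (assumed rule), smallest first."""
    # floor(v + 0.5) rounds half up; max(1, ...) keeps at least one iteration per level.
    values = [max(1, math.floor(t_max / 2 ** k + 0.5)) for k in range(n_levels)]
    return values[::-1]


schedule = halving_schedule(100, 9)
```

Under this reading, most of the iterations are spent in the last stages, which matches the remark that the final stages are the most critical.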
Example of SOM (1-d)
[Figure: 1-d SOM result, annotated "One cluster missing" and "One cluster too many"]
Example of SOM (2-d)
(to appear sometime in the future)