Neural Network Homework Report: Clustering of the Self-Organizing Map
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 11, NO. 3, MAY 2000
Professor: Hahn-Ming Lee
Student: Hsin-Chung Chen (M9315928)

OUTLINE
o INTRODUCTION
o CLUSTERING
o SOM CLUSTERING
o EXPERIMENTS
o CONCLUSION

INTRODUCTION
o Data mining process
  - problem definition
  - data acquisition
  - data preprocessing and survey
  - data modeling
  - evaluation
  - knowledge deployment
o Self-organizing map (SOM) features
  - dimensionality reduction by unsupervised learning
  - can be applied to very large numbers of samples
  - the original data set is represented by a smaller set of prototype vectors
  - the aim is not to find an optimal clustering but to get good ...

CLUSTERING
o Two main approaches
  - Hierarchical approaches
    - agglomerative algorithm: bottom-up strategy to build a hierarchical clustering tree
    - divisive algorithm: top-down strategy to build a hierarchical clustering tree
  - Partitive approaches
    - k-means
o An optimal clustering is a partitioning that
  - minimizes distances within clusters
  - maximizes distances between clusters
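
The "optimal clustering" criterion above can be made concrete with two simple measures. The sketch below is only an illustration: the function names, the centroid-based definitions of within- and between-cluster distance, and the toy points are assumptions, not taken from the slides.

```python
import math

def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def within_cluster_distance(clusters):
    """Sum over clusters of the mean distance from each point to its centroid."""
    total = 0.0
    for pts in clusters:
        c = centroid(pts)
        total += sum(math.dist(p, c) for p in pts) / len(pts)
    return total

def between_cluster_distance(clusters):
    """Smallest distance between any two cluster centroids."""
    cents = [centroid(pts) for pts in clusters]
    return min(math.dist(a, b)
               for i, a in enumerate(cents) for b in cents[i + 1:])

# Two partitions of the same four toy points (hypothetical data).
good = [[(0, 0), (0, 1)], [(5, 5), (5, 6)]]
bad = [[(0, 0), (5, 5)], [(0, 1), (5, 6)]]

print(within_cluster_distance(good), between_cluster_distance(good))
print(within_cluster_distance(bad), between_cluster_distance(bad))
```

A partition that groups nearby points scores lower on the first measure and higher on the second, which is the trade-off the slide describes.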

CLUSTERING (cont.)

SOM CLUSTERING
o SOM training
  - first, the best matching unit (BMU) is found for the current training vector
  - then the prototype vectors are updated

o Characteristics of the SOM algorithm
  - applicable to large data sets
  - the computational complexity scales linearly with the number of data samples
  - it does not require huge amounts of memory: basically just the prototype vectors and the current training vector
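
The two training steps and the memory claim can be illustrated with a single sequential update. This is a minimal sketch, assuming a Gaussian neighborhood function on a rectangular grid; the function names and the random initialization are hypothetical and not taken from the slides.

```python
import math
import random

def find_bmu(prototypes, x):
    """Index of the prototype closest to sample x (the best matching unit)."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i], x))

def som_step(prototypes, grid, x, alpha, sigma):
    """One sequential SOM update, in place.

    Every prototype moves toward x, weighted by a Gaussian of its
    grid distance to the BMU (assumed neighborhood function).
    """
    b = find_bmu(prototypes, x)
    for i, m in enumerate(prototypes):
        d2 = sum((gi - gb) ** 2 for gi, gb in zip(grid[i], grid[b]))
        h = math.exp(-d2 / (2.0 * sigma ** 2))
        prototypes[i] = [mj + alpha * h * (xj - mj) for mj, xj in zip(m, x)]
    return b

# A 19 x 17 map of 2-D prototypes with random starting values.
rows, cols = 19, 17
grid = [(r, c) for r in range(rows) for c in range(cols)]
rng = random.Random(0)
prototypes = [[rng.random(), rng.random()] for _ in grid]

# Only the prototypes and the current sample are held in memory.
som_step(prototypes, grid, x=[0.3, 0.7], alpha=0.5, sigma=10.0)
```

Each call touches every prototype once, so a pass over the data costs time proportional to the number of samples, in line with the linear scaling noted above.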

EXPERIMENTS
o Tools: SOM_ToolBox 2.0
o Data set: clown.dat
  - the data set consists of 2220 2-D samples
    - cluster with three subclusters (right eye)
    - spherical cluster (left eye)
    - elliptical cluster (nose)
    - nonspherical cluster (U-shaped: mouth)
    - large and sparse cluster (body)
    - noise (such as the black x marks)

o Methods and parameters
  - Cluster step 1: training parameters of the SOM
    - map size: 19 x 17
    - initial neighborhood widths
      - rough phase σ1(0): 10
      - fine-tuning phase σ2(0): 2
    - learning rates (the learning rate decreases linearly to zero during training)
      - rough phase: 0.5
      - fine-tuning phase: 0.05
  - Cluster step 2
    - method: k-means using 100 runs
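
The schedule and the second clustering step can be sketched as follows. The slides do not say how the best of the 100 k-means runs is chosen; the sketch assumes the run with the lowest sum of squared errors is kept. All function names and the toy data are hypothetical.

```python
import math
import random

def linear_decay(alpha0, t, total_steps):
    """Learning rate falling linearly from alpha0 at t=0 to zero at t=total_steps."""
    return alpha0 * (1.0 - t / total_steps)

def assign(points, cents):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(cents)), key=lambda j: math.dist(p, cents[j]))
            for p in points]

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means; returns (centroids, labels, sse)."""
    cents = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, cents)
        new_cents = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_cents.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_cents.append(cents[j])  # keep the centroid of an empty cluster
        if new_cents == cents:
            break
        cents = new_cents
    labels = assign(points, cents)
    sse = sum(math.dist(p, cents[l]) ** 2 for p, l in zip(points, labels))
    return cents, labels, sse

def best_of_runs(points, k, runs=100, seed=0):
    """Repeat k-means from random starts and keep the lowest-SSE run (assumed criterion)."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(runs)), key=lambda r: r[2])

# Toy stand-in for the SOM prototype vectors (hypothetical data).
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cents, labels, sse = best_of_runs(toy, k=2)
print(labels, sse)
print(linear_decay(0.5, 5, 10))  # halfway through the rough phase
```

In the two-step method the input to `best_of_runs` would be the 19 x 17 = 323 trained prototype vectors rather than the 2220 raw samples, which is what keeps 100 repeated runs cheap.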

EXPERIMENTAL RESULTS

Single linkage dendrogram of the 323 SOM map units

SOM map; average linkage dendrogram of the 323 SOM map units

Complete linkage dendrogram of the 323 SOM map units
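
The three dendrograms differ only in how the distance between two clusters is defined during agglomerative merging. The sketch below shows the standard definitions of the three linkages; the function names and the toy clusters are hypothetical, and in the experiment the points would be SOM prototype vectors.

```python
import math

def pairwise(a, b):
    """All distances between a point of cluster a and a point of cluster b."""
    return [math.dist(p, q) for p in a for q in b]

def single_linkage(a, b):
    """Distance between the closest pair of points."""
    return min(pairwise(a, b))

def complete_linkage(a, b):
    """Distance between the farthest pair of points."""
    return max(pairwise(a, b))

def average_linkage(a, b):
    """Mean distance over all cross-cluster pairs."""
    d = pairwise(a, b)
    return sum(d) / len(d)

# Two toy clusters on a line (hypothetical data).
a = [(0, 0), (0, 1)]
b = [(0, 3), (0, 5)]
print(single_linkage(a, b), average_linkage(a, b), complete_linkage(a, b))
```

At each step the agglomerative algorithm merges the two clusters with the smallest linkage distance, so the choice of linkage changes the shape of the resulting tree.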

CONCLUSION