A Heuristic K-means Clustering Algorithm by Kernel PCA
Mantao Xu and Pasi Fränti
University of Joensuu, Department of Computer Science, Joensuu, Finland
Problem Formulation
Given N data samples X = {x_1, x_2, …, x_N}, construct the codebook C = {c_1, c_2, …, c_M} such that the mean-square error is minimized. The class membership of sample x_i is denoted p(i).
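The equations on this slide did not survive extraction. As a hedged reading of the formulation above, the standard K-means objective and nearest-centroid membership rule are written out below. The exact normalization used on the original slide is an assumption (some texts also divide by the vector dimension).

```latex
% Assumed standard form; the original slide's normalization may differ.
\mathrm{MSE}(C, P) = \frac{1}{N} \sum_{i=1}^{N} \left\lVert x_i - c_{p(i)} \right\rVert^2,
\qquad
p(i) = \arg\min_{1 \le j \le M} \left\lVert x_i - c_j \right\rVert^2 .
```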
Traditional K-Means Algorithm
Iterations of two steps:
- assignment of each data vector to a class label
- computation of each cluster centroid by averaging all data vectors assigned to it
Characteristics:
- Randomized initial partition or codebook
- Convergence to a local minimum
- Use of L2, L1 and L∞ distance
- Fast and easy implementation
Extensions:
- Kernel K-means algorithm
- EM algorithm
- K-median algorithm
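The two-step iteration above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `squared_l2` and `kmeans`, the fixed seed, and the choice of the squared L2 distance are all assumptions made for the example.

```python
import random

def squared_l2(a, b):
    """Squared Euclidean (L2) distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(X, M, iters=100, seed=0):
    """Plain K-means: returns (codebook C, class labels P)."""
    rng = random.Random(seed)
    # Randomized initial codebook: M distinct samples drawn from X.
    C = [list(x) for x in rng.sample(X, M)]
    P = None
    for _ in range(iters):
        # Step 1: assign each data vector to its nearest centroid.
        new_P = [min(range(M), key=lambda j: squared_l2(x, C[j])) for x in X]
        if new_P == P:
            break  # partition unchanged: a local minimum has been reached
        P = new_P
        # Step 2: recompute each centroid as the mean of its members.
        for j in range(M):
            members = [x for x, p in zip(X, P) if p == j]
            if members:  # keep the old centroid if a cluster is empty
                C[j] = [sum(col) / len(members) for col in zip(*members)]
    return C, P
```

Because the initial codebook is random, different seeds can end in different local minima, which is the weakness the initialization scheme in these slides targets.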
Motivation
Investigate a clustering algorithm that:
- runs the conventional K-Means algorithm to search for a solution close to the global optimum
- estimates an initial partition close to the optimal solution
- applies a dissimilarity function based on the current partition instead of the L2 distance
Selecting the initial partition
Based on a kernel feature extraction approach and dynamic programming (DP):
1. Construct a 1-D subspace by kernel PCA.
2. Find a suboptimal partition by DP in the 1-D subspace.
3. Output the partition from the DP as the initial solution to the K-means algorithm.
Kernel PCA vs. PCA
Solve principal component analysis (PCA) in the reproducing kernel space F, which implicitly describes the irregular shape of the data with a nonlinear hypercurve.
Figure panels: PCA; Kernel PCA
Problem formulation of kernel PCA
For kernel PCA on the N data samples X = {x_i}, solve the eigenvalue problem in F. This is assumed to be equivalent to an eigenvalue problem on the kernel matrix K with respect to the data X, where the eigenvector V is a linear expansion of the mapped samples.
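The slide's equations were lost in extraction. In the standard kernel PCA formulation, which is assumed here, the feature-space problem λV = C̄V reduces to Nλα = Kα with V = Σ_i α_i Φ(x_i) and K_ij = Φ(x_i)·Φ(x_j). The sketch below computes the projection of each sample onto the first kernel component in plain Python. The RBF kernel, its `gamma` value, the power-iteration solver, and all function names are illustrative assumptions, not details taken from the slides.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel (an assumed choice; the slides name no kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def centered_kernel_matrix(X, kernel=rbf_kernel):
    """Kernel matrix of X, centered in feature space."""
    n = len(X)
    K = [[kernel(xi, xj) for xj in X] for xi in X]
    row_mean = [sum(row) / n for row in K]  # K is symmetric: row = column means
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def first_kernel_component(X, kernel=rbf_kernel, iters=1000, tol=1e-12):
    """Project every sample onto the leading kernel principal component."""
    Kc = centered_kernel_matrix(X, kernel)
    n = len(X)
    # Deterministic, non-constant start vector (constants lie in Kc's null space).
    v = [1.0 + 0.01 * i for i in range(n)]
    for _ in range(iters):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # degenerate data: no variance in feature space
        v_new = [c / norm for c in w]
        converged = max(abs(a - b) for a, b in zip(v_new, v)) < tol
        v = v_new
        if converged:
            break
    # Rayleigh quotient gives the leading eigenvalue of Kc.
    Kv = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
    vv = sum(c * c for c in v)
    lam = sum(a * b for a, b in zip(v, Kv)) / vv if vv else 0.0
    # With alpha = v / sqrt(lam) so that V has unit norm, the projection of
    # sample i is (Kc @ alpha)_i = sqrt(lam) * v_i for a unit eigenvector v.
    scale = math.sqrt(max(lam, 0.0)) / math.sqrt(vv) if vv else 0.0
    return [scale * c for c in v]
```

The returned list is the 1-D representation of the data on which a partition can then be searched.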
Dynamic programming in the kernel component direction
The optimal convex partition Q_k = {(q_{j-1}, q_j] | j = 1, …, n} in the kernel component direction w can be obtained by dynamic programming, in terms of either
(1) the MSE distortion on the one-dimensional kernel component subspace, or
(2) the MSE distortion on the original feature space.
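As an illustration of the 1-D case, in the spirit of criterion (1), the sketch below finds the partition of scalar values into k contiguous intervals that minimizes the total squared error, using prefix sums and an O(k·n²) dynamic program. The function name `optimal_1d_partition` and its return format are hypothetical; the slides do not give the recurrence, so this is the textbook DP rather than the authors' exact procedure. It assumes 1 ≤ k ≤ len(values).

```python
def optimal_1d_partition(values, k):
    """Split scalar values into k contiguous groups with minimum total SSE.

    Returns (labels, sse), where labels[i] is the group index of values[i].
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    n = len(xs)
    # Prefix sums of x and x^2 give the SSE of any interval in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(a, b):
        """SSE of the sorted values xs[a:b] around their mean (b > a)."""
        s = s1[b] - s1[a]
        return (s2[b] - s2[a]) - s * s / (b - a)

    INF = float("inf")
    # D[j][i]: minimum SSE of splitting the first i sorted values into j groups.
    D = [[INF] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    D[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for t in range(j - 1, i):  # last group is xs[t:i]
                c = D[j - 1][t] + cost(t, i)
                if c < D[j][i]:
                    D[j][i] = c
                    back[j][i] = t
    # Trace the interval boundaries back and label the original positions.
    labels = [0] * n
    i = n
    for j in range(k, 0, -1):
        t = back[j][i]
        for r in range(t, i):
            labels[order[r]] = j - 1
        i = t
    return labels, D[k][n]
```

Applied to the 1-D projections of the data, the resulting labels can serve as the initial partition for K-means.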
Application of Delta-MSE Dissimilarity
When a vector x is moved from cluster i to cluster j, the change in the MSE function [10] caused by the move is used as the dissimilarity.
Figure: points x1 to x4 and y1 to y3 in clusters G1 and G2; Delta-MSE(x4, G2) = AddVariance, Delta-MSE(x4, G1) = RemovalVariance.
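The slide's formula was lost in extraction. A well-known closed form for the change in total squared error when a vector x moves from cluster i (size n_i, centroid c_i) to cluster j (size n_j, centroid c_j) is n_j/(n_j+1)·||x − c_j||² − n_i/(n_i−1)·||x − c_i||². The sketch below implements that standard result; whether [10] uses exactly this notation or normalization is an assumption, and the function names simply mirror the figure labels.

```python
def centroid(points):
    """Mean vector of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def squared_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def add_variance(x, cluster):
    """Increase in a cluster's squared error if x (not a member) joins it."""
    n = len(cluster)
    return n / (n + 1) * squared_dist(x, centroid(cluster))

def removal_variance(x, cluster):
    """Decrease in a cluster's squared error if member x leaves it.

    Requires at least two members, x included.
    """
    n = len(cluster)
    return n / (n - 1) * squared_dist(x, centroid(cluster))

def delta_sse(x, source, target):
    """Net change in total squared error when x moves from source to target."""
    return add_variance(x, target) - removal_variance(x, source)
```

A negative `delta_sse` means the move lowers the total distortion, so assigning x by these quantities rather than by plain L2 distance takes the current cluster sizes into account.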
Pseudocode of the heuristic K-Means
Four K-Means algorithms used in the experiments
- K-D tree based K-Means: selects the initial cluster centroids from the k-bucket centers of a kd-tree structure that is built recursively
- PCA-based K-Means: estimates a suboptimal initial partition by applying dynamic programming in the PCA direction
- KPCA-I: the proposed K-Means algorithm based on dynamic programming criterion (1)
- LFD-II: the proposed K-Means algorithm based on dynamic programming criterion (2)
Performance comparison 1
F-ratio validity index values for UCI data sets
Performance comparison 2
F-ratio validity index values for image data sets
F-ratio validity index values
Conclusions
- A new approach to the k-center clustering problem that incorporates kernel PCA and dynamic programming.
- The proposed approach is in general superior to the two other algorithms: the PCA-based and the kd-tree based K-Means.
- The gain in classification performance of the proposed approach over the two others increases with the number of clusters.
Further Work
- Solve the k-center clustering problem by iteratively incorporating kernel Fisher discriminant analysis and the dynamic programming technique.
- Solve the k-center clustering problem by boosting a decision function (apply a decision function f over X to obtain a scalar space f(X)).