National Yunlin University of Science and Technology (國立雲林科技大學)
Intelligent Database Systems Lab

Poisson-Based Self-Organizing Feature Maps and Hierarchical Clustering for Serial Analysis of Gene Expression Data

Presenter: Shao-Wei Cheng
Authors: Haiying Wang, Huiru Zheng, and Francisco Azuaje
TCBB 2007
N. Y. U. S. T. I. M.
Outline
- Motivation
- Objective
- Methodology
- Experiments and Results
- Conclusion
- Personal Comments
Motivation
- Serial analysis of gene expression (SAGE)
- PoissonC is an adaptation of the k-means algorithm, but it fails to provide users with a platform to explore complex relationships between clusters.
- SOM can be used for visualization, but it has shown poor performance on SAGE data analysis when using Euclidean distance.
Objectives
- To implement and evaluate an adaptation of the SOM algorithm (PoissonS)
  - Incorporates a Poisson-statistics-based distance function into the SOM learning process to support SAGE data analysis.
- To integrate the Poisson-based distance into a hierarchical clustering system (PoissonHC)
  - To be combined with PoissonS to further improve pattern discovery and visualization for large SAGE data sets.
Methodology
- Poisson distribution
- PoissonS
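
As a rough illustration of how a Poisson-based distance can stand in for Euclidean distance when a SOM picks its best-matching unit, here is a minimal sketch. It assumes the distance is the negative Poisson log-likelihood of a tag's counts under a prototype profile that has been rescaled to the tag's total count; this is an assumption and not necessarily the exact formulation used by the authors. The names `poisson_distance` and `best_matching_unit` are hypothetical.

```python
import math


def poisson_log_pmf(k, lam):
    """Log of the Poisson probability P(K = k) for a rate lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def poisson_distance(counts, profile):
    """Negative Poisson log-likelihood of `counts` under `profile`.

    `profile` holds positive weights, one per SAGE library. It is
    normalised to sum to 1 and multiplied by the tag's total count,
    so only the shape of the expression profile matters (assumption).
    """
    total = sum(counts)
    scale = sum(profile)
    dist = 0.0
    for k, w in zip(counts, profile):
        lam = max(total * w / scale, 1e-12)  # guard against log(0)
        dist -= poisson_log_pmf(k, lam)
    return dist


def best_matching_unit(counts, prototypes):
    """Index of the prototype with the smallest Poisson distance."""
    distances = [poisson_distance(counts, p) for p in prototypes]
    return distances.index(min(distances))


# Two hypothetical SOM prototypes: a flat profile and a peaked one.
prototypes = [[1, 1, 1], [1, 2, 1]]
tag = [20, 40, 20]  # counts of one tag across three libraries
print(best_matching_unit(tag, prototypes))  # → 1
```

Because the prototype is rescaled to each tag's total, a low-abundance tag such as `[2, 4, 2]` maps to the same unit as `[20, 40, 20]`, which is the kind of behaviour that a raw Euclidean distance on counts does not give.
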
Methodology
- PoissonHC
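
One plausible way to put a Poisson model inside agglomerative clustering is sketched below. It assumes that each cluster is summarised by its pooled expression profile and that the pair of clusters merged at each step is the one whose merge loses the least Poisson log-likelihood. This merge criterion and the names `cluster_cost`, `merge_cost`, and `poisson_hc` are illustrative assumptions, not a reproduction of the authors' PoissonHC.

```python
import math


def cluster_cost(members):
    """Negative Poisson log-likelihood of a cluster, constants dropped.

    The cluster profile is the pooled count per library divided by the
    pooled total; each member's expected count in a library is its own
    total count times that profile (assumption).
    """
    n_libs = len(members[0])
    pooled = [sum(m[j] for m in members) for j in range(n_libs)]
    grand = sum(pooled)
    cost = 0.0
    for m in members:
        total = sum(m)
        for j, k in enumerate(m):
            lam = total * pooled[j] / grand if grand else 0.0
            cost += lam
            if k > 0:  # k > 0 guarantees lam > 0
                cost -= k * math.log(lam)
    return cost


def merge_cost(a, b):
    """Log-likelihood lost when clusters a and b share one profile."""
    return cluster_cost(a + b) - cluster_cost(a) - cluster_cost(b)


def poisson_hc(tags, n_clusters):
    """Greedy agglomerative clustering down to n_clusters clusters."""
    clusters = [[t] for t in tags]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                cost = merge_cost(clusters[a], clusters[b])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical tag counts across three libraries.
tags = [[2, 4, 2], [9, 1, 1], [20, 40, 20], [90, 10, 10]]
for cluster in poisson_hc(tags, 2):
    print(cluster)
```

The target number of clusters is passed in by hand in this sketch; it does not choose the number of clusters automatically.
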
Experiments
- Data sets
  - Synthetic Data
  - Mouse Retinal SAGE Data
  - Human Cancer SAGE Data
- Compared algorithms:
  - SOM with Euclidean distance
  - SOM with Pearson Correlation
Experiments
- PoissonS
  - Synthetic Data
[Figure panels: Euclidean distance, Pearson Correlation]
Experiments
- PoissonHC
  - Mouse Retinal SAGE Data
[Figure panels: Euclidean distance, Pearson Correlation]
Experiments
- PoissonHC
  - Human Cancer SAGE tags
[Figure label: SAGE libraries]
Experiments
- PoissonS + PoissonHC
  - Mouse Retinal SAGE Data
Conclusion
- By incorporating a Poisson-statistics-based distance into the SOM learning algorithm and hierarchical clustering techniques, significant improvements in pattern discovery and visualization for SAGE data are achieved.
Personal Comments
- Advantage
  - Incorporates a Poisson-statistics-based distance function into the SOM and hierarchical clustering.
- Drawback
  - Cannot determine the optimal number of clusters automatically.
- Application
  - Clustering of SAGE data.
- Slides: 13