Selftaught Clustering an instance of Transfer Unsupervised Learning
- Slides: 27
Self-taught Clustering – an instance of Transfer Unsupervised Learning †Wenyuan Dai joint work with ‡Qiang Yang, †Gui-Rong Xue, and †Yong Yu †Shanghai Jiao Tong University ‡Hong Kong University of Science and Technology
Outline � Motivation � Self-taught � Transfer Clustering Unsupervised Learning � Algorithm � Experiments � Conclusion
Outline � Motivation � Self-taught � Transfer Clustering Unsupervised Learning � Algorithm � Experiments � Conclusion
Clustering relies on the sufficiency of data?
When the data are sparse, …
Can sparse data be clustered well? � Sometimes, it is possible. A good data representation can make the clustering much easier.
A good representation may help transformation
Can Transfer Learning help? auxiliary data target data
Transfer Learning can help auxiliary data separate the auxiliary data via transformation target data
Outline � Motivation � Self-taught � Transfer Clustering Unsupervised Learning � Algorithm � Experiments � Conclusion
Problem Definition � Target �a Data small collection; to be clustered � Auxiliary �a Data large amount of; be irrelevant to the target data � Objective � enhance the clustering performance on the target data by making use of the auxiliary unlabeled data � Self-taught Learning Clustering/Transfer Unsupervised
Self-taught Learning training data auxiliary data test data (Raina et al. , ICML 2007)
Self-taught Clustering auxiliary data target data
Transfer Learning training data auxiliary data test data (Raina et al. , ICML 2006)
Transfer Unsupervised Learning auxiliary data target data
Outline � Motivation � Self-taught � Transfer Clustering Unsupervised Learning � Algorithm � Experiments � Conclusion
Self-taught Clustering via Co-clustering � Target �a Data small collection; to be clustered � Auxiliary �a Data large amount of; be irrelevant to the target data coclustering co-clustering
Objective Function � Notations = {x 1, . . . , xn} and Y ={y 1, . . . , ym} : X and Y correspond to the target and auxiliary data. � Z = {z 1, . . . , zk} corresponds to the feature space of both target and auxiliary data. � represent the clusterings on X, Y and Z �X � Object function for self-taught clustering: Information Theoretic Co-clustering (Dhillon et al. , KDD 2003)
Optimization choose the best minimize for z to
Optimization � Clustering target data auxiliary data features functions (iteratively)
Outline � Motivation � Self-taught � Transfer Clustering Unsupervised Learning � Algorithm � Experiments � Conclusion
Data Sets � Caltech-256 � 20 image corpus categories � eyeglass, sheet-music, airplane, ostrich, fern, starfish, guitar, laptop, hibiscus, ketch, cake, harp, car-side, tire, frog, cd, comet, vcr, diamond-ring, and skyscraper � For each clustering task, � The data from the corresponding categories are used as target unlabeled data. � The data from the remaining categories are used as the auxiliary unlabeled data.
Evaluation Criterion � Entropy � The entropy for a cluster is defined as total entropy is defined as the weighted sum of the entropy with respect to all the clusters
Experimental Results
Outline � Motivation � Self-taught � Transfer Clustering Unsupervised Learning � Algorithm � Experiments � Conclusion
Conclusion � We investigate the transfer unsupervised learning problem, called self-taught clustering. � Use irrelevant auxiliary unlabeled data to help the target clustering. � We develop a co-clustering based self-taught clustering algorithm. Two co-clusterings are performed simultaneously between target data and features, and between auxiliary data and features. � The two co-clusterings share a common feature clustering. � � The experiments show that our algorithm can improve clustering performance using irrelevant auxiliary unlabeled data.
Question?
- Unsupervised hierarchical clustering
- Transductive learning for unsupervised text style transfer
- Flat vs hierarchical clustering
- Partitional clustering
- Rumus euclidean distance
- Instance based learning in machine learning
- "deep reinforcement learning"
- Perbedaan supervised dan unsupervised classification
- Instance based learning example
- Learning instance occlusion for panoptic segmentation
- Unsupervised learning in data mining
- Autoencoders
- Ann unsupervised learning
- Workspca
- Unsupervised learning
- Andrew ng intro machine learning
- Vas3k machine learning
- Contractive autoencoder
- Supervised and unsupervised learning
- Self-taught learning: transfer learning from unlabeled data
- Cuadro comparativo entre e-learning b-learning y m-learning
- Instance variables
- Instance de socialisation
- L'enfant sauvage socialisation
- Instance weighting for domain adaptation in nlp
- Emr normalized instance hours
- Table instance chart
- Instance of class