Clustering Using Pairwise Comparisons R. Srikant ECE/CSL University of Illinois at Urbana-Champaign
Coauthors: Barbara Dembin, Siddhartha Satpathi • Builds on the work in R. Wu, J. Xu, R. Srikant, L. Massoulié, M. Lelarge, and B. Hajek, "Clustering and Inference from Pairwise Comparisons" (arXiv:1502.04631v2)
Outline • Traditional Noisy Pairwise Comparisons • Our Problem: Clustering Users • Algorithm in Prior Work • New Algorithm • Conclusions
Noisy pairwise comparisons • [Figure: Amazon DSLR camera listings; the user buys one of the items shown] • Item 1 < item 2; item 3 < item 2 • Goal: Infer information about user preferences from such pairwise rankings
Bradley-Terry model
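The slide's formula is not preserved in this transcript. As a reminder, the standard Bradley-Terry model gives each item a score and makes the probability that item i is preferred over item j a logistic function of the score difference. The sketch below uses log-scale scores; the function name and the example scores are illustrative assumptions, not taken from the talk.

```python
import math

def bt_win_probability(score_i: float, score_j: float) -> float:
    """P(item i is preferred over item j) under Bradley-Terry with log-scale scores."""
    # exp(s_i) / (exp(s_i) + exp(s_j)) rewritten as a logistic function.
    return 1.0 / (1.0 + math.exp(score_j - score_i))

# Hypothetical scores for three items.
scores = [0.0, 1.0, 0.5]
p_item2_over_item1 = bt_win_probability(scores[1], scores[0])
```

Only score differences matter, so adding a constant to every score leaves all comparison probabilities unchanged.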
Maximum likelihood estimation
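The derivation on this slide is not preserved. One common way to compute the Bradley-Terry maximum likelihood estimate is the classical minorization-maximization (Zermelo) iteration, sketched below. This is a standard technique and not necessarily the estimator used in the talk; the function name `bt_mle` and the input layout are assumptions. The iteration needs every item to have at least one win and the comparison graph to be connected.

```python
import numpy as np

def bt_mle(wins: np.ndarray, iters: int = 500) -> np.ndarray:
    """wins[i, j] = number of times item i beat item j (zero diagonal).

    Returns Bradley-Terry weights normalised to sum to 1.
    """
    n = wins.shape[0]
    games = wins + wins.T              # n_ij: comparisons between i and j
    total_wins = wins.sum(axis=1)      # W_i: total wins of item i
    w = np.full(n, 1.0 / n)
    for _ in range(iters):
        denom = games / (w[:, None] + w[None, :])   # n_ij / (w_i + w_j)
        np.fill_diagonal(denom, 0.0)
        w = total_wins / denom.sum(axis=1)          # MM update
        w = w / w.sum()                             # fix the scale
    return w

# Two items: item 0 won 3 of 4 comparisons, so its weight share is 3/4.
weights = bt_mle(np.array([[0.0, 3.0], [1.0, 0.0]]))
```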
Clustering Users & Ranking Items • [Figure: Amazon camera listings] • Different types of users use different score vectors • Cluster users of the same type together, and then estimate the Bradley-Terry parameters for each cluster
Generalized Bradley-Terry model
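The model definition on this slide is not preserved. One natural reading of "different types of users use different score vectors" is written out below as an assumption: there are r clusters, user u belongs to cluster c(u), and cluster k has its own score vector. The symbols c(u) and θ^(k) are introduced here for illustration only.

```latex
% Assumed form: each cluster k has its own Bradley-Terry score vector \theta^{(k)}.
\[
  \Pr\bigl(\text{user } u \text{ prefers item } i \text{ over item } j\bigr)
  = \frac{\exp\bigl(\theta^{(c(u))}_i\bigr)}
         {\exp\bigl(\theta^{(c(u))}_i\bigr) + \exp\bigl(\theta^{(c(u))}_j\bigr)},
  \qquad c(u) \in \{1, \dots, r\}.
\]
```

With r = 1 this reduces to the ordinary Bradley-Terry model.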
Questions • We focus on the clustering problem • Once users are clustered, parameter estimation can be performed using other techniques; the results here don't explicitly depend on the Bradley-Terry model • What is the minimum number of samples (pairwise comparisons) needed to cluster the users from pairwise comparison data? • What algorithm should we use to achieve this limit? • We will answer these questions in reverse order
Net Wins Matrix • [Figure: example net wins matrix for 4 items; columns are indexed by the item pairs (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), and entries take values in {1, 0, -1}]
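A minimal sketch of how such a matrix could be assembled, assuming one row per user and one column per item pair (i, j) with i < j, and assuming an entry counts how often the user preferred i minus how often the user preferred j (0 when the pair was never compared). The `(user, winner, loser)` input format and the function name are hypothetical.

```python
import numpy as np
from itertools import combinations

def net_wins_matrix(comparisons, num_users, num_items):
    """comparisons: iterable of (user, winner, loser), all 0-indexed."""
    pairs = list(combinations(range(num_items), 2))   # (i, j) with i < j
    col = {p: c for c, p in enumerate(pairs)}
    S = np.zeros((num_users, len(pairs)))
    for user, winner, loser in comparisons:
        if winner < loser:
            S[user, col[(winner, loser)]] += 1   # lower-indexed item preferred
        else:
            S[user, col[(loser, winner)]] -= 1   # higher-indexed item preferred
    return S, pairs

# User 0 prefers item 0 to item 1 and item 2 to item 1; user 1 prefers item 3 to item 0.
S, pairs = net_wins_matrix([(0, 0, 1), (0, 2, 1), (1, 3, 0)], num_users=2, num_items=4)
```

With a single comparison per pair, every entry lies in {1, 0, -1}, as in the slide's example.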
Why Net Wins Matrix?
Spectral Clustering
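The content of the spectral clustering slides is not preserved. A generic sketch of the idea, assuming a users-by-pairs matrix S and a known number of clusters r, is to project the rows of S onto the top r singular directions and then run k-means on the projected rows. The details below (farthest-point initialisation, plain Lloyd iterations, the function name) are illustrative choices, not the algorithm from the prior work.

```python
import numpy as np

def spectral_cluster(S: np.ndarray, r: int, iters: int = 50) -> np.ndarray:
    """Cluster the rows of S into r groups; returns one integer label per row."""
    # Rank-r embedding: top-r left singular vectors scaled by singular values.
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    X = U[:, :r] * sigma[:r]
    # Deterministic farthest-point initialisation of the r centres.
    centres = [X[0]]
    for _ in range(r - 1):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d)])
    centres = np.array(centres)
    # Lloyd iterations (k-means) on the embedded rows.
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(r):
            if np.any(labels == k):
                centres[k] = X[labels == k].mean(axis=0)
    return labels

# Two groups of users with opposite preferences on most pairs.
S = np.array([[1, 1, -1, 1], [1, 1, -1, 0], [1, 0, -1, 1],
              [-1, -1, 1, -1], [-1, -1, 1, 0], [-1, 0, 1, -1]], dtype=float)
labels = spectral_cluster(S, r=2)
```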
Outline of the Algorithm • Split the items into different partitions, and only consider the pairwise comparison data within each partition (inspired by (Vu, 2014) for community detection) • Apply the previous algorithm to each data partition, and cluster the users based on the information in each partition • This can produce inconsistent clusters: users 1 and 2 may be in the same cluster in one partition but not in another. Which clustering is correct? • Use simple majority voting to correct errors, i.e., assign each user to the cluster to which it belongs most often
Data Partitioning • [Figure: a net wins matrix over all 15 pairs of 6 items is reduced to the columns for pairs within each partition: (1, 2), (1, 3), (2, 3) for items {1, 2, 3} and (4, 5), (4, 6), (5, 6) for items {4, 5, 6}]
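A small sketch of the partitioning step, assuming each item is assigned independently and uniformly at random to one of L sets and that only pairs with both items in the same set are kept. The function names, the seed argument, and the independent-assignment rule are assumptions made for illustration.

```python
import random
from itertools import combinations

def partition_items(num_items, L, seed=0):
    """Assign each item (0-indexed) to one of L sets uniformly at random."""
    rng = random.Random(seed)
    parts = [[] for _ in range(L)]
    for item in range(num_items):
        parts[rng.randrange(L)].append(item)
    return parts

def within_partition_pairs(parts):
    """For each partition, list the item pairs (i, j), i < j, that lie inside it."""
    return [list(combinations(sorted(p), 2)) for p in parts]

# The split from the slide, written 0-indexed: items {0, 1, 2} and {3, 4, 5}.
kept = within_partition_pairs([[0, 1, 2], [3, 4, 5]])
```

Of the 15 pairs among 6 items, only the 6 within-partition pairs survive in this example; cross-partition comparisons are discarded.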
Cluster Users Based on Each Partition • [Figure: L net wins matrices, one per item partition (Partition 1, ..., Partition L); spectral clustering is applied to each matrix, producing L different clusterings of the users into clusters 1, ..., r]
Numbering the Clusters • Number the clusters 1, 2, …, r arbitrarily in the results from the first data partition • For the second partition, the cluster which overlaps the most with cluster 1 in Partition 1 is called cluster 1, the cluster which overlaps the most with cluster 2 in Partition 1 is called cluster 2, and so on • [Figure: clusters in Partitions 1 to 4, first unlabeled in Partitions 2 to 4 and then renumbered to match Partition 1]
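The renumbering rule can be sketched as a greedy overlap match against the reference partition. The sketch assumes 0-indexed labels, and it assumes that a cluster already renamed is not reused and that ties go to the lower index; the slide does not specify either point.

```python
import numpy as np

def align_labels(ref: np.ndarray, labels: np.ndarray, r: int) -> np.ndarray:
    """Rename clusters in `labels` so cluster k is the one overlapping most with ref cluster k."""
    # overlap[k, c] = number of users in ref cluster k and in cluster c of `labels`.
    overlap = np.zeros((r, r), dtype=int)
    for a, b in zip(ref, labels):
        overlap[int(a), int(b)] += 1
    mapping = {}
    for k in range(r):
        # Greedy pick among clusters not yet renamed (an assumption, see above).
        free = [c for c in range(r) if c not in mapping]
        best = max(free, key=lambda c: overlap[k, c])
        mapping[best] = k
    return np.array([mapping[int(c)] for c in labels])

ref = np.array([0, 0, 0, 1, 1, 2, 2])      # labels from the first partition
other = np.array([2, 2, 1, 0, 0, 1, 1])    # same users, arbitrary numbering
aligned = align_labels(ref, other, r=3)
```

After alignment, any remaining disagreement with the reference (the third user in this example) reflects a genuine clustering difference rather than a naming difference.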
Clustering the Users • A user may belong to cluster 1 in one partition, but may belong to some other cluster in another partition • Majority voting across partitions determines the cluster assigned to each user • [Figure: cluster labels of the users in Partitions 1 to 4]
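Majority voting over consistently numbered clusterings can be sketched as follows. The layout (one row per partition, one column per user, 0-indexed labels) and the tie-breaking toward the smaller label are assumptions.

```python
import numpy as np

def majority_vote(aligned: np.ndarray, r: int) -> np.ndarray:
    """aligned[l, u] = label of user u in partition l, with labels already aligned."""
    # votes[k, u] = number of partitions that put user u in cluster k.
    votes = np.stack([(aligned == k).sum(axis=0) for k in range(r)])
    # argmax breaks ties toward the smaller label (an arbitrary choice).
    return votes.argmax(axis=0)

# Four partitions, three users; users 0 and 1 are each mislabeled once.
aligned = np.array([[0, 1, 2],
                    [0, 1, 2],
                    [1, 1, 2],
                    [0, 2, 2]])
final = majority_vote(aligned, r=3)
```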
Summary of the algorithm • Partition items uniformly into L sets • Form a Net Wins matrix for each partition (Partition 1, ..., Partition L) • Apply spectral clustering to each Net Wins matrix to obtain clusters 1, ..., r in each partition • Apply majority voting across the L clusterings to obtain the final clustering of users
Main Result
Outline of the Proof: Part I
Outline of the Proof: Part II
Outline of the Proof: Part III
Lower Bound on Sample Complexity • Event A: two users from different clusters have no pairwise comparisons • If A occurs, the users cannot all be clustered correctly
Main Result
Related Work • Vu (2014): exact cluster recovery in community detection through spectral methods; partitions the data into two sets, using one for clustering and the other to correct errors in the recovered clusters • Lu-Negahban (2014): Bradley-Terry parameters are different for each user, but form a low-rank matrix • Park, Neeman, Zhang, Sanghavi (2015): related to the model above, but with a different algorithm • Oh, Thekumparampil, Xu (2015): generalization to multi-item rankings
Conclusions