Fused Matrix Factorization with Geographical and Social Influence


Fused Matrix Factorization with Geographical and Social Influence in Location-based Social Networks
Chen Cheng 1, Haiqin Yang 1, Irwin King 1,2 and Michael R. Lyu 1
1 Department of Computer Science and Engineering, The Chinese University of Hong Kong
2 AT&T Labs Research
ccheng@cse.cuhk.edu.hk
AAAI 2012, Toronto, Canada

Check-in becomes a lifestyle…
Foursquare now has more than 20 million users, who have made 2 billion check-ins!
Source: http://statspotting.com/2012/04/foursquare-statistics-20-million-users-2-billion-check-ins/

Graph illustration of Location-based Social Networks (LBSNs)
[Figure: users joined by friend links; each user is connected to POIs (lat, lng) by past check-ins ("checked in") and by candidate check-ins ("check in?")]
• Community detection
• Link prediction
• POI recommendation
• Next place prediction
• Travel sequence detection
• Trip recommendation

Our focus: POI recommendation
• Help users explore their surroundings
• Provide personalized travel recommendations
• Help third-party developers provide personalized services
  – Advertisements
  – Coupons
  – Traffic statistics

Challenges
• Large dataset
  – Crawled from Gowalla from Feb. 2009 to Sep. 2011
  – 4,128,714 check-ins from 53,944 users on 367,149 locations
• Only positive data is observed
• Sparsity: the density of our dataset is only 0.0208%
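
As a quick check of the sparsity figure, the density can be reproduced from the counts on this slide. This is only a sketch: it assumes that density means the number of check-ins divided by the number of user-location cells, a definition the slide does not state.

```python
# Counts are taken from the slide; the density definition is an assumption.
n_checkins = 4_128_714
n_users = 53_944
n_locations = 367_149

# Fraction of the user-by-location matrix that would be filled
# if every check-in occupied its own cell.
density = n_checkins / (n_users * n_locations)
density_percent = round(density * 100, 4)
print(f"density = {density_percent}%")
```

Under that assumption the result agrees with the 0.0208% quoted above.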

POI recommendation in LBSNs
Matrix Factorization can be a promising tool.
However… geographical influence is ignored!

POI recommendation in LBSNs
Er… a little far…

Multi-centers and normal distribution
• Two centers (home & office) in [Cho et al. 2011]
• Several centers proposed in our paper

Multi-centers and normal distribution
Similar to [Brockmann 2006, Gonzalez 2008], we assume that the check-ins around each center follow a normal distribution.

Inverse distance rule

Social influence
• On average, the overlap between a user’s check-ins and his friends’ check-ins is only about 9.6%
• 90% of users have only 20% common check-ins with their friends
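
One way to measure such an overlap is sketched below. The slide does not define the overlap, so the ratio used here (shared locations divided by the user's own distinct locations, averaged over all user-friend pairs) is an assumption, and the function names are hypothetical.

```python
def friend_overlap(user_locs, friend_locs):
    """Fraction of the user's distinct locations that the friend also visited.

    Assumed definition; the slide only reports the resulting average.
    """
    user_set, friend_set = set(user_locs), set(friend_locs)
    if not user_set:
        return 0.0
    return len(user_set & friend_set) / len(user_set)


def average_overlap(checkins, friends):
    """Mean overlap over every (user, friend) pair in the friendship lists."""
    ratios = [
        friend_overlap(checkins.get(user, ()), checkins.get(friend, ()))
        for user, friend_list in friends.items()
        for friend in friend_list
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Toy data: location ids per user and a directed friend list.
toy_checkins = {"a": {1, 2, 3, 4}, "b": {4, 5}, "c": {1, 2}}
toy_friends = {"a": ["b", "c"], "b": ["a"]}
print(average_overlap(toy_checkins, toy_friends))
```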

Our proposal
• Multi-center Gaussian Model (MGM) to capture geographical influence
• A generalized fused matrix factorization framework that includes social and geographical influences
• Thorough experiments on the large-scale Gowalla dataset

Multi-center Gaussian model
• Recall that check-in locations cluster around several centers
• The probability that a user visits a location is inversely proportional to its distance from the nearest center
• MGM is proposed to model users’ check-in behavior

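Multi-center Gaussian model
• Notation
  – C_u: multi-center set for user u
  – f_{c_u}: total check-in frequency of user u at center c_u
  – N(l | \mu_{c_u}, \Sigma_{c_u}): the pdf of the Gaussian distribution, where \mu_{c_u} and \Sigma_{c_u} denote the mean and covariance matrices of the region around center c_u
• The probability that user u visits location l, given C_u, is defined as:
  P(l | C_u) = \sum_{c_u=1}^{|C_u|} P(l \in c_u) \frac{f_{c_u}^{\alpha}}{\sum_{i \in C_u} f_i^{\alpha}} \frac{N(l | \mu_{c_u}, \Sigma_{c_u})}{\sum_{i \in C_u} N(l | \mu_i, \Sigma_i)}
  where P(l \in c_u) \propto 1 / dist(l, c_u) and \alpha damps the effect of check-in frequency.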
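
A minimal sketch of a multi-center Gaussian score is given below. It sums, over a user's centers, the product of three terms: an inverse-distance membership weight, the center's share of check-in frequency raised to an exponent alpha, and the center's share of the Gaussian density at the location. Several details are assumptions rather than the authors' implementation: the membership weights are normalized to sum to one, distance is plain Euclidean distance on the coordinates, alpha defaults to 0.2, and the function and dictionary key names are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a 2-D Gaussian N(mean, cov) evaluated at point x."""
    diff = x - mean
    inv = np.linalg.inv(cov)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ inv @ diff) / norm)


def mgm_probability(loc, centers, alpha=0.2, eps=1e-6):
    """Score a location against a user's centers.

    centers: list of dicts with keys "mean", "cov" and "freq" (hypothetical layout).
    """
    loc = np.asarray(loc, dtype=float)
    means = [np.asarray(c["mean"], dtype=float) for c in centers]
    covs = [np.asarray(c["cov"], dtype=float) for c in centers]

    # Membership: inversely proportional to distance (normalization is an assumption).
    inv_dist = np.array([1.0 / (np.linalg.norm(loc - m) + eps) for m in means])
    membership = inv_dist / inv_dist.sum()

    # Frequency share, damped by the exponent alpha.
    freq = np.array([c["freq"] for c in centers], dtype=float) ** alpha
    freq_share = freq / freq.sum()

    # Gaussian density share across centers.
    dens = np.array([gaussian_pdf(loc, m, s) for m, s in zip(means, covs)])
    if dens.sum() == 0.0:  # location too far from every center to score
        return 0.0
    dens_share = dens / dens.sum()

    return float(np.sum(membership * freq_share * dens_share))


toy_centers = [
    {"mean": (0.0, 0.0), "cov": [[0.01, 0.0], [0.0, 0.01]], "freq": 100},
    {"mean": (1.0, 1.0), "cov": [[0.01, 0.0], [0.0, 0.01]], "freq": 10},
]
p_home = mgm_probability((0.0, 0.0), toy_centers)
p_office = mgm_probability((1.0, 1.0), toy_centers)
print(p_home, p_office)
```

In this toy setup the location at the more frequently visited center receives the higher score, because the frequency share is the only term that differs between the two points.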

Multi-center discovering algorithm
A greedy clustering algorithm is proposed, motivated by the Pareto principle (the top 20 locations cover about 80% of check-ins).
[Figure: searching for centers; surviving labels: 0.2, 20]
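
A greedy center search of this kind might look like the sketch below. It is not the authors' algorithm: the slide gives no pseudocode, so the rule used here (visit locations in decreasing check-in frequency, gather unassigned locations within a radius d, and accept the region as a center when it holds more than a fraction theta of the user's check-ins) is an assumption, as are the parameter names and default values.

```python
import math


def discover_centers(checkins, d=15.0, theta=0.02):
    """Greedy center discovery for one user.

    checkins: dict mapping an (x, y) location to its visit count.
    d: region radius; theta: minimum share of check-ins (both hypothetical).
    """
    total = sum(checkins.values())
    # Most-visited locations are tried first as region seeds.
    ranked = sorted(checkins, key=checkins.get, reverse=True)
    assigned = set()
    centers = []
    for seed in ranked:
        if seed in assigned:
            continue
        members = [
            loc for loc in ranked
            if loc not in assigned and math.dist(seed, loc) <= d
        ]
        freq = sum(checkins[loc] for loc in members)
        # Keep the region only if it covers enough of the user's activity.
        if freq / total > theta:
            assigned.update(members)
            centers.append({"center": seed, "members": members, "freq": freq})
    return centers


toy = {(0, 0): 50, (1, 0): 10, (100, 100): 30, (101, 100): 5, (500, 500): 1}
found = discover_centers(toy)
print([c["center"] for c in found])
```

In the toy data the isolated location at (500, 500) holds about 1% of the check-ins, below the assumed threshold, so it does not become a center.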

Fused framework
• Traditional Matrix Factorization (MF) only models users’ preferences on locations
• MGM only models geographical influence
• We can fuse both of them:
  P_{ul} = P(F_{ul}) \cdot P(l | C_u)
  – P_{ul}: probability that user u visits location l
  – P(F_{ul}): encodes user preference, based on MF
  – P(l | C_u): calculated by MGM
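
The fusion step can be illustrated with a short sketch that multiplies a matrix-factorization preference score by a geographical probability for each candidate location and ranks the results. The mapping from latent factors to a preference score (a dot product clipped at zero) is an assumption made for illustration, and the function names are hypothetical.

```python
import numpy as np


def fused_scores(user_vec, loc_mat, geo_prob):
    """Combine MF preference with geographical probability, per location."""
    pref = np.asarray(loc_mat, dtype=float) @ np.asarray(user_vec, dtype=float)
    pref = np.clip(pref, 0.0, None)  # assumption: negative preference counts as zero
    return pref * np.asarray(geo_prob, dtype=float)


def top_n(scores, n, visited=()):
    """Indices of the n best-scoring locations, skipping already visited ones."""
    scores = np.array(scores, dtype=float)  # copy so the caller's array is untouched
    for idx in visited:
        scores[idx] = -np.inf
    order = np.argsort(-scores, kind="stable")
    return order[:n].tolist()


toy_user = [1.0, 0.0]
toy_locations = [[2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
toy_geo = [0.1, 0.9, 0.2]
toy_scores = fused_scores(toy_user, toy_locations, toy_geo)
print(top_n(toy_scores, 2))
```

Location 2 has the highest MF preference in the toy data, yet location 1 ranks first once the geographical term is applied, which is the effect the fusion is meant to capture.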

Setup and metric
• Split the dataset into 2 non-overlapping sets
  – For each user, randomly select x% of the check-ins as training data and the remaining (100 − x)% as test data, with x set to 70 and 80
  – Each split is carried out 5 times independently; we report the average
• POI recommendation
  – Return the top-N POIs for each user
  – Count how many locations in the test set are recovered
• Metric: Precision@N and Recall@N
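
The evaluation protocol can be sketched as follows: a per-user random split, then precision and recall computed over the top-N list. The exact formulas are not on the slide, so the usual definitions are assumed here (recovered locations divided by N for precision, and divided by the number of held-out locations for recall). The function names and the seed parameter are hypothetical.

```python
import random


def split_user_checkins(locations, train_fraction=0.7, seed=0):
    """Randomly split one user's locations into training and test lists."""
    rng = random.Random(seed)
    shuffled = list(locations)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def precision_recall_at_n(recommended, held_out, n):
    """Precision@N and Recall@N for one user (assumed standard definitions)."""
    hits = len(set(recommended[:n]) & set(held_out))
    precision = hits / n if n else 0.0
    recall = hits / len(held_out) if held_out else 0.0
    return precision, recall


train, test = split_user_checkins(range(10), train_fraction=0.7)
print(len(train), len(test))
print(precision_recall_at_n([5, 3, 9, 1, 7], {3, 7, 8}, 5))
```

Averaging these per-user values over all users, and then over the five independent splits, would give figures comparable to those reported in the results.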

Comparison Methods
• MGM
• PMF: [Salakhutdinov and Mnih 2007]
  – Assumes a Gaussian distribution on the observed data
  – Gaussian prior on the latent feature vectors
• PMF with Social Regularization (PMFSR): [Ma et al. 2011b]
  – Social regularization term added to PMF
• Probabilistic Factor Model (PFM): [Ma et al. 2011a]
  – Models frequency data, with a Gamma prior on the latent feature vectors and a Poisson distribution on the frequency data
• Fused MF with MGM (FMFMGM): our proposed method
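
For readers unfamiliar with the PMF baseline, the sketch below shows the common way it is fitted in practice: with Gaussian noise on the observations and Gaussian priors on the latent vectors, the MAP estimate reduces to squared error plus L2 regularization, minimized here by stochastic gradient descent. This is a generic illustration, not the configuration used in the experiments; the function names and all hyperparameter values are assumptions.

```python
import numpy as np


def mse(triples, U, V):
    """Mean squared error of the factorization on (user, item, value) triples."""
    return float(np.mean([(r - U[u] @ V[i]) ** 2 for u, i, r in triples]))


def train_pmf(triples, n_users, n_items, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit user and item factors by SGD on squared error with L2 penalties."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in triples:
            err = r - U[u] @ V[i]
            u_old = U[u].copy()  # keep the pre-update user vector for the item step
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V


# Toy rank-1 data: value = user scale * item scale.
toy_triples = [(0, 0, 1.0), (0, 1, 0.5), (1, 0, 0.8), (1, 1, 0.4)]
U0, V0 = train_pmf(toy_triples, 2, 2, epochs=0)
U1, V1 = train_pmf(toy_triples, 2, 2)
print(mse(toy_triples, U0, V0), mse(toy_triples, U1, V1))
```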

Results
[Figure: Precision and Recall with 70% and 80% of the data used for training]

User check-in distribution

Performance on different users

Conclusion
• Extract characteristics of a large dataset crawled from Gowalla
• Propose a novel Multi-center Gaussian Model (MGM) to model geographical influence
• Propose a fused MF framework which outperforms state-of-the-art methods

Future work
• To better model one-class frequency data
• To include other information: location category, activity, etc.
• To incorporate temporal effects

Thanks
Q&A
Chen Cheng
ccheng@cse.cuhk.edu.hk