Introduction to Recommender System
Guo, Guangming
guogg.good@gmail.com

Outline
• Background & Definition
• Some history worth noting
• Various applications
• Main-stream approach
• Evaluation
• Some resources
2012-12-19, Lab of Semantic Computing and Data Mining

Outline
• Background & Definition
  – Related areas
  – Challenges
  – Paradigms
• Some history worth noting
• Various applications
• Main-stream approach
• Evaluation
• Some resources

Get clear on the basic concepts
• First step of learning
• Building blocks of new ideas
• Define the rules to play by
• Prerequisites for communication

Definition of Recommender Systems
• Also named recommendation systems
• A subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered, using a model built from the characteristics of an item (content-based approaches) or the user's social environment (collaborative filtering approaches). --http://en.wikipedia.org/wiki/Recommender

More truth
• Important vertical technique in data mining
• One of the most successful solutions in industry
• Became an independent research area in the 1990s
  – Many highly regarded academic conferences, such as SIGIR, KDD, ICML, WWW, and EMNLP, list it as a subtopic.
  – RecSys is fully devoted to this area
• Data mining/machine learning approach
  – 1) specifying heuristics that define the utility function and empirically validating its performance
  – 2) estimating the utility function that optimizes a certain performance criterion, such as the mean squared error.

Challenges
• Cold start
• Long tail
• Data sparsity
• Scalability
• Social & Temporal
• Context-aware
• Personality-aware
• Being accurate is not enough

Related Research Areas
• Cognitive science
• Text mining
• Natural Language Processing
• Information retrieval
• Machine learning
• Association mining
• Approximation theory
• Management science
• Consumer choice in marketing

Paradigms of RecSys
• Content-based recommendations:
  – recommend items similar to the ones the user preferred in the past;
• Collaborative recommendations:
  – recommend items that people with similar tastes and preferences liked in the past;
• Knowledge-based recommendations:
  – recommend items based on existing knowledge models that fit the needs of users
• Hybrid approaches:
  – combine various input data and/or compose various mechanisms

Background
• Universal problem in the information age
  – Information overload
  – From search engines (SE) to RecSys
  – Pull vs. push
  – Web 1.0 vs. Web 2.0
• Leverage the existing user-generated data
  – User profile
  – Behavior history on the web, ratings
  – Click-through data, browsing data
• Great benefits (win-win)
  – Help users find valuable information
  – Help businesses make more profit

Outline
• Background & Definition
• Some history worth noting
  – Netflix Prize
• Various applications
• Main-stream approach
• Evaluation
• Some resources

A peak in the history
• Research on collaborative filtering algorithms reached a peak during the Netflix movie recommendation competition
• October 2, 2006 to September 21, 2009
• Metric: RMSE
  – Must outperform the baseline by 10%
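
The RMSE criterion above can be sketched in a few lines. This is a minimal illustration, not the contest's scoring code: the function names and the toy ratings are invented here, and only the "improve on the baseline RMSE by 10%" rule comes from the slide.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))

def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate lowers RMSE relative to the baseline."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

# Toy ratings, invented for illustration only.
actual = [4, 3, 5, 2]
baseline = [3, 3, 3, 3]      # hypothetical baseline: always predict 3
candidate = [4, 3, 4, 2]     # hypothetical improved predictions

baseline_rmse = rmse(baseline, actual)
candidate_rmse = rmse(candidate, actual)
beats_baseline_by_10_percent = (
    relative_improvement(baseline_rmse, candidate_rmse) >= 0.10
)
```

Lower RMSE is better, so the improvement is measured as the relative drop from the baseline value.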

The Million Dollar Programming Prize
• The Netflix Prize
  – Greatly energized research in RecSys
  – Lasted from 2006 to 2009
• Winner: BellKor's Pragmatic Chaos team
  – A joint team
  – Andreas Töscher and Michael Jahrer (Commendo Research & Consulting GmbH), originally team BigChaos
  – Robert Bell and Chris Volinsky (AT&T), Yehuda Koren (Yahoo), originally team BellKor
  – Martin Piotte and Martin Chabbert, originally team Pragmatic Theory
• The Ensemble team
  – The most accurate algorithm in 2007 used an ensemble method of 107 different algorithmic approaches

Existing applications
• News/article recommendation
• Targeted advertisement
• Tag recommendation
• Mobile recommendation
• E-commerce
  – Books, movies, music…

Benefits
• Alternative to search engines
• Boost profit
  – Amazon et al.
• Better user experience

Outline
• Background & Definition
• Some history worth noting
• Various applications
• Main-stream approach
  – Content-based
  – Collaborative filtering
• Evaluation
• Some resources

Content-based
• Simply compute the similarity
  – Cosine similarity or Pearson correlation coefficient
  – TF-IDF
• Utilize dimensionality reduction
  – LDA
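
A minimal sketch of the "TF-IDF plus cosine similarity" idea, using only the standard library. The item names, the keyword descriptions, and the helper functions are all hypothetical; a real system would more likely use a library vectorizer and a much larger vocabulary.

```python
import math
from collections import Counter

def tf_idf_vectors(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical item descriptions, already tokenized.
items = {
    "space_movie": "space alien ship battle".split(),
    "space_sequel": "space alien planet war".split(),
    "romance": "love wedding paris".split(),
}
names = list(items)
vectors = dict(zip(names, tf_idf_vectors([items[n] for n in names])))

def most_similar(liked, vectors):
    """Recommend the item whose description is closest to a liked item."""
    candidates = [n for n in vectors if n != liked]
    return max(candidates, key=lambda n: cosine(vectors[liked], vectors[n]))

recommended = most_similar("space_movie", vectors)
```

A user who liked "space_movie" is pointed to the item that shares its descriptive terms; items with no terms in common score zero.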

Collaborative filtering
• Association mining
• Memory-based
  – Nearest neighbors
• Model-based
  – Latent factor model
• Some comparison
  – Space & time
  – Theoretical foundation and interpretability
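
The memory-based (nearest-neighbor) branch could look roughly like the user-based sketch below. The rating data, the choice of cosine similarity over co-rated items, and k=2 are assumptions made for illustration; the slides do not specify them.

```python
import math

# Hypothetical user -> {item: rating} data on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 5, "B": 5, "C": 1, "D": 5},
    "carol": {"A": 1, "B": 2, "C": 5, "D": 1},
    "dave":  {"A": 4, "B": 4, "D": 4},
}

def cosine_on_shared(r_u, r_v):
    """Cosine similarity restricted to items both users have rated."""
    shared = set(r_u) & set(r_v)
    if not shared:
        return 0.0
    dot = sum(r_u[i] * r_v[i] for i in shared)
    norm_u = math.sqrt(sum(r_u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(r_v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict(user, item, ratings, k=2):
    """Similarity-weighted average of the k most similar users' ratings."""
    neighbors = [
        (cosine_on_shared(ratings[user], r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbors[:k]
    weight_sum = sum(sim for sim, _ in top)
    if weight_sum == 0:
        return None  # no usable neighbor has rated this item
    return sum(sim * rating for sim, rating in top) / weight_sum

prediction = predict("alice", "D", ratings, k=2)
```

Nothing is trained here: every prediction scans the stored ratings, which is the space and time trade-off against model-based methods mentioned in the comparison bullet.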

Latent factor model
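
The body of this slide did not survive extraction. As a point of reference only, a commonly used latent factor formulation (regularized matrix factorization) is given below; it is an assumption that the original slide showed something similar. Here $p_u$ and $q_i$ are $k$-dimensional factor vectors for user $u$ and item $i$, $\mathcal{K}$ is the set of observed (user, item) pairs, and $\lambda$ is a regularization weight.

```latex
\hat{r}_{ui} = p_u^{\top} q_i

\min_{P,\,Q} \sum_{(u,i) \in \mathcal{K}}
  \left( r_{ui} - p_u^{\top} q_i \right)^2
  + \lambda \left( \lVert p_u \rVert^2 + \lVert q_i \rVert^2 \right)
```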

Computations
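
The body of this slide is also missing. One common way to fit a latent factor model is stochastic gradient descent on the regularized squared error; the sketch below assumes that approach, and every name, hyperparameter, and rating in it is hypothetical rather than taken from the slides.

```python
import random

def train_mf(ratings, n_users, n_items, n_factors=2, lr=0.05, reg=0.02,
             n_epochs=500, seed=0):
    """Fit user factors P and item factors Q by plain SGD.

    ratings is a list of (user_index, item_index, rating) triples.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
         for _ in range(n_items)]
    for _ in range(n_epochs):
        # Fixed visiting order keeps the sketch deterministic;
        # practical implementations usually shuffle each epoch.
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(n_factors):
                p_uf, q_if = P[u][f], Q[i][f]
                # Gradient step on squared error plus L2 penalty.
                P[u][f] += lr * (err * q_if - reg * p_uf)
                Q[i][f] += lr * (err * p_uf - reg * q_if)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def mse(ratings, P, Q):
    """Mean squared error over the given rating triples."""
    return sum((r - predict(P, Q, u, i)) ** 2
               for u, i, r in ratings) / len(ratings)

# Toy data: 3 users x 3 items, one entry unobserved.
ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1),
           (1, 0, 4), (1, 1, 5), (1, 2, 1),
           (2, 0, 1), (2, 2, 5)]

P0, Q0 = train_mf(ratings, 3, 3, n_epochs=0)   # untrained factors
P, Q = train_mf(ratings, 3, 3)
initial_mse = mse(ratings, P0, Q0)
final_mse = mse(ratings, P, Q)
missing_entry = predict(P, Q, 2, 1)            # the unobserved rating
```

After training, `predict` can score pairs that were never observed, which is how the model produces recommendations.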

Evaluation Criteria
• User satisfaction by questionnaire
• Accuracy
  – RMSE
  – Top-k
• Coverage
• Diversity
• Novelty
• Serendipity
  – Recommendations that at first seem to make no sense
• …
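
Two of the criteria above, top-k accuracy and coverage, are easy to make concrete. The sketch below uses precision@k and catalog coverage as one possible reading of those bullets; the function names and the sample data are invented for illustration.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the first k recommended items that the user found relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

def catalog_coverage(recommendation_lists, catalog):
    """Share of the catalog that appears in at least one recommendation list."""
    recommended_items = set()
    for items in recommendation_lists:
        recommended_items.update(items)
    return len(recommended_items & set(catalog)) / len(catalog)

# Hypothetical ranked recommendations and held-out relevant items.
recommendations = {"u1": ["a", "b", "c"], "u2": ["a", "d", "b"]}
relevant = {"u1": {"a", "c"}, "u2": {"b"}}
catalog = ["a", "b", "c", "d", "e"]

p_u1 = precision_at_k(recommendations["u1"], relevant["u1"], k=3)
p_u2 = precision_at_k(recommendations["u2"], relevant["u2"], k=3)
coverage = catalog_coverage(recommendations.values(), catalog)
```

A system can score well on precision while recommending the same few popular items to everyone, which is why coverage, diversity, and novelty are listed as separate criteria.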

Xiang Liang (项亮), Hulu

Resources
• www.recsyswiki.com
• Collected materials on the major recommendation engines (各大推荐引擎资料汇总), by 大魁
  – http://blog.csdn.net/lzt1983/article/details/7914536