Author(s): Rahul Sami, 2009. License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Noncommercial Share Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/. We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact open.michigan@umich.edu with any questions, corrections, or clarification regarding the use of content. For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.

Citation Key (for more information see http://open.umich.edu/wiki/CitationPolicy)

Use + Share + Adapt { Content the copyright holder, author, or law permits you to use, share and adapt. }
• Public Domain – Government: Works that are produced by the U.S. Government. (USC 17 § 105)
• Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
• Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
• Creative Commons – Zero Waiver
• Creative Commons – Attribution License
• Creative Commons – Attribution Share Alike License
• Creative Commons – Attribution Noncommercial Share Alike License
• GNU – Free Documentation License

Make Your Own Assessment { Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. }
• Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (USC 17 § 102(b)) *laws in your jurisdiction may differ

{ Content Open.Michigan has used under a Fair Use determination. }
• Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act. (USC 17 § 107) *laws in your jurisdiction may differ

Our determination DOES NOT mean that all uses of this 3rd-party content are Fair Uses, and we DO NOT guarantee that your use of the content is Fair. To use this content you should do your own independent analysis to determine whether or not your use will be Fair.

Lecture 12: Explanations; Scalable Implementation; Manipulation
SI 583: Recommender Systems
School of Information, University of Michigan (si.umich.edu)

Slide 4: Explanations in recommender systems
• Moving away from the black-box oracle model
• justify why a certain item is recommended
• maybe also converse to reach a recommendation

Slide 5: Amazon.com
[Screenshot: an explained recommendation on Amazon.com]

Slide 6: Amazon explanations (contd.)
[Screenshot from Amazon.com]

Slide 7: Why have explanations? [Tintarev & Masthoff]
• Transparency
• "Scrutability": correct errors in the learnt preference model
• Trust/confidence in the system
• Effectiveness & efficiency (speed)
• Satisfaction/enjoyment

Slide 8: Example: explanations for transparency and confidence
• "Movie X was recommended to you because it is similar to movies Y and Z, which you recently watched"
• "Movie X was recommended to you because you liked other comedies"
• "Other users who bought book X also bought book Y"

Slide 9: Generating explanations
• Essentially, explain the steps of the CF algorithm, picking the most prominent "neighbors" (see the sketch below):
  – user-user
  – item-item
• Harder to do for SVD and other abstract model-fitting recommender algorithms
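A minimal sketch of how such an explanation could be generated for an item-item recommender, by naming the user's rated items most similar to the recommended one. The data structures (similarity, user_ratings) are hypothetical illustrations, not from the lecture:

```python
# Hypothetical inputs:
#   similarity[i][j] -- item-item similarity scores from the CF model
#   user_ratings     -- {item: rating} for the target user

def explain_recommendation(rec_item, user_ratings, similarity, top_k=3):
    # Rank the user's rated items by similarity to the recommended item;
    # the top few are the "most prominent neighbors" behind the prediction.
    neighbors = sorted(
        user_ratings,
        key=lambda item: similarity[rec_item].get(item, 0.0),
        reverse=True,
    )[:top_k]
    return f"{rec_item} was recommended because you rated: {', '.join(neighbors)}"

similarity = {"Movie X": {"Movie Y": 0.9, "Movie Z": 0.8, "Movie W": 0.1}}
user_ratings = {"Movie Y": 5, "Movie Z": 4, "Movie W": 2}
print(explain_recommendation("Movie X", user_ratings, similarity))
# Movie X was recommended because you rated: Movie Y, Movie Z, Movie W
```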

Slide 10: Conversational recommenders
Example transcript (from [McSherry, "Explanation in Recommender Systems," AI Review 2005]):
  Top Case: Please enter your query.
  User: type = wandering, month = aug
  Top Case: The target case is "aug, tyrol, ..."; other competing cases include "..."
  Top Case: What is the preferred location?
  User: why?
  Top Case: It will help eliminate ... alternatives.
  User: alps
  ...

Slide 11: Conversational recommenders
• One view: CF using some navigational data as well as ratings
• More structured approach: incremental collaborative filtering
  – the similarity metric changes as the query is refined
  – e.g., the incremental nearest-neighbor algorithm [McSherry, AI Review 2005]; see the sketch below
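A sketch of the incremental flavor (my own illustration, not McSherry's actual algorithm): similarity is computed over only the attributes the user has specified so far, so each answer in the conversation re-ranks the candidate cases:

```python
def similarity(case, query):
    # Fraction of the specified attributes the case matches;
    # the metric changes as the query grows.
    if not query:
        return 0.0
    return sum(case.get(k) == v for k, v in query.items()) / len(query)

def rank(cases, query):
    return sorted(cases, key=lambda c: similarity(c, query), reverse=True)

cases = [  # toy case base, loosely echoing the transcript above
    {"name": "aug-tyrol", "type": "wandering", "month": "aug", "location": "alps"},
    {"name": "jul-coast", "type": "bathing", "month": "jul", "location": "coast"},
]
query = {"type": "wandering", "month": "aug"}   # initial query
top = rank(cases, query)[0]                     # target case: aug-tyrol
query["location"] = "alps"                      # an answer refines the ranking
top = rank(cases, query)[0]
```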

Slide 12: Scalable implementations
• Learning objectives:
  – see some techniques that are used for large-scale recommenders
  – know where to start looking for more information

Slide 13: Google News Personalization
[Das et al., WWW '07] describe the algorithm and architecture.
• Specific challenges: news
  – relevant items change frequently
  – users are long-lived, but there are often new users
  – very fast response times needed
• Specific challenges: Google
  – scale! many items, many users
  – need to parallelize complex computations

Slide 14: Algorithms
• Input data: clicks, e.g., "user J clicked on article X"
• Use a combination of three recommender algorithms:
  – user-user (with a simple similarity measure)
  – SVD ("PLSI")
  – item-item (mainly for new users; simple covisitation similarity measure, sketched below)
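A sketch of the covisitation idea, as a simplified illustration (the paper's actual measure differs in details such as recency handling): two articles are similar if many users clicked on both, which requires no rating history and so works for brand-new users and items:

```python
from collections import defaultdict
from itertools import combinations

def covisitation_counts(click_sessions):
    # Count, for each pair of articles, how many sessions clicked both.
    counts = defaultdict(int)
    for clicked in click_sessions:
        for a, b in combinations(sorted(set(clicked)), 2):
            counts[a, b] += 1
            counts[b, a] += 1
    return counts

sessions = [{"x", "y", "z"}, {"x", "y"}, {"y", "z"}]   # toy click data
counts = covisitation_counts(sessions)
# For a user who just clicked "x", recommend the most covisited articles:
recs = sorted((b for a, b in counts if a == "x"),
              key=lambda b: counts["x", b], reverse=True)
print(recs)   # ['y', 'z']
```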

Slide 15: Tricks/approximations for scalable computing
• User-user: calculate the weighted average over only a cluster of users
  – J and K are in the same cluster if they have a high fraction of overlapping clicks
  – clustering is precomputed offline (using a fast MinHash algorithm; see the sketch below)
• SVD: precompute user-side weights; update only item-side weights in real time
  – gives an approximate SVD
• Tweak offline algorithms for parallel computing on Google's MapReduce infrastructure
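A sketch of MinHash clustering on toy data (the real system uses carefully chosen hash functions and runs offline over MapReduce): users whose click sets have high Jaccard overlap get the same cluster key with high probability:

```python
import random
from collections import defaultdict

random.seed(0)
SALTS = [random.getrandbits(32) for _ in range(3)]  # 3 min-hashes per user

def minhash_key(click_set):
    # The minimum hash of a set matches for two sets with probability equal
    # to their Jaccard similarity; concatenating several hashes sharpens it.
    return tuple(min(hash((salt, item)) for item in click_set) for salt in SALTS)

user_clicks = {                     # toy data: user -> clicked article ids
    "J": {"x", "y", "z"},
    "K": {"x", "y", "z"},
    "L": {"p", "q"},
}
clusters = defaultdict(list)
for user, clicks in user_clicks.items():
    clusters[minhash_key(clicks)].append(user)
# J and K share a key; user-user weighted averages then run within each cluster.
```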

Slide 16: Architecture
[Figure: system architecture diagram, from Das et al.]

Slide 17: Experiences with the Netflix Prize challenge
• Difference: static dataset
• My "architecture" (such as it was):
  – a clustered user-user algorithm
    • randomly chosen clusters (not optimal; see the sketch below)
    • cluster size chosen to fit the user-user calculation in 1 GB of memory
  – preprocess, create indices (Perl scripts)
  – calculate similarities (in C) {memory bottleneck}
  – generate predictions (Perl)
  – evaluate accuracy on the test set (Perl)
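A sketch of the random clustering step under that memory constraint (the cluster size below is illustrative, not the one actually used): users are shuffled into fixed-size clusters so each user-user similarity computation stays within budget:

```python
import random

def random_clusters(user_ids, cluster_size):
    # Random assignment, not similarity-based (hence "not optimal"),
    # but it bounds the size of each user-user similarity matrix.
    ids = list(user_ids)
    random.shuffle(ids)
    return [ids[i:i + cluster_size] for i in range(0, len(ids), cluster_size)]

clusters = random_clusters(range(480_000), cluster_size=20_000)  # illustrative sizes
```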

Slide 18: Manipulation...

Slide 19: Why manipulate a recommender?
• Examples?

Slide 20: Why manipulate a recommender?
• Examples?
  – Digg/Slashdot: get an article read
  – PageRank: get your site high on the results page
  – books: an author wants his book recommended
  – spam
• How?

Slide 21: Example: user-user algorithm
[Table: a user × item rating matrix with users Joe, Sue, John, ... and items A, B, C, ..., X; Joe's rating for item X is the "?" to be predicted]
• Rater i's informativeness score = correlation coefficient of i's past ratings with Joe's past ratings
• Prediction for item X = average of the ratings of X, weighted by the raters' scores (sketched below)
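A sketch of the slide's scheme with made-up numbers (the table's exact values are not fully recoverable from the source): each rater's weight is the Pearson correlation of their past ratings with Joe's, and the prediction for X is the weight-averaged rating:

```python
from statistics import correlation   # Pearson correlation (Python 3.10+)

joe_past = [7, 4, 4]                 # Joe's ratings on shared past items

# Each rater: (past ratings on the same items, their rating of item X)
raters = {
    "Sue":  ([7, 5, 6], 5),
    "John": ([2, 1, 1], 2),
}

# Informativeness score = correlation with Joe's past ratings.
weights = {name: correlation(past, joe_past) for name, (past, _) in raters.items()}

# Prediction for X = score-weighted average of the raters' ratings of X.
# (Negative correlations would need extra handling, e.g., clamping to 0.)
prediction = (sum(weights[n] * x for n, (_, x) in raters.items())
              / sum(weights.values()))
print(round(prediction, 2))
```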

Slide 22: Cloning attack: strategic copying
• An attacker may copy past ratings to look informative and gain influence:
    Joe:      7  4  4  2  5  ?
    FreeMeds: 7  4  4  2  5  10
• Even if ratings are not directly visible, the attacker may be able to infer something about them from her own recommendations and publicly available statistics
• Worse if many accounts can be created (sybil attack)
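In code, the attack on the scheme above is trivially cheap (names and numbers follow the slide's illustration; item labels are assumed):

```python
joe = {"A": 7, "B": 4, "C": 4, "D": 2, "E": 5}   # Joe's visible past ratings

# The attacker copies Joe's profile, guaranteeing a perfect correlation
# score with Joe, then rates the spam item high to drag Joe's prediction up.
freemeds = dict(joe)
freemeds["X"] = 10
```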

Slide 23: One approach: profile analysis
• This problem of "shilling attacks" has been noted earlier [Lam and Riedl] [O'Mahony et al.]
• Many papers on empirical measurement and statistical detection of attack profiles
• Problem: attackers may get better at disguising their profiles

Slide 24: Results we cannot achieve
• Prevent any person J from manipulating the prediction on a single item X
  – cannot distinguish deliberate manipulation from different tastes on item X
• "Fairness", i.e., two raters with identical information getting exactly the same influence, regardless of rating order
  – cannot distinguish a second rater with identical information from an informationless clone

Slide 25: The influence limiter: key ideas [Resnick and Sami, Proceedings of the RecSys '07 conference]
• Limit influence until a rater demonstrates informativeness
• Informative only if you're the first to provide the information

Slides 26-28: Predictions on an item: a dynamic view
[Figure: a 0-to-1 scale of the predicted probability of HIGH; the recommender algorithm shifts the prediction along the scale as each rating arrives]

Slide 29: Predictions on an item: a dynamic view
[Figure: the same scale, now also marking the eventual label by the target: HIGH]

Slide 30: Predictions on an item: a dynamic view
[Figure: the same scale, with the movement of the prediction toward the eventual label marked as the rating's contribution]

Slide 31: Our approach
• Information-theoretic measure of contribution and damage
• Limit the influence a rater can have based on past contribution
• This limits the net damage an attacker can cause
[Figure: the dynamic-view scale again, with the predicted probability, the eventual label, and a rating's contribution marked]

Slide 32: Our model
• Binary rating system (HIGH/LOW)
• Recommendations for a single target person
• Any recommender algorithm
• Powerful attackers:
  – can create up to n sybil identities
  – can "clone" existing rating profiles
• No assumptions on non-attackers:
  – the attacker's sybils may form a majority
  – do not depend on honest raters countering attacks

Slide 33: Overview of results
The "influence limiter" algorithm can be overlaid on any recommender algorithm to satisfy (with caveats):
• Limited damage: an attacker with up to n sybils can never cause net total damage greater than O(1) units of prediction error
• Bounded information loss: in expectation, O(log n) units of information are discarded from each genuine rater in total

Slide 34: Influence limiter: architecture
[Figure: ratings feed the recommender algorithm, which produces raw predictions q_0, q_1, ..., q_n; the influence limiter converts these into limited predictions q~_0, q~_1, ..., q~_n shown to the target; a scoring component compares predictions against the target's rating to update each rater j's reputation R_j]

Slide 35: Influence limiter algorithm: illustration
[Figure: two 0-to-1 scales; the influence limiter's limited prediction q~_{j-1} on one, the recommender algorithm's raw prediction q_{j-1} on the other; ratings feed the recommender]

Slide 36: Influence limiter algorithm: illustration
A rater with R = 0.25 puts in a rating.
[Figure: the raw prediction moves from q_{j-1} to q_j; the limited prediction is still at q~_{j-1}]

Slide 37: Influence limiter algorithm: illustration
A rater with R = 0.25 puts in a rating.
[Figure: the raw prediction moves from q_{j-1} to q_j, while the limited prediction moves from q~_{j-1} to q~_j, only a reputation-limited fraction of the way]
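A sketch consistent with these figures (the update and scoring rules below are illustrative; the exact algorithm is in Resnick and Sami's RecSys '07 paper): a rater with reputation R moves the displayed prediction only a fraction min(1, R) of the way toward the new raw prediction, and the information-theoretic contribution can be read off a logarithmic score once the target's label is known:

```python
import math

def limited_prediction(q_tilde_prev, q_raw_new, reputation):
    # Move only a reputation-limited fraction of the way toward the
    # recommender's new raw prediction; low-reputation raters barely
    # shift what the target sees.
    return q_tilde_prev + min(1.0, reputation) * (q_raw_new - q_tilde_prev)

def log_score_gain(q_new, q_prev, label_is_high):
    # Positive when the rating moved the prediction toward the eventual
    # label (a contribution), negative when it moved away (damage).
    p_new, p_prev = (q_new, q_prev) if label_is_high else (1 - q_new, 1 - q_prev)
    return math.log2(p_new / p_prev)

q_tilde = limited_prediction(0.5, 0.9, reputation=0.25)   # -> 0.6
gain = log_score_gain(0.9, 0.5, label_is_high=True)       # ~ +0.85 bits
```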