Matching Users and Items Across Domains to Improve the Recommendation Quality
Chung-Yi Li, Shou-De Lin
r00922051@csie.ntu.edu.tw, sdlin@csie.ntu.edu.tw
Department of Computer Science and Information Engineering, National Taiwan University

Motivation
Lack of data is a serious concern in building a recommender system, particularly for newly established services. Can we leverage information from other domains to improve the quality of a recommender system?

Problem Definition
Given: two homogeneous rating matrices.
  • They model the same type of preference.
  • A decent portion of the users and of the items overlap.
Challenge: the mapping of users is unknown, and so is the mapping of items.
Goals:
  1. Identify the user mapping and the item mapping.
  2. Use the identified mappings to boost recommendation performance.
[Figure: target rating matrix and source rating matrix]

Why This Problem Is Challenging
When item correspondence is known, the problem is much easier: define a user similarity, and if the similarity between two users is large, they are likely to be the same user. [Narayanan 2008]
In our case, both sides are unknown, and there is no clear solution yet.

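Basic Idea
Low-rank assumption and factorization models.
[Figure: rating matrices R1 and R2 and their factorizations; the row labels (m1 to m5) and column labels (n1 to n5) appear in different orders, and the correspondence for R2 is unknown]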

A Two-Stage Model to Find the Matching
[Figure: an M1×N1 rating matrix ≈ (M1×M2 matching matrix) × (M2×N2 rating matrix) × (N2×N1 matching matrix)]
  1. Latent Space Matching, which produces a rough matching result.
  2. Matching Refinement, which produces the final matching result.

Stage 1: Latent Space Matching

How Can We Perform SVD on a Partially Observed Matrix?
[Slide equations missing from the extracted text]
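The slide's equations did not survive extraction, so the following is only a minimal sketch of one common way to obtain an SVD-like decomposition from a partially observed matrix, not necessarily the authors' procedure. It assumes a rank-K factorization R ≈ P Qᵀ is first fitted to the observed entries (here by regularized alternating least squares), and that the product P Qᵀ is then converted into orthonormal factors with a QR step. The function names `als_factorize` and `svd_from_factors`, and the parameters `reg` and `iters`, are hypothetical.

```python
import numpy as np

def als_factorize(R, mask, K, reg=1e-3, iters=50, seed=0):
    """Fit R ~ P @ Q.T on the observed entries (boolean mask) by regularized ALS."""
    rng = np.random.default_rng(seed)
    M, N = R.shape
    P = rng.normal(scale=0.1, size=(M, K))
    Q = rng.normal(scale=0.1, size=(N, K))
    ridge = reg * np.eye(K)
    for _ in range(iters):
        for i in range(M):                      # update each user factor
            Qo = Q[mask[i]]
            P[i] = np.linalg.solve(Qo.T @ Qo + ridge, Qo.T @ R[i, mask[i]])
        for j in range(N):                      # update each item factor
            Po = P[mask[:, j]]
            Q[j] = np.linalg.solve(Po.T @ Po + ridge, Po.T @ R[mask[:, j], j])
    return P, Q

def svd_from_factors(P, Q):
    """Return U, d, V with orthonormal columns such that U @ diag(d) @ V.T == P @ Q.T."""
    Qp, Rp = np.linalg.qr(P)                    # P = Qp @ Rp
    Qq, Rq = np.linalg.qr(Q)                    # Q = Qq @ Rq
    A, d, Bt = np.linalg.svd(Rp @ Rq.T)         # small K x K SVD
    return Qp @ A, d, Qq @ Bt.T
```

The QR step keeps the expensive decomposition at size K×K, so the full M×N product never has to be decomposed directly.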

Matching in Latent Space
We want to solve G from [equation missing]. Now we know how to get [equation missing]. Thus [equation missing].
Since SVD is unique, we can separate the user and item sides: [equations missing]. Both sides are the same subproblem.
S: sign matrix (K by K, diagonal, entries -1 or 1).
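The separation argument can be checked numerically. The sketch below assumes the relation is R1 = Guser · R2 · Gitemᵀ with Guser and Gitem as permutation matrices, which is a reconstruction from the surrounding slides rather than a formula recovered from this one. Under that assumption, and with distinct singular values, the top-K singular vectors satisfy U1 = Guser · U2 · S and V1 = Gitem · V2 · S for a diagonal sign matrix S. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 6, 5, 2

# A rank-K "source" matrix and a row/column-shuffled "target" copy of it.
R2 = rng.normal(size=(M, K)) @ rng.normal(size=(K, N))
G_user = np.eye(M)[rng.permutation(M)]          # assumed permutation matrix
G_item = np.eye(N)[rng.permutation(N)]          # assumed permutation matrix
R1 = G_user @ R2 @ G_item.T

def top_k_svd(R, K):
    """Top-K singular triplets, with V returned as columns."""
    U, d, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :K], d[:K], Vt[:K].T

U1, d1, V1 = top_k_svd(R1, K)
U2, d2, V2 = top_k_svd(R2, K)

# Each singular vector is determined only up to sign; recover the signs here
# using the known G_user (in the real problem G_user is what we solve for).
s = np.sign(np.sum(U1 * (G_user @ U2), axis=0))
S = np.diag(s)

print(np.allclose(U1, G_user @ U2 @ S, atol=1e-6))
print(np.allclose(V1, G_item @ V2 @ S, atol=1e-6))
```

The same sign matrix S appears on both sides because flipping the sign of a left singular vector forces the same flip on the paired right singular vector.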

Solving [equation missing]
[Figure: an M1×K matrix ≈ a 0/1 matching matrix (M1×M2) × an M2×K matrix]
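As a rough illustration of this subproblem, the sketch below assumes it has the form U1 ≈ G · U2 · S, where G is a 0/1 matching matrix and S is the diagonal sign matrix. It tries every sign pattern (feasible only for small K) and, for each one, matches every row of U1 to its nearest row of U2 · S. This brute-force, nearest-neighbor strategy is a stand-in chosen for brevity, not the solver from the talk, and it does not enforce a strict one-to-one matching. The function name `match_rows` is hypothetical.

```python
import itertools
import numpy as np

def match_rows(U1, U2):
    """Return (match, signs) where row i of U1 is matched to row match[i] of U2 * signs."""
    K = U1.shape[1]
    best = None
    for signs in itertools.product([1.0, -1.0], repeat=K):
        s = np.array(signs)
        # Pairwise squared distances between rows of U1 and sign-flipped rows of U2.
        diff = U1[:, None, :] - (U2 * s)[None, :, :]
        cost = np.sum(diff ** 2, axis=2)
        match = np.argmin(cost, axis=1)
        total = cost[np.arange(len(match)), match].sum()
        if best is None or total < best[0]:
            best = (total, match, s)
    return best[1], best[2]
```

The 0/1 matrix itself can be built as `np.eye(U2.shape[0])[match]`, so that `G @ U2` equals `U2[match]`.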

Stage 2: Matching Refinement
[Equation missing] More accurate but harder to solve.
  • Obtain a good initialization and a reduced search space from latent space matching.
  • Solve Guser and Gitem alternately.
  • The objective value always decreases and converges.
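The alternating scheme can be sketched as follows. Because the slide's objective was lost, this example assumes a squared-error objective between R1 and the rearranged R2, assumes both matrices are fully observed, and relaxes the matching so that each row or column simply picks its best counterpart rather than forming a strict one-to-one assignment. The matchings are stored as index arrays `u` and `v` instead of 0/1 matrices; all names are hypothetical.

```python
import numpy as np

def objective(R1, R2, u, v):
    """Squared error between R1 and R2 rearranged by row map u and column map v."""
    return float(np.sum((R1 - R2[np.ix_(u, v)]) ** 2))

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    return np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)

def refine(R1, R2, u, v, max_iters=20):
    """Alternate between re-solving the user map (u) and the item map (v)."""
    u, v = u.copy(), v.copy()
    history = [objective(R1, R2, u, v)]
    for _ in range(max_iters):
        # Fix v: each row of R1 picks the closest row of R2 (columns reordered by v).
        u = np.argmin(sq_dists(R1, R2[:, v]), axis=1)
        # Fix u: each column of R1 picks the closest column of R2 (rows reordered by u).
        v = np.argmin(sq_dists(R1.T, R2[u, :].T), axis=1)
        history.append(objective(R1, R2, u, v))
        if history[-1] == history[-2]:          # no further improvement
            break
    return u, v, history
```

Each half-step minimizes the objective exactly with the other map held fixed, so the recorded objective values never increase, which mirrors the convergence claim on the slide. In practice a rough matching from the latent space stage would supply the initial `u` and `v`.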

Goals
  1. Identify the user mapping and the item mapping: latent space matching gives a rough matching result, and matching refinement gives the final matching result.
  2. Then, use the identified mappings to boost recommendation performance.

Transferring Imperfect Matching to Predict Ratings
Matched latent factors are constrained to be similar.
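One way to read "constrained to be similar" is as a soft penalty that ties the latent factors of matched users and matched items together across the two domains. The sketch below writes such an objective for two factorization models. The exact form used in the talk is not recoverable from the slide, so the penalty, the weights `reg` and `beta`, and the assumption that every user and item in the first domain has a matched counterpart are all illustrative.

```python
import numpy as np

def coupled_loss(R1, mask1, R2, mask2, P1, Q1, P2, Q2,
                 user_match, item_match, reg=0.1, beta=1.0):
    """Joint factorization loss with a similarity penalty on matched factors.

    user_match[i] is the index in domain 2 matched to user i of domain 1;
    item_match[j] plays the same role for items.
    """
    # Squared error on the observed entries of each rating matrix.
    fit1 = np.sum(((R1 - P1 @ Q1.T) * mask1) ** 2)
    fit2 = np.sum(((R2 - P2 @ Q2.T) * mask2) ** 2)
    # Standard L2 regularization on all factors.
    l2 = reg * sum(np.sum(X ** 2) for X in (P1, Q1, P2, Q2))
    # Soft constraint: matched latent factors should be close to each other.
    tie_users = np.sum((P1 - P2[user_match]) ** 2)
    tie_items = np.sum((Q1 - Q2[item_match]) ** 2)
    return float(fit1 + fit2 + l2 + beta * (tie_users + tie_items))
```

Using a soft penalty rather than forcing matched factors to be identical leaves room for wrong matches: a mismatched pair costs some penalty but can still be overridden by the rating data.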

Experiment Setup
  • Yahoo! Music Dataset
[Figure: users × items rating matrix showing the training sets of R1 and R2 under five splits: Disjoint Split, Overlap Split, Contained Split, Subset Split, Partial Split]

Accuracy and Mean Average Precision: the higher, the better

Rating Prediction (Root Mean Square Error)
RMSE: the lower, the better
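For reference, RMSE over a set of held-out ratings is the square root of the mean squared difference between predicted and true ratings. A minimal sketch, with a hypothetical helper name `rmse` and made-up example values:

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean square error between two equally sized collections of ratings."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Errors of 1 and 2 give sqrt((1 + 4) / 2) = sqrt(2.5).
print(rmse([1, 2], [2, 4]))
```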

(root mean square error)

Conclusion
  • It is possible to identify user or item correspondence in an unsupervised manner, based on homogeneous rating data.
  • Even with imperfect matching, our model can still improve the recommendation accuracy.
Questions?