Factorization Machine I’m Jerry

Factorization Machine Factorization Methods

Factorization Machine Support Vector Machine

Factorization Model: User Features, Item Features, Ratings

Support Vector Machine (SVM)
  • $D = \{(x_i, y_i) \mid x_i \in \mathbb{R}^P,\ y_i \in \{-1, 1\}\}$, $i = 1, \dots, n$
  • Line: $y(x) = w \cdot x + b = 0$
  • For all $y_i = 1$: $y(x_i) = w \cdot x_i + b \ge 1$
  • For all $y_i = -1$: $y(x_i) = w \cdot x_i + b \le -1$
  • Minimize $\|w\|$
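
The two margin constraints above combine into the single condition $y_i (w \cdot x_i + b) \ge 1$. A minimal sketch that checks it; the helper names and the toy data are made up for illustration and are not from the slides:

```python
def decision(w, b, x):
    """Linear decision function y(x) = w . x + b."""
    return sum(wk * xk for wk, xk in zip(w, x)) + b

def satisfies_margin(w, b, data):
    """True if every (x, y) with y in {-1, 1} satisfies y * (w . x + b) >= 1."""
    return all(y * decision(w, b, x) >= 1 for x, y in data)

# Toy, made-up data: positive points above the line x1 + x2 = 0, negative below.
data = [((2.0, 1.0), 1), ((1.0, 1.0), 1), ((-1.0, -1.0), -1), ((-2.0, -0.5), -1)]
w, b = (0.5, 0.5), 0.0

print(satisfies_margin(w, b, data))
```

Shrinking w (for example to (0.25, 0.25)) lowers |w| but breaks the constraint for the points closest to the line, which is the trade-off the minimization resolves.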

Support Vector Machine (SVM)

Recommender Group Y U NO USE SVM?

“Y U NO USE SVM?”
  • Real value vs. classification
  • Sparsity

With one-hot user and item indicators as the input x, the linear model reduces to a user bias plus an item bias:
$y(x) = w \cdot x + b = w_u + w_i + b$
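
A small sketch of why the linear model collapses to two biases, assuming the input is a one-hot user indicator concatenated with a one-hot item indicator. The sizes, weights, and helper names are made-up illustration values:

```python
def one_hot(index, size):
    """Vector of length `size` with a single 1.0 at position `index`."""
    return [1.0 if k == index else 0.0 for k in range(size)]

def linear(w, b, x):
    """Linear model y(x) = w . x + b."""
    return sum(wk * xk for wk, xk in zip(w, x)) + b

n_users, n_items = 3, 4
# First n_users weights belong to users, the remaining n_items to items (made up).
w = [0.1, -0.2, 0.3, 0.5, 0.0, -0.4, 0.2]
b = 3.5

u, i = 1, 2                                  # one (user, item) pair
x = one_hot(u, n_users) + one_hot(i, n_items)
pred = linear(w, b, x)

# Only two entries of x are non-zero, so the dot product picks out w_u and w_i.
print(pred, w[u] + w[n_users + i] + b)
```

No term in this model depends on the user and the item jointly, so it cannot express that a particular user likes a particular item.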

Actually We Do Use SVM On Ensemble

Ensemble models
  • Inputs: User, Item
  • Model 1, Model 2, and Model 3 each map an input x to a prediction y
  • Model 1 + Model 2 + Model 3 = combined prediction

  • Train: predictions on train set + train set answer → SVM → model weights
  • Test: predictions on test set + model weights → final prediction
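
The blending pipeline above can be sketched in a few lines. Note the assumption: plain least-squares gradient descent stands in here for the SVM used in the slides, only to keep the example dependency-free. All numbers and function names are made up:

```python
def blend(weights, preds):
    """Weighted sum of the individual model predictions for one example."""
    return sum(w * p for w, p in zip(weights, preds))

def mse(weights, rows, answers):
    """Mean squared error of the blended prediction."""
    return sum((blend(weights, r) - a) ** 2 for r, a in zip(rows, answers)) / len(rows)

def fit_weights(rows, answers, lr=0.005, steps=2000):
    """Learn one weight per model by batch gradient descent on the MSE."""
    n, m = len(rows), len(rows[0])
    weights = [1.0 / m] * m                  # start from a plain average
    for _ in range(steps):
        grad = [0.0] * m
        for r, a in zip(rows, answers):
            err = blend(weights, r) - a
            for k in range(m):
                grad[k] += 2.0 * err * r[k] / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Each row: predictions of Model 1, Model 2, Model 3 for one train example.
train_preds = [[3.0, 4.0, 3.5], [4.5, 4.0, 5.0], [2.0, 1.5, 2.5],
               [3.5, 3.0, 2.5], [1.0, 2.0, 1.5]]
train_answers = [3.0, 5.0, 2.0, 4.0, 1.0]
test_preds = [[4.0, 3.5, 4.5], [2.0, 2.5, 1.5]]

weights = fit_weights(train_preds, train_answers)
final_predictions = [blend(weights, row) for row in test_preds]
print(weights, final_predictions)
```

The learned weights play the role of the "model weights" in the diagram: they are fitted on train-set predictions and then reused unchanged on test-set predictions.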

SVM Calculates “weight” of features

Factorization Machine
  • Original SVM: $y(x) = w \cdot x + b = b + \sum_i w_i x_i$
  • Factorization Machine: $y(x) = b + \sum_i w_i x_i + \sum_{i=0} \sum_{j=i+1} (v_i \cdot v_j)\, x_i x_j$
  • The double sum models the interaction between variables.
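
A direct, unoptimized transcription of the Factorization Machine formula above. In this sketch `V[i]` holds the factor vector $v_i$ of feature i; the function name and the numbers are illustrative assumptions:

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def fm_predict(b, w, V, x):
    """y(x) = b + sum_i w_i x_i + sum_i sum_{j>i} (v_i . v_j) x_i x_j."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(dot(V[i], V[j]) * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
    return b + linear + pairwise

b = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]   # one 2-dimensional factor vector per feature
x = [1.0, 2.0, 0.0]

result = fm_predict(b, w, V, x)
print(result)
```

Here only features 0 and 1 are active, so the single surviving interaction term is $(v_0 \cdot v_1)\, x_0 x_1 = 0.5 \cdot 2 = 1$, added to the bias 0.5 and the linear part 0.5.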

What is $(v_i \cdot v_j)$?
  • The pairwise interaction weights form a matrix $W$, analogous to the CF matrix.
  • Factorization: $W = V^T V$, where $V$ has $k$ rows, so each entry of $W$ is a dot product of two $k$-dimensional factor vectors, e.g. $v_{TI} \cdot v_{A}$.
  • Machine: $y(x) = b + \sum_i w_i x_i + \sum_{i=0} \sum_{j=i+1} (v_i \cdot v_j)\, x_i x_j$
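
To make $W = V^T V$ concrete, the sketch below builds W from a small V stored as k rows of n entries, so that column i of V is the factor vector $v_i$. The helper names and values are assumptions for illustration:

```python
def gram(V):
    """Return W = V^T V for V given as k rows of n entries each."""
    n = len(V[0])
    return [[sum(row[i] * row[j] for row in V) for j in range(n)]
            for i in range(n)]

def parameter_counts(n, k):
    """(free pairwise weights without factorization, entries of V)."""
    return n * (n - 1) // 2, k * n

V = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 2.0]]          # k = 2 factors, n = 3 features
W = gram(V)

# W[i][j] is the dot product of columns i and j of V, i.e. v_i . v_j.
print(W[0][1], W[1][2], W[0][2])
print(parameter_counts(1000, 10))
```

With many features and a small k, the factorized form needs far fewer parameters than one independent weight per feature pair, and every pair gets a weight even if it never co-occurs in the training data.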

Factorization Machine

FM vs. SVM
  • SVM fails with sparsity
  • FM learns with SGD; SVM learns with the dual
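
A sketch of what one SGD update for an FM could look like, assuming squared loss on a single example and no regularization (neither choice is stated in the slides). `V[i]` holds the factor vector of feature i, and all names and numbers are illustrative:

```python
def fm_predict(b, w, V, x):
    """y(x) = b + sum_i w_i x_i + sum_i sum_{j>i} (v_i . v_j) x_i x_j."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(sum(p * q for p, q in zip(V[i], V[j])) * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
    return b + linear + pairwise

def sgd_step(b, w, V, x, y, lr=0.05):
    """One gradient step on the squared error (y_hat - y)^2 for one example."""
    n, k = len(x), len(V[0])
    g = 2.0 * (fm_predict(b, w, V, x) - y)          # d(loss)/d(y_hat)
    # s[f] = sum_j v_{j,f} x_j, shared by all gradients with respect to V.
    s = [sum(V[j][f] * x[j] for j in range(n)) for f in range(k)]
    new_b = b - lr * g
    new_w = [w[i] - lr * g * x[i] for i in range(n)]
    # d(y_hat)/d(v_{i,f}) = x_i * (s[f] - v_{i,f} * x_i)
    new_V = [[V[i][f] - lr * g * x[i] * (s[f] - V[i][f] * x[i]) for f in range(k)]
             for i in range(n)]
    return new_b, new_w, new_V

b, w = 0.0, [0.0, 0.0, 0.0]
V = [[0.1, 0.2], [0.3, -0.1], [0.2, 0.2]]
x, y = [1.0, 1.0, 0.0], 1.0

before = (fm_predict(b, w, V, x) - y) ** 2
b2, w2, V2 = sgd_step(b, w, V, x, y)
after = (fm_predict(b2, w2, V2, x) - y) ** 2
print(before, after)
```

Parameters of features with $x_i = 0$ receive a zero gradient, so on sparse inputs each step only touches the few active features.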

FM vs. SVM
  • Polynomial kernel SVM: compared to FM, the interaction weights $W_{i,j}$ are all independent of each other.

FM vs. MF
  • MF: $y(x) = b + w_u + w_i + v_u \cdot v_i$
  • SVD++: $y(x) = b + w_u + w_i + v_u \cdot v_i + \frac{1}{\sqrt{|N_u|}} \sum_{l \in N_u} v_i \cdot v_l$
  • Claims that FM is more general
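
One way to see the "more general" claim: with a one-hot user indicator concatenated with a one-hot item indicator as input, the FM formula yields exactly the MF prediction. A minimal check under that encoding assumption, with made-up parameters:

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def fm_predict(b, w, V, x):
    """y(x) = b + sum_i w_i x_i + sum_i sum_{j>i} (v_i . v_j) x_i x_j."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(dot(V[i], V[j]) * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
    return b + linear + pairwise

n_users, n_items = 2, 3
b = 3.0
w = [0.1, 0.2, 0.3, 0.4, 0.5]                        # user weights, then item weights
V = [[0.1, 0.2], [0.3, 0.4],                         # user factor vectors
     [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]             # item factor vectors

u, i = 1, 2
x = [0.0] * (n_users + n_items)
x[u] = 1.0                                           # one-hot user
x[n_users + i] = 1.0                                 # one-hot item

fm_pred = fm_predict(b, w, V, x)
mf_pred = b + w[u] + w[n_users + i] + dot(V[u], V[n_users + i])
print(fm_pred, mf_pred)
```

Extra input columns (for example, indicators for the other items a user rated) add further interaction terms to the same formula without changing the model code.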

Thanks!
