Learning Recommender Systems with Adaptive Regularization Steffen Rendle
- Slides: 24
Learning Recommender Systems with Adaptive Regularization
Steffen Rendle, WSDM 2012
Presenter: Haiqin Yang
Date: Mar. 21, 2012
Outline
- Introduction
- Factorization Machine with Adaptive Regularization
- Evaluation
- Conclusion
- More stories
Collaborative Filtering
- Predict unobserved entries based on a partially observed matrix
Overfitting
- Most state-of-the-art recommender methods have a large number of model parameters and are thus prone to overfitting.
- Low-rank approximation
Solution to Overfitting
- Typically, L2-regularization is applied to prevent overfitting, e.g.:
  - Maximum margin matrix factorization
  - Probabilistic matrix factorization
Regularization Parameters
- A generalized formulation
- The success depends largely on the choice of the value(s) for λ:
  - If λ is chosen too small, the model overfits.
  - If λ is chosen too large, the model underfits.
- Question: How to choose λ efficiently?
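One common way to write such a generalized, L2-regularized objective is sketched below. The notation is an assumption made for illustration: S is the training set, l a loss function, ŷ(x | Θ) the model prediction, and λ_θ the regularization value attached to parameter θ.

```latex
\Theta^{*}(\lambda) = \operatorname*{argmin}_{\Theta}
  \sum_{(x,y) \in S} l\bigl(\hat{y}(x \mid \Theta),\, y\bigr)
  + \sum_{\theta \in \Theta} \lambda_{\theta}\, \theta^{2}
```

With λ_θ = 0 the penalty vanishes and the model is free to fit noise; as λ_θ grows, the parameters are pulled toward zero and the model eventually underfits.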
How to select parameters?
- Validation-set-based methods: search for optimal values using a withheld validation set
  - Grid search by cross-validation
  - Informed search: too complicated
  - Regularization path: not available in all cases
- Hierarchical Bayesian methods: use a hierarchical model with hyperpriors on the prior distribution
  - Typically optimized with Markov Chain Monte Carlo (MCMC)
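As a minimal sketch of validation-based grid search, the example below fits a one-parameter ridge regression for each candidate λ and keeps the value with the lowest error on a withheld set. The toy data, the candidate grid, and the helper names (`fit_ridge_1d`, `mse`, `grid_search`) are all hypothetical and chosen only for illustration; a single holdout set stands in for full cross-validation.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form minimizer of sum((w*x - y)^2) + lam * w^2 for a scalar weight w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, valid, grid):
    """Train once per candidate lambda; return (best_lambda, best_validation_error)."""
    best = None
    for lam in grid:
        w = fit_ridge_1d(*train, lam)
        err = mse(w, *valid)
        if best is None or err < best[1]:
            best = (lam, err)
    return best


# Hypothetical toy data: noisy training targets, clean validation targets.
train = ([1.0, 2.0, 3.0], [1.2, 1.9, 3.3])
valid = ([1.5, 2.5], [1.5, 2.5])
grid = [0.0, 0.01, 0.1, 1.0, 10.0]

best_lam, best_err = grid_search(train, valid, grid)
```

The cost is one full training run per grid point, and the grid grows exponentially with the number of regularization values, which is the efficiency problem the question on the previous slide points at.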
Factorization Machine (FM)
- Matrix Factorization (MF)
- Factorization Machine
  - Model:
  - Parameters:
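For reference, the second-order FM model from the ICDM 2010 paper is ŷ(x) = w0 + Σ_i w_i x_i + Σ_{i<j} ⟨v_i, v_j⟩ x_i x_j, with a global bias w0, one weight w_i per input variable, and a k-dimensional factor vector v_i per input variable. The sketch below computes this prediction two ways: directly from the definition, and with the linear-time rewriting of the pairwise term given in that paper. Function and variable names are assumptions for illustration, not taken from any library.

```python
def fm_predict_naive(x, w0, w, V):
    """Direct evaluation: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    p = len(x)
    out = w0 + sum(w[i] * x[i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            out += dot * x[i] * x[j]
    return out


def fm_predict(x, w0, w, V):
    """Same value in O(p * k), using
    sum_{i<j} <v_i, v_j> x_i x_j
      = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ].
    """
    p, k = len(x), len(V[0])
    out = w0 + sum(w[i] * x[i] for i in range(p))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(p))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(p))
        out += 0.5 * (s * s - s_sq)
    return out


# Hypothetical parameters: p = 4 input variables, k = 2 factors.
w0 = 0.1
w = [0.2, -0.1, 0.0, 0.3]
V = [[0.5, 0.1], [-0.2, 0.4], [0.3, 0.3], [0.0, -0.6]]
x = [1.0, 0.0, 2.0, 0.5]

y_fast = fm_predict(x, w0, w, V)
y_naive = fm_predict_naive(x, w0, w, V)
```

Both functions return the same number up to floating-point rounding; the second form is the one that makes FM training practical on sparse, high-dimensional inputs.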
FM vs. Other Factorization Models
- FM generalizes:
  - MF
  - SVD++
  - Pairwise Interaction Tensor Factorization (PITF)
  - Factorized Personalized Markov Chains (FPMC)
- See "Factorization Machines" in ICDM 2010
- An example: MF
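The MF example can be checked numerically. Assuming the standard encoding from the FM paper, in which each (user, item) pair becomes a vector with exactly two ones (one in the user block, one in the item block), the FM prediction reduces to a biased matrix factorization: w0 + w_u + w_i + ⟨v_u, v_i⟩. The parameter values and helper names below are hypothetical.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    p = len(x)
    out = w0 + sum(w[i] * x[i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            out += dot * x[i] * x[j]
    return out


def one_hot_pair(user, item, n_users, n_items):
    """Indicator vector: first n_users slots for users, next n_items for items."""
    x = [0.0] * (n_users + n_items)
    x[user] = 1.0
    x[n_users + item] = 1.0
    return x


# Hypothetical model with 2 users, 3 items, k = 2 factors.
n_users, n_items = 2, 3
w0 = 0.5
w = [0.1, -0.2, 0.3, 0.0, -0.1]
V = [[1.0, 0.5], [0.2, -0.3], [0.4, 0.1], [-0.6, 0.7], [0.0, 0.9]]

user, item = 1, 2
x = one_hot_pair(user, item, n_users, n_items)
fm_score = fm_predict(x, w0, w, V)

# Biased MF written out directly with the same parameters.
v_u, v_i = V[user], V[n_users + item]
mf_score = w0 + w[user] + w[n_users + item] + sum(a * b for a, b in zip(v_u, v_i))
```

Only one pairwise term survives because all other products x_i x_j are zero, which is why the two scores agree.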
Optimization and Algorithm
- Optimization target
- Square loss
- Gradient descent
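A sketch of how these three pieces fit together, assuming an L2-regularized objective, a stochastic (per-case) gradient step, and a learning rate α; the symbols are chosen here for illustration:

```latex
\begin{aligned}
l(\hat{y}, y) &= (\hat{y} - y)^{2} \\
\theta^{t+1} &= \theta^{t} - \alpha \left(
    \frac{\partial}{\partial \theta}\, l\bigl(\hat{y}(x \mid \Theta^{t}),\, y\bigr)
    + 2 \lambda\, \theta^{t} \right) \\
  &= \theta^{t} - \alpha \left(
    2 \bigl(\hat{y}(x \mid \Theta^{t}) - y\bigr)
    \frac{\partial \hat{y}(x \mid \Theta^{t})}{\partial \theta}
    + 2 \lambda\, \theta^{t} \right)
\end{aligned}
```

The term 2λθ is the gradient of the penalty λθ², so a larger λ shrinks each parameter more strongly at every step.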
Adaptive Regularization
- Split the data into two sets: a training set and a validation set
- Find the regularization values λ* that lead to the lowest error on the validation set
- Alternating optimization
- Problem: the right-hand side is independent of λ
Adaptive Regularization
- Hint: the next parameters depend on λ
- Recall
- Objective
- Update rule
- Expansion
Adaptive Regularization
- Update rule
- Gradients
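The idea can be sketched for a plain linear model ŷ(x) = Σ_i w_i x_i with a single shared λ; this is a simplified illustration under stated assumptions, not the FM-specific algorithm from the paper. The next parameters are w_i' = w_i − α(g_i + 2λw_i), so ∂w_i'/∂λ = −2αw_i, and by the chain rule ∂ŷ(x | w')/∂λ = −2α Σ_i w_i x_i. The validation loss of the next model therefore has a non-zero gradient with respect to λ. The function names, the shared learning rate for both updates, and the clamping of λ at zero are assumptions made for this sketch.

```python
def predict(w, x):
    """Linear model: yhat = sum_i w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sgda_step(w, lam, train_case, valid_case, alpha):
    """One alternating step: update lambda on a validation case,
    then update the weights on a training case with the new lambda."""
    x, y = train_case
    xv, yv = valid_case

    # Square-loss gradient on the training case w.r.t. each weight.
    err = predict(w, x) - y
    grad = [2.0 * err * xi for xi in x]

    # "Future" weights as a function of the current lambda.
    w_next = [wi - alpha * (gi + 2.0 * lam * wi) for wi, gi in zip(w, grad)]

    # Chain rule: d yhat(xv | w_next) / d lambda = -2 * alpha * sum_i w_i * xv_i.
    dyhat_dlam = -2.0 * alpha * predict(w, xv)
    dloss_dlam = 2.0 * (predict(w_next, xv) - yv) * dyhat_dlam

    # Gradient step on lambda, kept non-negative (an assumption of this sketch).
    lam = max(0.0, lam - alpha * dloss_dlam)

    # Regular SGD step on the weights with the updated lambda.
    w = [wi - alpha * (gi + 2.0 * lam * wi) for wi, gi in zip(w, grad)]
    return w, lam


# Hypothetical case: the training target is fit exactly, but the validation
# target asks for a smaller weight, so lambda should increase.
w, lam = sgda_step([1.0], 0.5, ([1.0], 1.0), ([1.0], 0.0), alpha=0.1)
```

In this toy case the validation case prefers a smaller weight than the training case, so the step raises λ and the subsequent weight update shrinks w more strongly.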
Evaluation
- Datasets
  - MovieLens 1M
  - Netflix
- Methods
  - Stochastic Gradient Descent (SGD)
  - SGD with adaptive regularization (SGDA)
Accuracy vs. Latent Dimensions
Convergence
Evolution of λ
- Flexible regularization is better than one regularization value for all dimensions
Size of Validation Set S_V
- The larger the validation set, the closer it is to the test set
- Too large a validation set reduces the training size, yielding poor performance
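The trade-off above is controlled by how much data is withheld. A minimal sketch of such a split follows; the helper name `split_train_valid`, the 5% fraction, and the synthetic ratings are all hypothetical and only illustrate the mechanics.

```python
import random


def split_train_valid(cases, valid_fraction, seed=0):
    """Shuffle the cases and withhold a fraction of them as the validation set."""
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_fraction))
    # Return (training set, validation set); the two are disjoint.
    return shuffled[n_valid:], shuffled[:n_valid]


# Synthetic (user, item, rating) triples, for illustration only.
ratings = [(u, i, float((u + i) % 5 + 1)) for u in range(10) for i in range(10)]
train, valid = split_train_valid(ratings, valid_fraction=0.05)
```

Raising `valid_fraction` gives a more reliable estimate for choosing λ but leaves fewer cases for fitting the model parameters.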
Conclusion
- An adaptive regularization method based on the Factorization Machine
- Systematic experiments to demonstrate the model performance
More Stories
- Reformulate the problem to create a new model: Factorization Machine
  - Factorization machines, ICDM 2010
  - Fast context-aware recommendations with factorization machines, SIGIR 2011
  - Learning recommender systems with adaptive regularization, WSDM 2012
  - Bayesian factorization machines, NIPS 2011 Workshop
More Stories
- Modify existing techniques for new models
- Predictor-Corrector
Q&A