Personalized Ranking Model Adaptation for Web Search
Hongning Wang 1, Xiaodong He 2, Ming-Wei Chang 2, Yang Song 2, Ryen W. White 2 and Wei Chu 3
1 Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana IL, 61801 USA, wang296@illinois.edu
2 Microsoft Research, Redmond WA, 98007 USA
3 Microsoft Bing, Bellevue WA, 98004 USA
{yangsong, minchang, xiaohe, ryenw, wechu}@microsoft.com

Searcher’s information needs are diverse • Exploring user’s search preferences SIGIR 2013 @ Dublin Ireland 2

Personalization for web search • Exploring user’s search preferences

Existing methods for personalization
• Extracting user-centric features [Teevan et al. SIGIR’05]
  • Location, gender, click history
  • Require large volume of user history
• Memory-based personalization [White and Drucker WWW’07, Shen et al. SIGIR’05]
  • Learn direct association between query and URLs
  • Limited coverage, poor generalization

Personalization for web search
• Major considerations
  • Accuracy
    • Maximize the search utility for each single user
  • Efficiency
    • Executable on the scale of all the search engine users
    • Adapt to the user’s result preferences quickly

Personalized Ranking Model Adaptation • Adapting the global ranking model for each individual user

Personalized Ranking Model Adaptation • Adjusting the generic ranking model’s parameters with respect to each individual user’s ranking preferences

Linear Regression Based Model Adaptation
• Adapting global ranking model for each individual user
• Loss function from any linear learning-to-rank algorithm, e.g., RankNet, LambdaRank, RankSVM
• Complexity of adaptation
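
One way to read "adapting the global ranking model" is as a linear transformation of its weight vector. The sketch below assumes a per-group scale-and-shift form, w_u[i] = scale[g(i)] * w_s[i] + shift[g(i)], where g(i) is the group of feature i. That form and the names `adapt_weights`, `group_of`, `scale`, and `shift` are illustrative assumptions, not the authors' code.

```python
def adapt_weights(global_w, group_of, scale, shift):
    """Assumed form: w_u[i] = scale[g] * global_w[i] + shift[g], with g = group_of[i]."""
    return [scale[group_of[i]] * w + shift[group_of[i]]
            for i, w in enumerate(global_w)]


def score(weights, features):
    """Linear ranking score: dot product of weights and document features."""
    return sum(w * x for w, x in zip(weights, features))


# Hypothetical global model with 4 features split into K = 3 groups.
global_w = [0.5, 1.0, -0.2, 0.3]
group_of = [0, 1, 1, 2]
K = 3

# Scale 1 and shift 0 in every group reproduces the global model.
identity = adapt_weights(global_w, group_of, [1.0] * K, [0.0] * K)

# Changing only group 1 moves both of its features together.
adapted = adapt_weights(global_w, group_of, [1.0, 2.0, 1.0], [0.0, 0.1, 0.0])
```

Under this assumption a user model has 2K free parameters (one scale and one shift per group) rather than one per feature, which is the sense in which grouping lowers the amount of adaptation data needed.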

Instantiation example
• Adapting RankSVM [Joachims KDD’02]
• Margin rescaling
• Reducing mis-ordered pairs
• Non-linear kernels
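
A rough sketch of how a RankSVM-style adaptation could be trained, under stated assumptions: a pairwise hinge loss on the adapted model, a quadratic regularizer that pulls each group's scale toward 1 and shift toward 0 (so the user model stays close to the global one), and plain subgradient descent. The function name `adapt_ranksvm` and the hyperparameters `lam`, `lr`, and `epochs` are hypothetical; margin rescaling and kernels are left out.

```python
def adapt_ranksvm(global_w, group_of, n_groups, pairs, lam=0.1, lr=0.05, epochs=100):
    """pairs: list of (x_pos, x_neg); x_pos should be ranked above x_neg."""
    a = [1.0] * n_groups  # per-group scaling, starts at the global model
    b = [0.0] * n_groups  # per-group shifting
    for _ in range(epochs):
        # Regularizer gradient: keep the user model near the global model.
        grad_a = [lam * (ak - 1.0) for ak in a]
        grad_b = [lam * bk for bk in b]
        for x_pos, x_neg in pairs:
            d = [p - n for p, n in zip(x_pos, x_neg)]
            margin = sum((a[group_of[i]] * w + b[group_of[i]]) * d[i]
                         for i, w in enumerate(global_w))
            if margin < 1.0:  # hinge loss is active for this pair
                for i, w in enumerate(global_w):
                    g = group_of[i]
                    grad_a[g] -= w * d[i] / len(pairs)
                    grad_b[g] -= d[i] / len(pairs)
        a = [ak - lr * g for ak, g in zip(a, grad_a)]
        b = [bk - lr * g for bk, g in zip(b, grad_b)]
    return a, b


# Toy case: the global model favors feature 0, this user favors feature 1.
global_w = [1.0, 0.1]
group_of = [0, 1]
pairs = [([0.0, 1.0], [1.0, 0.0])]
a, b = adapt_ranksvm(global_w, group_of, 2, pairs)
user_w = [a[g] * w + b[g] for w, g in zip(global_w, group_of)]
```

In the toy case the global weights mis-order the single preference pair, while the adapted weights `user_w` order it as the user prefers.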

Ranking feature grouping I
• Grouping features by name - Name
  • Exploring informative naming scheme
    • BM25_Body, BM25_Title
  • Clustering by manually crafted patterns
Example document-feature matrix:
Features: PageRank, BM25_Title, BM25_Body, BM25_Anchor, tfidf_title
<qn, dj>: 1.0, 1.3, 0.7, 0.2, 0.9
<qn, dj>: 0.8, 0.2, 0.3, 0.1
<qm, dk>: 0.2, 0.7, 0.6, 0.2, 0.5
Groups: Group 1, Group 2, Group 3
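
Grouping by name can be sketched with a few regular expressions over the feature names. The two patterns and the group labels below are hypothetical examples in the spirit of the BM25_* naming shown on the slide; a feature that matches no pattern is placed in a group of its own.

```python
import re
from collections import defaultdict

# Hypothetical hand-crafted patterns: (group label, compiled regex).
PATTERNS = [
    ("bm25", re.compile(r"^BM25_", re.IGNORECASE)),
    ("tfidf", re.compile(r"^tfidf_", re.IGNORECASE)),
]


def group_by_name(feature_names, patterns=PATTERNS):
    """Assign each feature to the first pattern it matches, else to its own group."""
    groups = defaultdict(list)
    for name in feature_names:
        for label, pattern in patterns:
            if pattern.search(name):
                groups[label].append(name)
                break
        else:
            groups[name].append(name)  # no pattern matched
    return dict(groups)


names = ["PageRank", "BM25_Title", "BM25_Body", "BM25_Anchor", "tfidf_title"]
groups = group_by_name(names)
```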

Ranking feature grouping II
• Co-clustering of documents and features – SVD [Dhillon KDD’01]
  • SVD on document-feature matrix
  • k-Means clustering to group features
Example document-feature matrix (input to SVD + k-Means):
Features: PageRank, BM25_Title, BM25_Body, BM25_Anchor, tfidf_title
<qn, dj>: 1.0, 1.3, 0.7, 0.2, 0.9
<qn, dj>: 0.8, 0.2, 0.3, 0.1
<qm, dk>: 0.2, 0.7, 0.6, 0.2, 0.5
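
The SVD step can be illustrated with a simplified, dependency-free sketch. It embeds each feature (column) using the leading right singular directions of the document-feature matrix, found by power iteration with deflation on the Gram matrix. This is an assumption-laden simplification: it skips the degree normalization used in bipartite spectral co-clustering, and the function name `feature_embeddings` is hypothetical. The resulting per-feature vectors are what a k-means step would then cluster.

```python
import math


def feature_embeddings(doc_feature, rank, iters=200):
    """Embed each column of doc_feature with its top `rank` right singular directions."""
    n_feat = len(doc_feature[0])
    # Gram matrix A^T A; its eigenvectors are the right singular vectors of A.
    gram = [[sum(row[i] * row[j] for row in doc_feature) for j in range(n_feat)]
            for i in range(n_feat)]
    embeddings = [[] for _ in range(n_feat)]
    for _ in range(rank):
        v = [1.0 + i for i in range(n_feat)]  # fixed start vector, for reproducibility
        eigval = 0.0
        for _ in range(iters):
            w = [sum(g * x for g, x in zip(gram_row, v)) for gram_row in gram]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                eigval = 0.0
                break
            v = [x / norm for x in w]
            eigval = norm  # converges to the leading eigenvalue
        if eigval == 0.0:
            break
        sigma = math.sqrt(eigval)  # singular value of A
        for i in range(n_feat):
            embeddings[i].append(sigma * v[i])
        # Deflate: remove the component just found.
        gram = [[gram[i][j] - eigval * v[i] * v[j] for j in range(n_feat)]
                for i in range(n_feat)]
    return embeddings


# Toy matrix: features 0 and 1 behave identically, as do features 2 and 3.
X = [[1, 1, 0, 0],
     [2, 2, 0, 0],
     [0, 0, 3, 3],
     [0, 0, 1, 1]]
emb = feature_embeddings(X, rank=2)
```

Features with identical columns receive identical embeddings, so a subsequent clustering step places them in the same group.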

Ranking feature grouping III
• Clustering features by importance - Cross
  • Estimate linear ranking model on different splits of data
  • k-Means clustering by feature weights in different splits
Feature weights per split (input to k-Means):
          PageRank  BM25_Title  BM25_Body  BM25_Anchor  tfidf_title
model 1   0.20      1.23        0.37       0.32         -0.19
model 2   0.78      0.25        -0.32      0.19         0.21
model 3   0.14      0.37        0.16       0.22         0.15
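
The clustering step can be sketched as follows: each feature is represented by its vector of weights across the split-specific models, and k-means groups features whose weights behave alike. The weights are the example values from this slide; the `kmeans` helper, its farthest-point initialization, and the choice k = 2 are illustrative assumptions.

```python
import math


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# One weight vector per feature: (model 1, model 2, model 3).
weights_by_feature = {
    "PageRank":    [0.20, 0.78, 0.14],
    "BM25_Title":  [1.23, 0.25, 0.37],
    "BM25_Body":   [0.37, -0.32, 0.16],
    "BM25_Anchor": [0.32, 0.19, 0.22],
    "tfidf_title": [-0.19, 0.21, 0.15],
}
names = list(weights_by_feature)
labels = kmeans([weights_by_feature[n] for n in names], k=2)
group_of = dict(zip(names, labels))
```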

Discussions
• A general framework for ranking model adaptation
  • Model-based adaptation vs. {instance, feature}-based adaptation
  • Within the same optimization complexity as the original ranking model
  • Adaptation sharing across features to reduce the requirement of adaptation data

Experimental Results
• Dataset
  • Bing.com query log: May 27, 2012 – May 31, 2012
  • Manual relevance annotation
    • 5-grade relevance score
  • 1830 ranking features
    • BM25, PageRank, tf*idf, etc.

Comparison of adaptation performance
• Baselines
  • Applicable in per-user basis adaptation
    • Tar-RankSVM
      • No adaptation, user’s own data only
    • RA-RankSVM [Geng et al. TKDE’12]
      • Model-based: global model as regularization
    • TransRank [Chen et al. ICDMW'08]
      • Instance-based: reweight annotated queries for adaptation
  • Only applicable in aggregated adaptation
    • IW-RankSVM [Gao et al. SIGIR’10]
      • Instance-based: reweight user’s click data for adaptation
    • CLRank [Chen et al. Information Retrieval’10]
      • Feature-based: construct new feature representation for adaptation

Adaptation accuracy I • Per-user basis adaptation

Adaptation accuracy II • Aggregated adaptation

Improvement analysis I • Query-level improvement • Against global model

Improvement analysis II • User-level improvement • Against global model

Adaptation efficiency I • Batch mode

Adaptation efficiency II • Online mode

Conclusions
• Efficient ranking model adaptation framework for personalized search
  • Linear transformation for model-based adaptation
  • Transformation sharing in a group-wise manner
• Future work
  • Joint estimation of feature grouping and model transformation
  • Incorporate user-specific features and profiles
  • Extend to non-linear models

References
1. White, Ryen W., and Steven M. Drucker. "Investigating behavioral variability in web search." Proceedings of the 16th international conference on World Wide Web. ACM, 2007.
2. Shen, Xuehua, Bin Tan, and ChengXiang Zhai. "Context-sensitive information retrieval using implicit feedback." Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2005.
3. Teevan, Jaime, Susan T. Dumais, and Eric Horvitz. "Personalizing search via automated analysis of interests and activities." Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2005.
4. Burges, Chris, et al. "Learning to rank using gradient descent." Proceedings of the 22nd international conference on Machine learning. ACM, 2005.
5. Burges, Chris, Robert Ragno, and Quoc Viet Le. "Learning to rank with nonsmooth cost functions." Proceedings of the Advances in Neural Information Processing Systems 19 (2007): 193-200.
6. Joachims, Thorsten. "Optimizing search engines using clickthrough data." Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2002.
7. Dhillon, Inderjit S. "Co-clustering documents and words using bipartite spectral graph partitioning." Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2001.
8. Geng, Bo, et al. "Ranking model adaptation for domain-specific search." Knowledge and Data Engineering, IEEE Transactions on 24.4 (2012): 745-758.
9. Chen, Depin, et al. "Transrank: A novel algorithm for transfer of rank learning." Data Mining Workshops, 2008. ICDMW'08. IEEE International Conference on. IEEE, 2008.
10. Gao, Wei, et al. "Learning to rank only using training data from related domain." Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 2010.
11. Chen, Depin, et al. "Knowledge transfer for cross domain learning to rank." Information Retrieval 13.3 (2010): 236-253.

Thank you! Q&A

Notations • Query collection • from user • for each query • is a V-dimensional vector of ranking features for a retrieved document • is the corresponding relevance label • Ranking model • • Focusing on linear ranking models •

Instantiation I
• Adapting RankNet [Burges et al. ICML’05] & LambdaRank [Burges et al. NIPS’07]
  • Objective function
  • Regularization

Instantiation I
• Adapting RankNet & LambdaRank
  • Derived gradients
  • Group-wise updating
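
To make "group-wise updating" concrete, here is a hedged sketch of the gradient of a RankNet-style pairwise logistic loss with respect to per-group scale and shift parameters. It assumes the adapted weight of feature i is a[g(i)] * w_s[i] + b[g(i)] and omits the regularization term; the function names and this parameterization are assumptions for illustration, not the derivation from the slide.

```python
import math


def pair_score_diff(a, b, global_w, group_of, x_pos, x_neg):
    """Score difference s(x_pos) - s(x_neg) under the adapted linear model."""
    return sum((a[group_of[i]] * w + b[group_of[i]]) * (p - n)
               for i, (w, p, n) in enumerate(zip(global_w, x_pos, x_neg)))


def ranknet_pair_loss(a, b, global_w, group_of, x_pos, x_neg):
    """Pairwise logistic loss log(1 + exp(-diff)) for one preference pair."""
    return math.log1p(math.exp(-pair_score_diff(a, b, global_w, group_of, x_pos, x_neg)))


def ranknet_pair_grads(a, b, global_w, group_of, x_pos, x_neg):
    """Gradients of the pair loss w.r.t. each group's scale a[g] and shift b[g]."""
    diff = pair_score_diff(a, b, global_w, group_of, x_pos, x_neg)
    lam = -1.0 / (1.0 + math.exp(diff))  # d loss / d diff
    grad_a = [0.0] * len(a)
    grad_b = [0.0] * len(b)
    for i, w in enumerate(global_w):
        g = group_of[i]
        d_i = x_pos[i] - x_neg[i]
        # Every feature in a group contributes to that group's shared parameters.
        grad_a[g] += lam * w * d_i
        grad_b[g] += lam * d_i
    return grad_a, grad_b


# Hypothetical example: 3 features in 2 groups, starting at the global model.
a, b = [1.0, 1.0], [0.0, 0.0]
global_w = [0.5, -0.3, 0.8]
group_of = [0, 0, 1]
x_pos, x_neg = [1.0, 0.2, 0.3], [0.4, 0.9, 0.1]
grad_a, grad_b = ranknet_pair_grads(a, b, global_w, group_of, x_pos, x_neg)
```

A gradient step then updates one scale and one shift per group, so all features in a group move together.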

Analysis of feature grouping • Effectiveness of different grouping methods • Baselines: random grouping and no grouping