Seesaw Personalized Web Search
Jaime Teevan, MIT
with Susan T. Dumais and Eric Horvitz, MSR
Personalization Algorithms
- Standard IR
  [Diagram labels: Query, Document, Server, Client, User]
- Query expansion vs. result re-ranking
Result Re-Ranking
- Ensures privacy
- Good evaluation framework
- Can look at rich user profile
- Look at lightweight user models
  - Collected on server side
  - Sent as query expansion
Seesaw Search Engine
[Diagram: a query is sent to the search engine, which returns a search results page. Term counts shown: dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1. Terms from the result documents: forest, hiking, dog, cat, baby, walking, monkey, gorp, csail, mit, infant, artificial, child, boy, banana, food, research, girl, web, robot, search, retrieval, ir, hunt. Result scores shown: 6.0, 1.6, 2.7, 0.2, 0.2, 1.3.]
Calculating a Document’s Score
- Based on standard tf.idf
  $w_i = \log \frac{(r_i + 0.5)(N - n_i - R + r_i + 0.5)}{(n_i - r_i + 0.5)(R - r_i + 0.5)}$
- User as relevance feedback
  - Stuff I’ve Seen index
  - More is better
[Figure: example document (web, search, retrieval, ir, hunt) with term weights 0.1, 0.5, 0.05, 0.3 and score 1.3]
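As a rough illustration of the weight on this slide, the sketch below computes w_i and sums it over a document's terms. It assumes the usual relevance-feedback reading of the symbols (N documents in the corpus, n_i of them containing term i, R documents in the user's index, r_i of those containing term i). Multiplying each weight by the term's frequency in the document is also an assumption, and the function and parameter names are hypothetical.

```python
import math


def term_weight(N, n_i, R, r_i):
    """Relevance-feedback weight w_i, written as on the slide.

    Assumed meaning of the symbols (not spelled out on the slide):
    N = corpus size, n_i = corpus documents containing the term,
    R = documents in the user's index, r_i = of those, how many
    contain the term. Requires n_i >= r_i and N - n_i >= R - r_i.
    """
    numerator = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    denominator = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)


def document_score(term_counts, corpus_df, user_df, N, R):
    """Sum tf * w_i over the document's terms (tf weighting is an assumption)."""
    score = 0.0
    for term, tf in term_counts.items():
        n_i = corpus_df.get(term, 0)
        r_i = user_df.get(term, 0)
        score += tf * term_weight(N, n_i, R, r_i)
    return score
```

With no user data (R = 0, r_i = 0) the weight reduces to an idf-like value, log((N - n_i + 0.5)/(n_i + 0.5)). As r_i grows, the weight rises, which matches the "More is better" bullet.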
Finding the Score Efficiently
- Corpus representation (N, n_i)
  - Web statistics
  - Result set
- Document representation
  - Download document
  - Use result set snippet
- Efficiency hacks generally OK!
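One way to read the "result set" and "snippet" options is sketched below: N and n_i are estimated from the returned snippets alone, and each document is represented by the term counts of its snippet. The whitespace tokenizer and all function names are assumptions made for illustration, not the system's actual code.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def result_set_statistics(snippets):
    """Approximate the corpus by the result set.

    Returns (N, doc_freq): N is the number of results, and doc_freq[term]
    is the number of snippets that contain the term (an estimate of n_i).
    """
    doc_freq = Counter()
    for snippet in snippets:
        # set() so a term counts once per snippet, however often it repeats
        doc_freq.update(set(tokenize(snippet)))
    return len(snippets), doc_freq


def snippet_term_counts(snippet):
    """Represent a document by its snippet's term frequencies."""
    return Counter(tokenize(snippet))
```

This avoids both a web-scale statistics lookup and a download of every result, at the cost of much noisier estimates.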
Evaluating Personalized Search
- 15 evaluators
- Evaluate 50 results for a query
  - Highly relevant
  - Relevant
  - Irrelevant
- Measure algorithm quality
  - $\mathrm{DCG}(i) = \begin{cases} \mathrm{Gain}(i) & \text{if } i = 1 \\ \mathrm{DCG}(i-1) + \mathrm{Gain}(i)/\log(i) & \text{otherwise} \end{cases}$
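The DCG recurrence can be sketched in a few lines. The slide gives neither the gain assigned to each judgment nor the base of the logarithm, so the gain values (2, 1, 0) and base 2 used here are assumptions chosen only to make the example concrete.

```python
import math

# Assumed gain per judgment; the slide does not give these values.
GAIN = {"highly relevant": 2, "relevant": 1, "irrelevant": 0}


def dcg(judgments, log_base=2):
    """Discounted cumulative gain over a ranked list of judgments.

    DCG(1) = Gain(1); DCG(i) = DCG(i-1) + Gain(i)/log(i) for i > 1.
    The i = 1 case is handled separately because log(1) = 0.
    """
    total = 0.0
    for i, label in enumerate(judgments, start=1):
        gain = GAIN[label]
        if i == 1:
            total = float(gain)
        else:
            total += gain / math.log(i, log_base)
    return total
```

For example, `dcg(["highly relevant", "relevant", "irrelevant"])` is higher than the DCG of the same judgments in reverse order, so a ranking that places the relevant results first scores better.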
Evaluating Personalized Search
- Query selection
  - Chose from 10 pre-selected queries
  - Previously issued query
[Figure: example queries. Joe: Las Vegas, rice, McDonalds, …; pre-selected: cancer, Microsoft, traffic, …; Mary: bison frise, Red Sox, airlines, …]
Total: 137; 53 pre-selected (2-9/query)
Seesaw Improves Text Retrieval
[Chart comparing: Random, Relevance Feedback, Seesaw]
Text Features Not Enough
Take Advantage of Web Ranking
Further Exploration
- Explore larger parameter space
- Learn parameters
  - Based on individual
  - Based on query
  - Based on results
- Give user control?
Making Seesaw Practical
- Learn most about personalization by deploying a system
- Best algorithm reasonably efficient
- Merging server and client
  - Query expansion: get more relevant results in the set to be re-ranked
  - Design snippets for personalization
User Interface Issues
- Make personalization transparent
- Give user control over personalization
  - Slider between Web and personalized results
  - Allows for background computation
- Creates problem with re-finding
  - Results change as user model changes
  - Thesis research: Re:Search Engine
Thank you!
teevan@csail.mit.edu