The Perfect Search Engine Is Not Enough
Jaime Teevan, MIT
with Christine Alvarado, Mark Ackerman and David Karger

Let Me Interview You!
• Web:
  – What’s the last Web page you visited? How did you get there?
  – Have you looked for anything on the Web?
• Email:
  – What’s the last email you read? What did you do with it?
  – Have you gone back to an email you’ve read before?
• Files:
  – What’s the last file you looked at? How did you get to it?
  – Have you looked for a file?

Overview: Understanding Directed Search
• Introduction
• Related work
• Methodology
• What we learned
  – How? Prefer to search in steps
  – Why? Because it’s easier
  – Who? Step size varies by person
• So what?

Haystack: Personal Information Storage
[Diagram: Haystack with Email, Files, Web pages, Calendar, Contacts]

Directed Search in Haystack
“What was that paper I read last week about Information Retrieval?”
Haystack

Directed Search in Haystack
“Ah yes! Thank you.”
Haystack

…Or Elsewhere
“Ah yes! Thank you.”
“Perfect Search Engine”

Related Work
• Directed search
  – Lab studies [Capra 03, Maglio 97]
  – Log analysis [Broder 02, Spink 01]
  – Observational studies [Malone 83]
• Information seeking
  – Marchionini, O’Day and Jeffries, Bates, Belkin, …
  – Evolving information need

Modified Diary Study
• Subjects: 15 CS graduate students
• Ten interviews each (2/day x 5 days)
• Two question types
  – Last email/file/Web page looked at
  – Last email/file/Web page looked for
• Supplemented with direct observation and an hour-long semi-structured interview

Overview: Understanding Directed Search
• Introduction
• Related work
• Methodology
• What we learned
  – How?
  – Why?
  – Who?
• So what?

Directed Search Today
• Target: Connie Monroe’s office number
Type into a search engine: “Connie Monroe, office number”

What We Observed
Interviewer: Have you looked for anything on the Web today?
Jim: I had to look for the office number of the Harvard professor.
I: So how did you go about doing that?
J: I went to the homepage of the Math department at Harvard.

What We Observed
I: So you went to the Math department, and then what did you do over there?
J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.

What We Observed
J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.”

Strategies Looking for Information
• Teleporting
• Orienteering

Why Do People Orienteer?
• The tools don’t work
• Easier than saying what you want
• You know where you are
• You know what you find

Easier Than Saying What You Want
• Describing the target is hard
  – Can’t
  – Prefer not to
• Habit
  – “Whichever way I remember first.”
• Search for source
  – E.g., your last email search

You Know Where You Are
• Stay in known space
  – URL manipulation
  – Bookmarks
  – History
• Backtracking
  – Following an information scent
  – Never end up at a dead end
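The "URL manipulation" item can be illustrated with a small sketch. It is not taken from the study: the function name `parent_urls` and the example URL are hypothetical. It shows one way a browser aid might list the known locations a person could step back to by trimming a URL one path segment at a time.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Return the URLs reached by trimming one path segment at a time.

    Hypothetical helper for illustration; query strings and fragments
    are dropped because they rarely survive manual URL trimming.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    steps = []
    # Drop the last segment repeatedly, ending at the site root.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"
        steps.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return steps


if __name__ == "__main__":
    # Placeholder URL, not one observed in the study.
    for step in parent_urls("http://www.example.edu/math/people/visiting.html"):
        print(step)
```

Each returned URL is a larger, more familiar context than the one before it. This matches the idea of staying in known space instead of jumping straight to the target.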

You Know What You Find
• Context gives understanding of answer
  “I was looking for a specific file. But even when I saw its name, I wouldn’t have known that was the file I wanted until I saw all of the other names in the same directory…”
• Understanding negative results
  “I basically clicked on every single button until I was convinced… I don’t think that it exists…”

Individual Strategies
• Search strategies varied by individual
• People who pile information take small steps
• People who file information take big steps
• Where was the last email you found?
  – Inbox?
  – Elsewhere?

File or Pile
[Figure: Email, showing a Filer and a Piler]

How Individuals Search For Files
• Filers: Big steps
• Pilers: Small steps

Applying What We Learned
Support orienteering
• Advantages to orienteering
  – Easier than saying what you want: Meta-info, with info source, flag sources
  – You know where you are: URL manipulation, paths apparent, all steps
  – You know what you find: Answer context, trusted sources, exhaustive
• Individual differences in step size
  – Allow for different step sizes

Structural Consistency Important
All must be the same to re-find the information!

Preserve What User Remembers
• Supports orienteering for re-finding
• Allows access to new information

More to Learn from the Data
• Differences in finding v. re-finding
• How organization relates to search
• Importance of type (email, files and Web)
• Looked at v. looked for
• Keep in mind population

Questions?
Teevan, J., Alvarado, C., Ackerman, M. S. and Karger, D. R. (2004). The Perfect Search Engine is Not Enough: A Study of Orienteering Behavior in Directed Search. To appear in Proceedings of CHI 2004. (Linked from http://www.teevan.org)

Relating How and What

            Specific   General   Document
Other          47         19        41
Keyword        34         23        17

• People only keyword search 39% of the time
• What people look for related to how they look
• Surprise: Orienteer to specific information

Relating How and Corpus

          Other   Keyword
Email       59      06
Files       42      10
Web         19      64

• Email and files: Almost never keyword searched
• Easy to associate information with document
• Web: Used keyword search much more often
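To make the contrast on this slide concrete, the sketch below computes the share of searches per corpus that used keyword search. It assumes the flattened numbers read as Other = 59, 42, 19 and Keyword = 06, 10, 64 for Email, Files and Web. That column assignment is a reconstruction from the slide text, not something the slide states outright.

```python
# Counts as read from the slide: (other strategies, keyword search) per corpus.
# The pairing of numbers to columns is an assumption made for this sketch.
counts = {"Email": (59, 6), "Files": (42, 10), "Web": (19, 64)}


def keyword_share(other, keyword):
    """Fraction of searches in one corpus that used keyword search."""
    return keyword / (other + keyword)


shares = {corpus: keyword_share(*pair) for corpus, pair in counts.items()}

for corpus, share in shares.items():
    print(f"{corpus}: {share:.0%} keyword search")
```

Under this reading, keyword search accounts for roughly 9% of email searches, 19% of file searches and 77% of Web searches. That is consistent with the bullets on the slide.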

Relating What and Corpus

          Specific   General   Document
Email        39         10        08
Files         7          7        35
Web          33         30        14

• Email searches were primarily for specific information
• File searches were primarily for documents
• Web searches were more evenly distributed
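The three observations on this slide can be checked by normalizing each row. The sketch below takes the counts as printed (Email 39/10/08, Files 7/7/35, Web 33/30/14) and reports each corpus's distribution over target types, along with its most common type. The helper name `distribution` is introduced here for illustration only.

```python
# Counts as read from the slide, keyed by corpus and then by target type.
counts = {
    "Email": {"Specific": 39, "General": 10, "Document": 8},
    "Files": {"Specific": 7, "General": 7, "Document": 35},
    "Web": {"Specific": 33, "General": 30, "Document": 14},
}


def distribution(row):
    """Convert one corpus's raw counts into fractions that sum to 1."""
    total = sum(row.values())
    return {kind: n / total for kind, n in row.items()}


for corpus, row in counts.items():
    dist = distribution(row)
    top = max(dist, key=dist.get)
    summary = ", ".join(f"{kind} {frac:.0%}" for kind, frac in dist.items())
    print(f"{corpus}: {summary} (most common: {top})")
```

With these counts, about 68% of email searches target specific information and about 71% of file searches target documents. Web searches split roughly 43% / 39% / 18% across the three types.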