Searching and Browsing Using Tags
Nikos Sarkas
Social Information Systems Seminar
DCS, University of Toronto, Winter 2007
Social Resource Sharing
- The del.icio.us paradigm.
  - Users store links to web pages of interest, along with arbitrary user-specified tags, on a server.
- The model is independent of the resource being shared.
  - Music (Last.fm)
  - Photos (Flickr)
  - Publications (CiteULike)
  - …
Part I: Searching
Ranking Web Search Results
- Two prevalent models.
- Ranking based on query-document similarity.
  - TF/IDF
  - Metadata extraction
  - Link analysis
- Query-independent static ranking.
  - PageRank
  - "Quality" based
Similarity Ranking, Take I
- Query q = {q_1, q_2, …, q_n}.
- Tags of URL p: T(p) = {t_1, t_2, …, t_m}.
- Define similarity as |q ∩ T(p)| / |T(p)|.
- Problems
  - Synonymy (according to the authors)
  - Others?
- Synonymy example
  - Linux, Ubuntu and Gnome
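The overlap measure above is easy to sketch in code, which also makes the synonymy problem concrete. This is a minimal illustration, not the authors' implementation; the function name and the lowercasing of terms are assumptions made for the example.

```python
def tag_overlap_similarity(query_terms, url_tags):
    """Return |q ∩ T(p)| / |T(p)| for a query and the tag set of a URL."""
    # Lowercasing is an assumption for this sketch; the slide does not say
    # how terms are normalized.
    q = {term.lower() for term in query_terms}
    tags = {tag.lower() for tag in url_tags}
    if not tags:
        return 0.0  # a URL with no tags cannot match anything
    return len(q & tags) / len(tags)


# Exact match on one of two tags.
score_exact = tag_overlap_similarity(["linux"], ["linux", "opensource"])

# A related page tagged only with near-synonyms scores zero.
score_synonym = tag_overlap_similarity(["linux"], ["ubuntu", "gnome"])
```

The second call shows the synonymy problem: a page tagged "ubuntu" and "gnome" is clearly relevant to a "linux" query, yet the plain overlap gives it no credit.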
Similarity Ranking, Take II
- Use tags with "similar" meaning to enrich the query.
- Create 3 matrices
  - M_TP, tag-URL count matrix
  - S_T, tag-tag similarity matrix
  - S_P, URL-URL similarity matrix
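Similarity Ranking, Take II
- Iterate
  - Update S_T [update equation not preserved in the slide text].
  - Similarly update S_P, until convergence.
- Then, the similarity between a query q and a URL p is [equation not preserved in the slide text].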
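Since the update equations are not available here, the following is only a sketch of the general idea, using a plain SimRank-style mutual reinforcement between tags and URLs. It is not the authors' exact formula. The decay factor, the row normalization, and the choice to score a query by summing tag-tag similarities over all (query tag, URL tag) pairs are all assumptions made for illustration.

```python
import numpy as np


def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows stay zero."""
    sums = m.sum(axis=1, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m, dtype=float), where=sums > 0)


def simrank_style(m_tp, decay=0.7, iterations=20):
    """Jointly estimate tag-tag (S_T) and URL-URL (S_P) similarity.

    m_tp is the tag-URL count matrix (tags as rows, URLs as columns).
    Two tags are similar if they annotate similar URLs, and vice versa.
    """
    m_tp = np.asarray(m_tp, dtype=float)
    w_t = row_normalize(m_tp)      # each tag's distribution over URLs
    w_p = row_normalize(m_tp.T)    # each URL's distribution over tags
    s_t = np.eye(m_tp.shape[0])
    s_p = np.eye(m_tp.shape[1])
    for _ in range(iterations):
        new_s_t = decay * w_t @ s_p @ w_t.T
        new_s_p = decay * w_p @ s_t @ w_p.T
        np.fill_diagonal(new_s_t, 1.0)  # every item is fully similar to itself
        np.fill_diagonal(new_s_p, 1.0)
        s_t, s_p = new_s_t, new_s_p
    return s_t, s_p


def query_url_similarity(query_tag_idx, url_tag_idx, s_t):
    """Assumed scoring: sum S_T over all (query tag, URL tag) pairs."""
    return float(sum(s_t[i, j] for i in query_tag_idx for j in url_tag_idx))


# Hypothetical data. Tags: 0 = linux, 1 = ubuntu, 2 = cooking.
m_tp = [[2, 1, 0],
        [1, 2, 0],
        [0, 0, 3]]
s_t, s_p = simrank_style(m_tp)

sim_to_ubuntu_page = query_url_similarity([0], [1], s_t)
sim_to_cooking_page = query_url_similarity([0], [2], s_t)
```

In this toy data "linux" and "ubuntu" annotate the same URLs, so a "linux" query now gives credit to a page tagged only "ubuntu", while the "cooking" page still scores zero.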
Social PageRank
- "Popular web pages are tagged by many up-to-date users, using hot tags".
- Transfer popularity between entities.
- Define matrices M_PU, M_UT, M_TP.
- Iterate [update equations not preserved in the slide text].
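The slide's update equations are not available, so the sketch below only illustrates the stated idea of passing popularity around the cycle pages → users → tags → pages. The one-directional cycle, the uniform start, and the per-iteration normalization are assumptions; the authors' actual iteration may differ.

```python
import numpy as np


def social_pagerank_sketch(m_pu, m_ut, m_tp, iterations=50):
    """Propagate popularity pages -> users -> tags -> pages.

    m_pu: pages x users counts (how often a user annotated a page)
    m_ut: users x tags counts  (how often a user used a tag)
    m_tp: tags x pages counts  (how often a tag annotates a page)
    """
    m_pu = np.asarray(m_pu, dtype=float)
    m_ut = np.asarray(m_ut, dtype=float)
    m_tp = np.asarray(m_tp, dtype=float)
    pages = np.full(m_pu.shape[0], 1.0 / m_pu.shape[0])  # uniform start
    for _ in range(iterations):
        users = m_pu.T @ pages   # users inherit popularity from their pages
        tags = m_ut.T @ users    # tags inherit popularity from their users
        pages = m_tp.T @ tags    # pages inherit popularity from their tags
        total = pages.sum()
        if total == 0:
            break                # nothing is annotated at all
        pages = pages / total    # keep the scores a distribution
    return pages


# Hypothetical data: 3 pages, 2 users, 2 tags. Page 2 has no annotations.
m_pu = [[1, 1],
        [1, 0],
        [0, 0]]
m_ut = [[2, 1],
        [1, 0]]
m_tp = [[2, 1, 0],
        [1, 0, 0]]
ranks = social_pagerank_sketch(m_pu, m_ut, m_tp)
```

Page 0, annotated by both users with both tags, ends up ahead of page 1, and the unannotated page 2 receives no score.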
Putting It All Together
- Train a ranking function (RankSVM) using the following features
  - BM25 similarity between query and URL content
  - Simple query-URL tags similarity measure
  - Complex query-URL tags similarity measure
  - PageRank
  - Social PageRank
- Results
  - Precision, NDCG at k
  - Small improvement over BM25, up to 25% for NDCG and synthetic queries
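For reference, the two evaluation measures named above can be sketched as follows. The exponential gain (2^rel − 1) and log2 discount are one common NDCG variant; the slide does not say which variant was used, so treat that choice as an assumption.

```python
import math


def precision_at_k(relevances, k):
    """Fraction of the top-k results with non-zero relevance."""
    if k <= 0:
        return 0.0
    return sum(1 for rel in relevances[:k] if rel > 0) / k


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevances."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal


# Graded relevance of results in the order a ranker returned them.
perfect_order = ndcg_at_k([3, 2, 1], k=3)
swapped_order = ndcg_at_k([1, 3], k=2)
```

A ranking that already lists results in decreasing relevance scores 1.0; any swap that pushes a more relevant result down lowers the score.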
Part II: Browsing
Tag Assisted Browsing
- Currently two methods for tag-driven browsing
  - Keyword search
  - Clouds of popular tags
- We would like to support
  - Semantic browsing: also present URLs annotated with similar tags
  - Hierarchical browsing: browse in a top-down fashion
Semantic Browsing
- Define similarity between tags: [equation not preserved in the slide text]
- Synonymic tags: similarity above a threshold.
- The synonymic tags and the tag itself define its semantic concept.
- Given that the user has selected L tags, which define semantic concepts Sc = {C_1, …, C_L}, related URLs are: [equation not preserved in the slide text]
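A rough sketch of the semantic concept idea follows. The tag similarity values are taken as given, since the slide's similarity definition is not available. The rule used for related URLs (a URL must carry at least one tag from every selected concept) is an assumption for illustration, not necessarily the authors' definition, and all names and data are hypothetical.

```python
def semantic_concept(tag, similarity, threshold):
    """Return the tag together with its synonymic tags.

    similarity maps unordered tag pairs, stored as (a, b) tuples, to scores.
    """
    concept = {tag}
    for (a, b), score in similarity.items():
        if score > threshold:
            if a == tag:
                concept.add(b)
            elif b == tag:
                concept.add(a)
    return concept


def related_urls(selected_tags, url_tags, similarity, threshold):
    """Assumed rule: keep URLs that match every selected concept."""
    concepts = [semantic_concept(t, similarity, threshold) for t in selected_tags]
    return {
        url
        for url, tags in url_tags.items()
        if all(tags & concept for concept in concepts)
    }


# Hypothetical similarity scores and annotations.
similarity = {("linux", "ubuntu"): 0.8, ("linux", "cooking"): 0.1}
url_tags = {
    "u1": {"ubuntu", "howto"},
    "u2": {"cooking"},
    "u3": {"linux"},
}

linux_concept = semantic_concept("linux", similarity, threshold=0.5)
linux_urls = related_urls(["linux"], url_tags, similarity, threshold=0.5)
linux_howto_urls = related_urls(["linux", "howto"], url_tags, similarity, threshold=0.5)
```

Selecting "linux" alone also surfaces the page tagged only "ubuntu"; adding "howto" narrows the result to pages that match both concepts.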
Hierarchical Browsing
- Observations
  - No neat tree structure
  - Multiple ways to target a resource: URLs are associated with different categories
  - Dynamic structure: leaves can become inner nodes
Hierarchical Browsing
- Generating sub-tags
  - Train a classifier to identify which of the tags in the semantic concept are sub-tags
  - Features used: ratio of tag counts, intersection size, etc.
- Clustering sub-tags
  - Rank tags based on a complex formula
  - Greedy clustering technique
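The two features named above could be computed along these lines before being fed to a classifier. This is a sketch under assumptions: tag counts are approximated by the number of distinct URLs carrying each tag, and the function and key names are hypothetical.

```python
def subtag_features(parent, candidate, tag_urls):
    """Features for deciding whether candidate is a sub-tag of parent.

    tag_urls maps each tag to the set of URLs annotated with it.
    """
    parent_urls = tag_urls.get(parent, set())
    candidate_urls = tag_urls.get(candidate, set())
    shared = parent_urls & candidate_urls
    return {
        # A sub-tag is expected to be rarer than its parent (ratio below 1).
        "count_ratio": len(candidate_urls) / len(parent_urls) if parent_urls else 0.0,
        # A sub-tag is expected to co-occur with its parent on many URLs.
        "intersection_size": len(shared),
    }


# Hypothetical annotations.
tag_urls = {
    "linux": {"a", "b", "c", "d"},
    "ubuntu": {"a", "b"},
}
features = subtag_features("linux", "ubuntu", tag_urls)
```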