Document Collections cs 5984 Information Visualization Chris North
Where are we? • Multi-D • 1-D • 2-D • Hierarchies/Trees • Networks/Graphs • Document collections • 3-D • Design Principles • Empirical Evaluation • Java Development • Visual Overviews • Multiple Views • Peripheral Views
Structured Document Collections • Multi-dimensional • author, title, date, journal, … • Trees • Dewey Decimal • Networks • web, citations
Envision • Ed Fox, et al. • Multi-D • similar to Spotfire
Unstructured Document Collections • Focus on Full Text • Examples: • digital libraries, encyclopedia • Web, homepages, photo collections • Tasks: • Search, keyword • Browse • Themes, subjects, topics, library coverage • Size, distributions
Visualization Strategies • Cluster Maps (today) • Keyword Query (today) • Relationships • Reduced representation • User controlled layout
Cluster Map • Create a “map” of the document collection • Similar documents near • Dissimilar documents far • “Grocery store” concept
Document Vectors
         “aardvark”  “banana”  “chris”  …
Doc 1    1           2         0
Doc 2    2           1         0
Doc 3    0           0         3
…
• Similarity between pair of docs = [formula not preserved]
• Lay out documents in 2-D map by similarity • similar to spring model for graph layout
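The similarity formula did not survive on the slide, so the following is only a minimal sketch under an assumption: it uses cosine similarity, a common choice for term-count vectors, which may or may not be the measure the lecture used. The function name `cosine_similarity` is illustrative; the counts come from the example table.

```python
import math

# Term counts from the slide's example table; columns are
# "aardvark", "banana", "chris".
docs = {
    "Doc 1": [1, 2, 0],
    "Doc 2": [2, 1, 0],
    "Doc 3": [0, 0, 3],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a document with no terms is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Similarity for every unordered pair of documents.
names = list(docs)
similarity = {
    (p, q): cosine_similarity(docs[p], docs[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}

for pair, value in similarity.items():
    print(pair, round(value, 3))
```

Doc 1 and Doc 2 share two terms and score high, while Doc 3 shares no terms with either and scores zero, so a similarity-driven layout would place it far from the other two.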
Cluster Algorithms • Partition clustering: Partition into k subsets • Pick k seeds • Iteratively attract nearest neighbors • Hierarchical clustering: Dendrogram • Group nearest-neighbor pair • Iterate
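Both strategies can be sketched in a few lines. This is a hedged illustration, not the algorithm of any particular system: the partition version is a k-means-style loop that takes the first k documents as seeds, the hierarchical version measures group distance between group means, and both use squared Euclidean distance on raw term counts. All of those choices, and all function names, are assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def partition_cluster(vectors, k, iterations=10):
    """k-means-style sketch: the first k vectors act as seeds (an assumption)."""
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Each document joins its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(v, centers[c]))
                  for v in vectors]
        # Each center moves to the mean of the documents it attracted.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels

def hierarchical_cluster(vectors):
    """Agglomerative sketch: repeatedly merge the two closest groups.

    Returns the merges in order, i.e. the dendrogram read bottom-up.
    """
    groups = [[i] for i in range(len(vectors))]
    merges = []
    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = sq_dist(mean([vectors[m] for m in groups[i]]),
                            mean([vectors[m] for m in groups[j]]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merges.append((groups[i], groups[j]))
        merged = groups[i] + groups[j]
        groups = [g for n, g in enumerate(groups) if n not in (i, j)] + [merged]
    return merges

# Term-count vectors for Doc 1, Doc 2, Doc 3 (indices 0, 1, 2).
vectors = [[1, 2, 0], [2, 1, 0], [0, 0, 3]]
labels = partition_cluster(vectors, k=2)
merges = hierarchical_cluster(vectors)
print(labels)
print(merges)
```

With k = 2 the partition loop ends with Doc 1 and Doc 2 in one cluster and Doc 3 alone; the hierarchical version merges Doc 1 with Doc 2 first and attaches Doc 3 last.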
Kohonen Maps • Xia Lin, “Document Space” • samal, ying • http://faculty.cis.drexel.edu/sitemap/index.html
Themescapes, Cartia • PNL • Mountain height = Cluster size
WebSOM • http://websom.hut.fi/websom/
Map.net • http://maps.map.net/start
Cluster Map • Good: • Map of collection • Major themes and sizes • Relationships between themes • Scales up • Bad: • Where to locate documents with multiple themes? » Both mountains, between mountains, …? • Relationships between documents, within documents? • Algorithm becomes (too) critical
Keyword Query • Keyword query, Search engine • Rank ordered list • “Information Retrieval”
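A rank-ordered result list can be illustrated with a toy scorer. This is a minimal sketch, assuming a document's score is simply the summed count of the query terms it contains; real search engines use more elaborate weighting. The `rank` function and the sample documents are hypothetical.

```python
def rank(query_terms, docs):
    """Return (name, score) pairs for matching documents, best first."""
    scores = {
        name: sum(counts.get(term, 0) for term in query_terms)
        for name, counts in docs.items()
    }
    hits = [(name, score) for name, score in scores.items() if score > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical term counts per document.
docs = {
    "Doc 1": {"aardvark": 1, "banana": 2},
    "Doc 2": {"aardvark": 2, "banana": 1},
    "Doc 3": {"chris": 3},
}

ranked = rank(["banana"], docs)
no_hits = rank(["zebra"], docs)
print(ranked)
print(no_hits)
```

A query for a term that no document contains returns an empty list, and documents that never mention the query terms are left out of the ranking entirely.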
TileBars • Hearst, “TileBars” • reenal, xueqi • http://elib.cs.berkeley.edu/tilebars/
VIBE • Korfhage, http://www.pitt.edu/~korfhage/interfaces.html • Documents located between query keywords using spring model
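One simple reading of "located between query keywords using spring model" is that each keyword is a fixed anchor point and each document is pulled toward every anchor in proportion to its relevance to that keyword. With zero-rest-length springs, the equilibrium position is the relevance-weighted average of the anchors. The sketch below assumes exactly that; the anchor coordinates, the weights, and the name `vibe_position` are hypothetical and are not taken from the VIBE system.

```python
def vibe_position(weights, anchors):
    """Place one document at the weighted average of keyword anchor points.

    weights: {keyword: relevance of the document to that keyword}
    anchors: {keyword: (x, y) position of the keyword on screen}
    Returns None when the document matches no keyword.
    """
    total = sum(weights.get(k, 0) for k in anchors)
    if total == 0:
        return None
    x = sum(weights.get(k, 0) * ax for k, (ax, ay) in anchors.items()) / total
    y = sum(weights.get(k, 0) * ay for k, (ax, ay) in anchors.items()) / total
    return (x, y)

# Hypothetical anchor layout: three query keywords at triangle corners.
anchors = {"aardvark": (0.0, 0.0), "banana": (1.0, 0.0), "chris": (0.5, 1.0)}

doc1 = vibe_position({"aardvark": 1, "banana": 2}, anchors)
doc3 = vibe_position({"chris": 3}, anchors)
unmatched = vibe_position({"zebra": 4}, anchors)
print(doc1, doc3, unmatched)
```

A document that matches only one keyword sits on that keyword's anchor, while a document matching two keywords lands on the line between them, closer to the one it matches more strongly.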
VR-VIBE
Keyword Query • Good: • Reduces the browsing space • Map according to user’s interests • Bad: • What keywords do I use? • What about other related documents that don’t use these keywords? • No initial overview • Mega-hit, zero-hit problem
Assignment • Thurs: Document Collections • Bederson, “Image Browsing” » rui, anusha • Card, “Web Book and Web Forager” » mrinmayee, ming • Demo your hw3: Tues or Thurs
Next Week • Tues: 3-D data • Kniss, “Interactive Volume Rendering with Direct Manip” » xueqi, mahesh • Thurs: Workspaces • Robertson, “Task Gallery” » supriya, varun • Upson, “AVS” » christa, jun • Thanksgiving break • Tues 27: Debates • Kobsa, “Empirical comparison of comm infovis systems” » kunal, zhiping
Upcoming Sched • Tues: 3-D data • Thurs: Workspaces • Thanksgiving break • Tues 27: Debates • Thurs 29: How (not) to lie with visualization • Dec: project presentations • Dec 7: CHI 2-pagers due, student posters due