SIMS 247: Information Visualization and Presentation
Marti Hearst
Feb 18, 2004

Today
• Multidimensional Visualization
  – Table Lens
  – Parallel Coordinates
    • Intro paper
    • Example of usage
  – Attribute Explorer
  – Comparative Evaluation of Three Systems
• Design Problem 2

Table Lens
• Super Spreadsheets
  – Combines overview + details in an integrated view
  – Focus + Context allows for compressed representation
  – Sorting multiple columns allows patterns to emerge
  – Represents nominal data in a way that allows patterns to appear
• Demos: http://www.inxight.com/products/core/table_lens/demos.php
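
A rough Python sketch of the Table Lens idea, assuming a tiny made-up table: rows are sorted on one column, every numeric cell is drawn as a bar so column shapes can be compared at a glance, and one focused row shows its actual values (focus + context). The data, the function names `bar` and `table_lens`, and the text rendering are illustrative assumptions, not Inxight's implementation.

```python
# Hypothetical sample rows: (name, weight, horsepower). Values are made up.
ROWS = [
    ("a", 2100, 70),
    ("b", 3500, 150),
    ("c", 2800, 95),
    ("d", 4200, 190),
]


def bar(value, lo, hi, width=10):
    """Scale value into a bar of 1..width '#' characters."""
    if hi == lo:
        return "#" * width
    frac = (value - lo) / (hi - lo)
    return "#" * max(1, round(frac * width))


def table_lens(rows, sort_col, focus=None, width=10):
    """Return text lines: rows sorted by sort_col, cells drawn as bars.

    The row whose name equals `focus` shows its raw values instead of bars.
    """
    ordered = sorted(rows, key=lambda r: r[sort_col])
    ncols = len(rows[0])
    ranges = [
        (min(r[c] for r in rows), max(r[c] for r in rows))
        for c in range(1, ncols)
    ]
    lines = []
    for r in ordered:
        if r[0] == focus:
            cells = [str(v).ljust(width) for v in r[1:]]
        else:
            cells = [
                bar(v, lo, hi, width).ljust(width)
                for v, (lo, hi) in zip(r[1:], ranges)
            ]
        lines.append(" | ".join(cells).rstrip())
    return lines


if __name__ == "__main__":
    for line in table_lens(ROWS, sort_col=1, focus="c"):
        print(line)
```

Sorting on the weight column makes the horsepower bars grow in step with the weight bars in this toy data, which is the kind of pattern the slide says sorting lets emerge.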

Multidimensional Detective
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997.

Inselberg’s Principles
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997
1. Do not let the picture scare you
2. Understand your objectives
   – Use them to obtain visual cues
3. Carefully scrutinize the picture
4. Test your assumptions, especially the “I am really sure of’s”
5. You can’t be unlucky all the time!

A Detective Story
A. Inselberg, Multidimensional Detective, Proceedings of IEEE Symposium on Information Visualization (InfoVis '97), 1997
• The Dataset:
  – Production data for 473 batches of a VLSI chip
  – 16 process parameters:
    • X1: The yield: % of produced chips that are useful
    • X2: The quality of the produced chips (speed)
    • X3 … X12: 10 types of defects (zero defects shown at top)
    • X13 … X16: 4 physical parameters
• The Objective:
  – Raise the yield (X1) and maintain high quality (X2)

Do Not Let the Picture Scare You!!

Multidimensional Detective
• Each line represents the values for one batch of chips
• This figure shows what happens when only those batches with both high X1 and high X2 are chosen
• Notice the separation in values at X15
• Also, some batches with few X3 defects are not in this high-yield/high-quality group.
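
The mapping from records to lines, and the "choose only high X1 and high X2" selection, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the three-dimensional batches are made up, the helper names (`normalize_columns`, `polyline`, `brush`) are hypothetical, and the 0.7 cutoff for "high" is arbitrary. It is not Inselberg's software.

```python
def normalize_columns(rows):
    """Rescale every dimension to [0, 1] so all axes share one vertical range."""
    ndim = len(rows[0])
    lows = [min(r[d] for r in rows) for d in range(ndim)]
    highs = [max(r[d] for r in rows) for d in range(ndim)]
    return [
        [
            (r[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.5
            for d in range(ndim)
        ]
        for r in rows
    ]


def polyline(norm_row):
    """One record becomes one polyline: vertex d sits on axis d at height norm_row[d]."""
    return [(d, y) for d, y in enumerate(norm_row)]


def brush(norm_rows, constraints):
    """Return indices of rows whose value on each constrained axis lies in [lo, hi]."""
    return [
        i
        for i, r in enumerate(norm_rows)
        if all(lo <= r[d] <= hi for d, (lo, hi) in constraints.items())
    ]


# Hypothetical batches with 3 dimensions (say yield, quality, one physical parameter).
BATCHES = [
    [90.0, 8.0, 1.2],
    [40.0, 3.0, 4.0],
    [85.0, 9.0, 1.0],
    [60.0, 9.5, 3.5],
]

norm = normalize_columns(BATCHES)
# Brush the upper 30% of axis 0 and axis 1 (an arbitrary definition of "high").
selected = brush(norm, {0: (0.7, 1.0), 1: (0.7, 1.0)})
```

In this toy data the brushed batches also share low values on the third axis, which mimics the kind of separation the slide points out at X15.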

Multidimensional Detective
• Now look for batches which have nearly zero defects.
  – For 9 out of 10 defect categories
• Most of these have low yields
• This is surprising because we know from the first diagram that some defects are ok.
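
The "nearly zero defects" query can be read as a count of defect categories with no defects per batch. The sketch below assumes each batch is stored as a list of 10 defect counts; the batch ids, the data, and the function names are hypothetical.

```python
def zero_defect_count(defects):
    """Number of defect categories in which a batch has no defects."""
    return sum(1 for d in defects if d == 0)


def nearly_defect_free(batches, min_zero_categories):
    """Return ids of batches with zero defects in at least min_zero_categories categories."""
    return [
        batch_id
        for batch_id, defects in batches.items()
        if zero_defect_count(defects) >= min_zero_categories
    ]


# Hypothetical batches: each list holds counts for the 10 defect categories (X3..X12).
BATCHES = {
    "b1": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "b2": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "b3": [2, 0, 0, 1, 0, 0, 3, 0, 0, 0],
}

strict = nearly_defect_free(BATCHES, 10)  # zero defects in all 10 categories
relaxed = nearly_defect_free(BATCHES, 9)  # zero defects in 9 out of 10 categories
```

Relaxing the threshold from 10 to 9 categories widens the selection, which is the step the slide describes.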

Go back to the first diagram, looking at the defect categories. Notice that X6 behaves differently from the rest. Allow two defects, where one defect is in X6. This results in the very best batch appearing.

Multidimensional Detective
• Figs. 5 and 6 show that high-yield batches don’t have zero values for defects of type X3 and X6
  – Don’t believe your assumptions …
• Looking now at X15, we see the separation is important
  – Lower values of this property end up in the better-yield batches

Automated Analysis
A. Inselberg, Automated Knowledge Discovery using Parallel Coordinates, INFOVIS '99

Integrating Viz into a UI
• VizCraft: A Problem-Solving Environment for Aircraft Configuration Design, Goel, Baker, Shaffer, Grossman, Mason, Watson, Haftka, IEEE Computing, pp. 56-66, 2001
• Solving an Analysis Problem
  – Optimizing design of aircraft
• Uses of Viz:
  – Brushing and linking
  – Color
  – Multiple views
  – Parallel Coordinates

Use of Color in VizCraft (figure legend: Good, Incorrect, Not Sure)

Doing Analysis in VizCraft
• Colored according to value in first attribute
• Shows that the 2nd and (N-6)th attributes are correlated with the 1st

Doing Analysis in VizCraft
• Colored according to value in fifth attribute
• Shows that the 5th and 7th attributes are correlated

Doing Analysis in VizCraft
• Select only low values of the 1st variable (normalized after the fact)
• The idea is to learn about the acceptable ranges for the values of the other variables
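
The select-then-inspect step can be sketched as: keep only the records whose first variable is low, then report the range each remaining variable spans in that subset. The design points, the 0.25 threshold, and the function name below are assumptions for illustration only; VizCraft itself does this interactively on parallel coordinates.

```python
def ranges_after_selection(rows, select_dim, threshold):
    """For rows whose select_dim value is <= threshold, return (min, max) per dimension."""
    chosen = [r for r in rows if r[select_dim] <= threshold]
    if not chosen:
        return []
    ndim = len(chosen[0])
    return [
        (min(r[d] for r in chosen), max(r[d] for r in chosen))
        for d in range(ndim)
    ]


# Hypothetical design points: (objective, var_a, var_b).
DESIGNS = [
    (0.10, 5.0, 30.0),
    (0.15, 6.0, 28.0),
    (0.80, 9.0, 10.0),
    (0.20, 5.5, 31.0),
]

ranges = ranges_after_selection(DESIGNS, select_dim=0, threshold=0.25)
```

Here the low-objective designs confine `var_a` to 5.0..6.0 and `var_b` to 28.0..31.0, which is the "acceptable range" reading of the selected polylines.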

Doing Analysis in VizCraft
• Color according to one constraint
• Confusing: the constraint colors are used in two ways simultaneously.

Comparing 3 Commercial Systems
Alfred Kobsa, An Empirical Comparison of Three Commercial Information Visualization Systems, INFOVIS'01.

Eureka (Table Lens)

Spotfire (IVEE)

InfoZoom

InfoZoom
Presents data in three different views:
– Overview mode has all attributes sorted in ascending or descending order, independently of each other.
  • Best for data exploration
– Wide view shows the data set in a table format
  • A column represents a data item
  • Like a conventional spreadsheet
– Compressed view packs the data set horizontally to fit the window width.
  • A column represents a data item
  • Zoomed-out view like Table Lens

InfoZoom Overview View
Slide by Alfred Kobsa

InfoZoom Overview View

InfoZoom Overview View (with hierarchy)
Slide by Alfred Kobsa

InfoZoom Wide Table View (columns are meaningful)

InfoZoom Wide Table View

InfoZoom Compressed Table View

Datasets for Study
Multidimensional data: three databases were used
• Anonymized data from a web-based dating service (60 records, 27 variables)
• Technical data of cars sold in 1970-82 (406 records, 10 variables)
• Data on the concentration of heavy metals in Sweden (2298 records, 14 variables)
Slide by Kunal Garach

Sample Questions
• Dating database
  – Do more women than men want their partners to have a higher education?
  – What proportion of the men live in California?
  – Do all people who think the bar is a good place to meet a mate also believe in love at first sight?
• Car database
  – Do heavier cars have more horsepower?
  – Which manufacturer produced the most cars in 1980?
  – Is there a relationship between the displacement and acceleration of a vehicle?

Experiment Design
• The experimenters generated 26 tasks from all three data sets.
• 83 participants.
• Between-subjects design: each participant was given one visualization system and all three data sets.
• The type of visualization system was the between-subjects independent variable.
• 30 minutes were given to solve the tasks for each data set, i.e., 26 tasks in 90 minutes.
Slide by Kunal Garach

Overall Results
• Mean task completion times:
  – InfoZoom users: 80 secs
  – Spotfire users: 107 secs
  – Eureka users: 110 secs
• Answer correctness:
  – InfoZoom users: 68%
  – Spotfire users: 75%
  – Eureka users: 71%
• Not a time-error tradeoff
• Spotfire more accurate on only 6 questions
Slide by Kunal Garach
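
For scale, the reported means can be set against the time allowed. The short Python calculation below uses only numbers from the slides (26 tasks, 90 minutes, and the per-system means and correctness figures); the variable names are arbitrary.

```python
TASKS = 26
TOTAL_MINUTES = 90
MEAN_SECS = {"InfoZoom": 80, "Spotfire": 107, "Eureka": 110}
CORRECT_PCT = {"InfoZoom": 68, "Spotfire": 75, "Eureka": 71}

# Average time available per task if the 90 minutes were spread evenly.
budget_secs = TOTAL_MINUTES * 60 / TASKS

fastest = min(MEAN_SECS, key=MEAN_SECS.get)
slowest = max(MEAN_SECS, key=MEAN_SECS.get)
most_accurate = max(CORRECT_PCT, key=CORRECT_PCT.get)

if __name__ == "__main__":
    print(f"budget per task: {budget_secs:.1f} secs")
    print(f"fastest: {fastest}, slowest: {slowest}, most accurate: {most_accurate}")
```

The even-split budget comes to roughly 208 seconds per task, so all three mean completion times sit well inside it. The slowest system in this table is not the most accurate one.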

Eureka - problems
• Hidden labels: labels are vertically aligned, max 20 dimensions
• Problems with queries involving 3 or more attributes
• Correlation problems: some participants had trouble correctly answering questions that involved correlations between two attributes.
Slide by Kunal Garach

Spotfire - problems
• Cognitive setup costs: it takes participants considerable time to decide on the right representation and to correctly set the coordinates and parameters.
• Biased by scatterplot default: though powerful, many problems cannot be solved (well) with it.
Slide by Kunal Garach

InfoZoom - problems
• Erroneous correlations: people forget or don’t realize that overview mode has all attributes sorted independently of each other
• Narrow height in compressed view
• Participants did not use row expansion or the scatterplot charting function, which shows correlations more accurately
Slide by Kunal Garach
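
Why independently sorted attributes invite erroneous correlations can be shown with a few made-up records. In the Python sketch below the data and names are hypothetical and `pearson` is a plain textbook correlation. Sorting each attribute on its own breaks the link between values that belong to the same object, so two attributes that are negatively related can look as if they rise together.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical records (weight, acceleration): heavier cars accelerate less here.
RECORDS = [(2000, 20.0), (2500, 17.0), (3000, 15.0), (3500, 12.0), (4000, 10.0)]

weights = [w for w, _ in RECORDS]
accels = [a for _, a in RECORDS]

# Values paired by record: the real relationship.
true_r = pearson(weights, accels)

# Each attribute sorted on its own, as in an overview with independent sorting:
# position i no longer refers to the same record in both lists.
overview_r = pearson(sorted(weights), sorted(accels))
```

`true_r` is strongly negative while `overview_r` is strongly positive, so reading across independently sorted rows gives the wrong sign.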

Geographic Questions
• Spotfire should have done better on these
  – Which part of the country has the most copper?
  – Is there a relationship between the concentration of vanadium and that of zinc?
  – Is there a low-level chrome area that is high in vanadium?
• Spotfire was better only for the last question (out of 6 geographic ones)

Discussion
• Many studies of this kind use relatively simple tasks that mirror the strengths of the system
  – Find the one object with the maximum value for a property
  – Count how many of certain attributes there are
• This study looked at more complex, realistic, and varied questions.

Discussion
Success of a visualization system depends on many factors:
• Properties supplied
  – Spotfire doesn’t visualize as many dimensions simultaneously
• Operations
  – Zooming is easy in InfoZoom; allows for drill-down as well
  – Zooming in Eureka causes context to be lost
• Column view in Eureka makes labels hard to see

Assignment 3
• Due March 3
• Work in pairs (encouraged, not required)
• Exploratory Data Analysis

Design Exercise: How to Visualize EASPD
• Pure serial periodic data
  – A single continuous dimension in which each period has equal duration
  – Example: days of the week
• Event-anchored serial periodic data
  – Data has periods with different durations

Design Exercise: How to Visualize EASPD
• Event-anchored serial periodic data
  – Data has periods with different durations
• Examples:
  – Multi-day races (Tour de France)
    • May want to discern:
      – Is a racer improving starts and finishes as the race progresses?
      – Does a racer peak more rapidly after long stages than short ones?
  – Project-based time tracking
    • How is a worker’s efficiency affected by the pattern and number of projects?

Design Exercise: How to Visualize EASPD
• Event-anchored serial periodic data
  – Data has periods with different durations
• Examples:
  – Eating habits of a foraging animal
    • Eats different foods, different amounts, in different seasons
    • Start/end of season varies based on when the rains begin
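
One possible starting point for the exercise, offered as a sketch and not as the intended answer or any published method: represent each observation by the period it falls in plus the fraction of that period elapsed, so that periods of different durations can be lined up. The boundary values and the function name `locate` are made up for illustration.

```python
from bisect import bisect_right


def locate(t, boundaries):
    """Map time t to (period index, fraction of that period elapsed).

    `boundaries` is a sorted list of event times; period i spans
    boundaries[i] .. boundaries[i + 1], and periods may differ in length.
    """
    if not boundaries[0] <= t < boundaries[-1]:
        raise ValueError("t lies outside the recorded periods")
    i = bisect_right(boundaries, t) - 1
    start, end = boundaries[i], boundaries[i + 1]
    return i, (t - start) / (end - start)


# Hypothetical season boundaries in days: seasons last 100, 60, and 140 days.
SEASON_STARTS = [0, 100, 160, 300]

a = locate(50, SEASON_STARTS)   # halfway through season 0
b = locate(130, SEASON_STARTS)  # halfway through season 1
```

Day 50 and day 130 both map to fraction 0.5 of their seasons even though the seasons differ in length. That alignment is what a display anchored on events, not on fixed calendar periods, would need.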

Next Time:
• Problem Analysis Example (Carlis & Konstan)
• Focus + Context
• Zooming
  – Standard and Semantic
• Distortion-based Views