Prefetching for Visual Data Exploration
Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward
Computer Science Department, Worcester Polytechnic Institute
Support: NSF grants IIS-9732897, EIA-9729878, and IIS-0119276.

Overview
• Why visually explore data?
  – Fact: Increasing data set sizes
  – Need: Efficient techniques for exploring the data
  – Possible solution: Interactive data visualization. Humans can detect certain patterns better and faster than data mining tools.
• Why cache and prefetch?
  – Interactive data visualization tools do not scale well
  – Interactive real-time response is needed
  – Caching and prefetching improve response time
• Goal: Propose and evaluate prefetching for visualization tools

Example Visual Exploration Tool: XmdvTool
[Figure: flat display, data hierarchy, and hierarchical display]

Example Visual Exploration Tool: XmdvTool
[Figure: drill-down with structure-based brush 1 and parallel coordinates linked with brush 1; roll-up with structure-based brush 2 and parallel coordinates linked with brush 2]

Characteristics of a Visualization Environment
Characteristics that can be exploited for caching and prefetching:
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
[Figure: navigation moves: up/down and left/right]

Overview of Semantic Caching
• Purpose
  – reduce response time and network traffic
• Issues
  – a visual query cannot be translated directly into object IDs → a high-level cache specification is needed to avoid complete scans
• Semantic caching: queries are cached rather than objects
  – minimize the cost of cache lookup
  – dynamically adapt cached queries to patterns of queries
[Figure: GUI and cache on the client machine; DB on the server machine]
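
The idea of caching queries rather than objects can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from XmdvTool: queries are modelled as inclusive key ranges, `SemanticCache` and `fake_db` are hypothetical names, and a new query is answered from the cache only when it is fully contained in one cached query.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical query descriptor: an inclusive [lo, hi] range over an ordered key.
Range = Tuple[int, int]


class SemanticCache:
    """Caches query descriptors and their results instead of individual objects."""

    def __init__(self, fetch: Callable[[Range], List[int]]) -> None:
        self._fetch = fetch                       # stand-in for the database call
        self._entries: Dict[Range, List[int]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, rng: Range) -> List[int]:
        lo, hi = rng
        # Lookup compares query descriptors, not object IDs.
        for (clo, chi), rows in self._entries.items():
            if clo <= lo and hi <= chi:           # new query contained in a cached one
                self.hits += 1
                return [r for r in rows if lo <= r <= hi]
        # Otherwise go to the database and remember the query with its result.
        self.misses += 1
        rows = self._fetch(rng)
        self._entries[rng] = rows
        return rows


def fake_db(rng: Range) -> List[int]:
    """Hypothetical server: keys 0, 5, 10, ..., 95."""
    return [k for k in range(0, 100, 5) if rng[0] <= k <= rng[1]]
```

With this sketch, `SemanticCache(fake_db).query((10, 50))` goes to the server, and a later `query((20, 30))` on the same cache is answered locally. A fuller design would also handle partial overlap by fetching only the missing remainder.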

In XmdvTool, caching reduced response time by 85%.
Prefetching can further improve response time.

Prefetching
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
User’s next request can be predicted with high accuracy
Time to prefetch
[Figure: timeline with fetching, idle time, and new user query; cache, prefetching, DB]
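
How idle time can be used to prefetch may be easier to see in a small simulation. This is a sketch under stated assumptions, not the XmdvTool prefetcher: the navigation window is a grid position, `load_region` is a hypothetical stand-in for a database fetch, idle time is modelled as one prefetch per request, and the prediction simply repeats the last move.

```python
from typing import Dict, List, Optional, Tuple

Pos = Tuple[int, int]  # hypothetical grid position of the navigation window


def load_region(pos: Pos) -> str:
    """Hypothetical stand-in for fetching one region from the database."""
    return f"data@{pos}"


def replay(trace: List[Pos], prefetch: bool) -> int:
    """Replay a user trace and return how many requests were served from cache."""
    cache: Dict[Pos, str] = {}
    hits = 0
    prev: Optional[Pos] = None
    for pos in trace:
        if pos in cache:
            hits += 1                      # answered without a database round trip
        else:
            cache[pos] = load_region(pos)  # user waits for this fetch
        if prefetch and prev is not None:
            # Idle time while the user views the display:
            # assume the next move repeats the last one and fetch that region now.
            step = (pos[0] - prev[0], pos[1] - prev[1])
            nxt = (pos[0] + step[0], pos[1] + step[1])
            if nxt not in cache:
                cache[nxt] = load_region(nxt)
        prev = pos
    return hits
```

On the trace `[(0, 0), (1, 0), (2, 0), (3, 0), (3, 1)]`, the sketch serves no requests from cache without prefetching and two with it; the final turn is a misprediction.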

Prefetching Strategies
• Strategies shown: Random, Direction, Focus, Mean, Exponential Weight Average (EWA)
• Group labels: Localized Speculative Strategies, Data Set Driven Strategy, Vector Strategies
[Figure: strategy diagrams. Annotations: probability 1/4 per direction; positions (m-1), m, (m+1); m(n-2), m(n-1), m(n), m(n+1); current navigation window; hot regions]
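
The slide gives only the strategy names and diagram labels, so the following is one plausible reading, not the authors' definitions. Assumptions: a move is a hypothetical `(dx, dy)` step of the navigation window; Random picks one of four directions with probability 1/4; Direction repeats the last move; Mean averages all past moves; EWA weights recent moves more heavily with a smoothing factor `alpha` chosen here for illustration. The Focus strategy is omitted because it depends on hot-region information that the slides do not specify.

```python
import random
from typing import List, Tuple

Move = Tuple[float, float]  # hypothetical (dx, dy) step of the navigation window

DIRECTIONS: List[Move] = [(0, 1), (0, -1), (-1, 0), (1, 0)]  # up, down, left, right


def predict_random(rng: random.Random) -> Move:
    """Random: each of the four directions with probability 1/4 (assumed)."""
    return rng.choice(DIRECTIONS)


def predict_direction(history: List[Move]) -> Move:
    """Direction: assume the user repeats the most recent move."""
    return history[-1] if history else (0, 0)


def predict_mean(history: List[Move]) -> Move:
    """Mean: average of all past moves."""
    if not history:
        return (0.0, 0.0)
    n = len(history)
    return (sum(dx for dx, _ in history) / n, sum(dy for _, dy in history) / n)


def predict_ewa(history: List[Move], alpha: float = 0.5) -> Move:
    """EWA: exponentially weighted average, newest move weighted by alpha."""
    if not history:
        return (0.0, 0.0)
    ex, ey = history[0]
    for dx, dy in history[1:]:
        ex = alpha * dx + (1 - alpha) * ex
        ey = alpha * dy + (1 - alpha) * ey
    return (ex, ey)
```

For the history `[(1, 0), (1, 0), (0, 1)]`, Direction predicts `(0, 1)`, while Mean and EWA still lean toward the earlier rightward moves.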

XmdvTool Implementation
Used:
– C/C++
– TCL/TK
– OpenGL
– Oracle 8i
– Pro*C
[Figure: architecture diagram. Off-line process: flat data, loader, MinMax labeling, schema info, hierarchical data, DB. On-line process: user, GUI, exploration variables, queries, translator, rewriter, estimator, cache, buffer, prefetcher. Prefetcher library: Random, Direction, Focus, Mean, EWA]

Evaluation of Prefetching Strategies
• Setup:
  – Testbed: XmdvTool, a freeware system for n-dimensional exploration
  – User traces:
    • Synthetic user traces with varying number of hot regions, % directionality, and average delay between user requests
    • Real user traces collected in a user study
• Study the effect of different navigation patterns:
  – number of hot regions
  – erratic vs. directional
  – delay between user requests

Focus strategy is best as the number of hot regions increases.
Prefetching improves response time.

Random Strategy – best for erratic traces.
Direction Strategy – best for directional traces.

Prefetcher performance improves and then plateaus as the delay between user operations increases.
Prefetcher performance improved by up to 28%.
Recall: Caching improved response time by 85% over no caching.

What Can We Conclude?
• Focus: incurs hot region calculation overhead
• Mean and EWA: offer more than needed
• Direction: simple, no prior knowledge required
NOTE:
• Our experiments on real user traces show that real users are highly directional.
If only one strategy can be chosen, select Directional Prefetching.

Related Work
• Integrated visualization-database systems: Tioga, IDEA, DEVise [have not used caching and prefetching]
• Prefetching research: mostly on (1) web prefetching, (2) prefetching for memory caches by the OS, (3) I/O prefetching [no prefetching research for visualization apps]

Contributions
• Identified key characteristics of visualization tools exploitable for optimizing data access performance
• Developed, implemented, and tested prefetching strategies in XmdvTool
• Shown that caching coupled with prefetching at the client side improves data access performance
  – Caching reduces response time by 85% over no caching.
  – Prefetching further improves response time by up to 28% over no prefetching.
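
As a rough worked example of how the two figures might combine, assume the 28% is measured against the caching-only response time and that the two reductions compose multiplicatively. Both are assumptions; the slides do not report a combined figure, and 28% is the best case observed. The helper name `remaining_fraction` is hypothetical.

```python
def remaining_fraction(*reductions: float) -> float:
    """Fraction of the original response time left after successive reductions."""
    frac = 1.0
    for r in reductions:
        frac *= 1.0 - r   # each reduction applies to what the previous one left
    return frac


# Caching leaves 15% of the baseline; prefetching (best case) leaves 72% of that.
left = remaining_fraction(0.85, 0.28)   # 0.15 * 0.72
combined_reduction = 1.0 - left         # about 0.892 under these assumptions
```

Under these assumptions the combined best case is roughly an 89% reduction relative to no caching, which shows that caching accounts for most of the gain.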

Future Work
No single prefetcher works best for all types of user navigation patterns.
Adaptive prefetching: preliminary results show that this further improves response time and reduces prediction errors, at minimal overhead cost.

Thank You
XmdvTool Homepage: http://davis.wpi.edu/~xmdv@cs.wpi.edu
Code is free for research and education.
Contact author: rundenst@cs.wpi.edu