Prefetching for Visual Data Exploration Punit R Doshi



















Prefetching for Visual Data Exploration
Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward
Computer Science Department, Worcester Polytechnic Institute
Support: NSF grants IIS-9732897, EIA-9729878, and IIS-0119276.
Overview
• Why visually explore data?
  – Fact: Increasing data set sizes
  – Need: Efficient techniques for exploring the data
  – Possible solution: Interactive data visualization, since humans can detect certain patterns better and faster than data mining tools
• Why cache and prefetch?
  – Interactive data visualization tools do not scale well
  – Interactive real-time response needed
  – Caching and prefetching improve response time
• Goal: Propose and evaluate prefetching for visualization tools
Example Visual Exploration Tool: XmdvTool
[Figure panels: Flat Display; Data Hierarchy; Hierarchical Display]
Example Visual Exploration Tool: XmdvTool
[Figure panels: Drill Down: Structure-Based Brush 1, with Parallel Coordinates (linked with Brush 1); Roll-Up: Structure-Based Brush 2, with Parallel Coordinates (linked with Brush 2)]
Characteristics of a Visualization Environment
Characteristics that can be exploited for caching and prefetching:
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
[Figure labels: Move up/down; Move left/right]
Overview of Semantic Caching
• Purpose
  – Reduce response time and network traffic
• Issues
  – A visual query cannot be translated directly into object IDs → a high-level cache specification is needed to avoid complete scans
• Semantic caching: queries are cached rather than objects
  – Minimize cost of cache lookup
  – Dynamically adapt cached queries to patterns of queries
[Figure labels: GUI and cache on the client machine; DB on the server machine]
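The idea of caching queries rather than objects can be illustrated with a minimal sketch. It assumes, purely for illustration, that a visual query reduces to a one-dimensional half-open range [lo, hi); the class name `SemanticCache` and its methods are hypothetical and are not taken from XmdvTool. A lookup splits a request into the part already covered by cached queries and the part that still has to be sent to the database.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticCache:
    """Caches answered range queries (intervals), not individual objects.

    Hypothetical illustration: a query is a half-open interval [lo, hi).
    """
    # Sorted, disjoint list of cached intervals.
    regions: list = field(default_factory=list)

    def lookup(self, lo, hi):
        """Split [lo, hi) into cached parts and missing parts."""
        cached, missing = [], []
        cursor = lo  # left edge of the part not yet resolved
        for c_lo, c_hi in self.regions:
            if c_hi <= cursor or c_lo >= hi:
                continue  # no overlap with the unresolved part
            if c_lo > cursor:
                missing.append((cursor, c_lo))  # gap before this region
            cached.append((max(c_lo, cursor), min(c_hi, hi)))
            cursor = min(c_hi, hi)
        if cursor < hi:
            missing.append((cursor, hi))  # tail not covered by any region
        return cached, missing

    def insert(self, lo, hi):
        """Record [lo, hi) as cached, merging overlapping or adjacent regions."""
        merged = []
        for c_lo, c_hi in sorted(self.regions + [(lo, hi)]):
            if merged and c_lo <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], c_hi))
            else:
                merged.append((c_lo, c_hi))
        self.regions = merged


cache = SemanticCache()
cache.insert(0, 3)
cache.insert(5, 6)
cached, missing = cache.lookup(2, 8)
# Only the ranges in `missing` would need to go to the database server.
```

Because the cache is described by a short list of ranges, a lookup compares the request against those ranges instead of scanning cached objects, which is one way to read the "avoid complete scans" point on the slide.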
In XmdvTool, caching reduced response time by 85%.
Prefetching can further improve response time.
Prefetching
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
The user's next request can be predicted with high accuracy, and the idle time leaves time to prefetch.
[Figure labels: Fetching; Idle time; New user query; Cache; Prefetching; DB]
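How idle time bounds the amount of prefetching can be sketched as follows. This is a simplified model under stated assumptions, not the XmdvTool prefetcher: every fetch is assumed to cost the same fixed time, the cache is a plain Python set, and the function name `prefetch_during_idle` and its parameters are hypothetical.

```python
def prefetch_during_idle(predictions, cache, idle_time, fetch_cost):
    """Fetch predicted regions into the cache until the idle budget runs out.

    predictions: candidate regions, most likely first (hypothetical input).
    cache:       set of regions already held on the client.
    idle_time:   expected time until the next user query (assumed known).
    fetch_cost:  time to fetch one region (assumed constant).
    Returns the regions that were actually prefetched.
    """
    fetched = []
    remaining = idle_time
    for region in predictions:
        if region in cache:
            continue  # already cached, nothing to fetch
        if remaining < fetch_cost:
            break  # the next user query would arrive before this fetch ends
        cache.add(region)
        fetched.append(region)
        remaining -= fetch_cost
    return fetched


cache = {5}
fetched = prefetch_during_idle([5, 6, 7, 8], cache, idle_time=2.5, fetch_cost=1.0)
```

In this model a longer idle time lets more predicted regions be fetched, but once every prediction is cached additional idle time brings no further benefit.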
Prefetching Strategies
• Localized speculative strategies
  – Random Strategy
  – Direction Strategy
• Data set driven strategy
  – Focus Strategy (hot regions)
• Vector strategies
  – Mean Strategy
  – Exponential Weight Average (EWA) Strategy
[Figure labels: probabilities 1/4, 1/4, 1/4; positions (m-1), m, (m+1); movements m(n-2), m(n-1), m(n), m(n+1); Current Navigation Window; Hot Regions]
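The slide only names the strategies, so the following is a hedged sketch of how four of them could be expressed as predictors of the next move of the navigation window. The representation of a move as a 2-D vector, the four unit moves, the reading of the "1/4" labels as equal direction probabilities, and the smoothing factor `alpha` are all assumptions made for illustration. The Focus strategy is omitted because it depends on how hot regions are computed, which the slides do not specify.

```python
import random

# Four unit moves of the navigation window: left, right, up, down.
# Assumption: this mirrors the "move up/down, move left/right" navigation.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def random_strategy(history, rng=random):
    """Pick any of the four directions with equal probability (1/4 each)."""
    return rng.choice(MOVES)


def direction_strategy(history):
    """Assume the user repeats the most recent move."""
    return history[-1]


def mean_strategy(history):
    """Average all past move vectors with equal weight."""
    n = len(history)
    return (sum(dx for dx, _ in history) / n,
            sum(dy for _, dy in history) / n)


def ewa_strategy(history, alpha=0.5):
    """Exponentially weighted average: recent moves count more.

    alpha is an assumed smoothing factor, not a value from the slides.
    """
    ex, ey = history[0]
    for dx, dy in history[1:]:
        ex = alpha * dx + (1 - alpha) * ex
        ey = alpha * dy + (1 - alpha) * ey
    return (ex, ey)


def predict_next(position, move):
    """Apply a predicted move vector to the current window position."""
    return (position[0] + move[0], position[1] + move[1])


# Three moves to the right followed by one move down.
history = [(1, 0), (1, 0), (1, 0), (0, 1)]
```

On this history the Direction predictor follows the latest move, the Mean predictor still points mostly to the right, and the EWA predictor sits between the two because it discounts older moves.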
XmdvTool Implementation
Used: C/C++, TCL/TK, OpenGL, Oracle 8i, Pro*C
[Architecture diagram with an OFF-LINE PROCESS and an ON-LINE PROCESS. Component labels: Flat Data, Loader, MinMax Labeling, Hierarchical Data, Schema Info, DB, User, GUI, Exploration Variables, Translator, Queries, Rewriter, Estimator, Cache, Buffer, Prefetcher. Prefetcher library: Random, Direction, Focus, Mean, EWA]
Evaluation of Prefetching Strategies
• Setup:
  – Testbed: XmdvTool, a freeware system for n-dimensional exploration
  – User traces:
    • Synthetic user traces with varying number of hot regions, % directionality, and average delay between user requests
    • Real user traces collected by a user study
• Study effect of different navigation patterns:
  – Number of hot regions
  – Erratic vs. directional
  – Delay between user requests
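The slides do not say how the synthetic traces were generated, so the following is only one plausible sketch of a generator with two of the listed knobs, % directionality and average delay. The function name `synthetic_trace`, the rule "repeat the previous move with probability equal to the directionality", and the exponentially distributed delays are assumptions; hot regions are left out.

```python
import random

# Four unit moves of the navigation window (an illustrative assumption).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def synthetic_trace(n_requests, directionality, mean_delay, seed=0):
    """Generate (move, delay) pairs for a simulated user.

    directionality: probability in [0, 1] of repeating the previous move.
    mean_delay:     average delay between requests, must be positive.
    """
    rng = random.Random(seed)  # seeded so that traces are reproducible
    move = rng.choice(MOVES)
    trace = []
    for _ in range(n_requests):
        if rng.random() >= directionality:
            move = rng.choice(MOVES)  # erratic step: pick any direction
        delay = rng.expovariate(1.0 / mean_delay)  # assumed delay model
        trace.append((move, delay))
    return trace


directional = synthetic_trace(50, directionality=1.0, mean_delay=2.0)
erratic = synthetic_trace(50, directionality=0.0, mean_delay=2.0)
```

A directionality of 1.0 yields a trace that never changes direction, while 0.0 redraws the direction at every request, which corresponds to the two extremes of the erratic vs. directional comparison.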
Focus strategy best as number of hot regions increases.
Prefetching improves response time.
Random Strategy – best for erratic traces.
Direction Strategy – best for directional traces.
Prefetcher performance improves and then plateaus as the delay between user operations increases.
Prefetching improved response time by up to 28%.
Recall: Caching improved response time by 85% over no caching.
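The two figures can be combined in a short worked calculation. This assumes that the 28% is measured relative to the response time with caching alone and that both best-case figures apply to the same workload, neither of which the slides state explicitly. Here $T_0$ denotes the response time without caching (a symbol introduced for this illustration).

```latex
\begin{align*}
T_{\text{cache}}    &= (1 - 0.85)\,T_0 = 0.15\,T_0 \\
T_{\text{prefetch}} &= (1 - 0.28)\,T_{\text{cache}} = 0.72 \times 0.15\,T_0 = 0.108\,T_0
\end{align*}
```

Under these assumptions, caching plus prefetching would cut response time by roughly 89% relative to no caching, at best.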
What Can We Conclude?
• Focus: hot region calculation overhead
• Mean and EWA: offer more than needed
• Direction: simple, no prior knowledge required
NOTE:
• Our experiments on real user traces show that real users are highly directional
If only one strategy can be chosen, select Directional Prefetching.
Related Work
• Integrated visualization-database systems: Tioga, IDEA, DEVise [have not used caching and prefetching]
• Prefetching research: mostly on (1) web prefetching, (2) prefetching for memory caches by the OS, (3) I/O prefetching [no prefetching research for visualization applications]
Contributions
• Identified key characteristics of visualization tools exploitable for optimizing data access performance
• Developed, implemented and tested prefetching strategies in XmdvTool
• Shown that caching coupled with prefetching at the client side improves data access performance
  – Caching reduces response time by 85% over no caching
  – Prefetching further improves response time by up to 28% over no prefetching
Future Work
• No single prefetcher works best for all types of user navigation patterns
• Adaptive prefetching (preliminary results show that this further improves response time and reduces prediction errors, at a minimal overhead cost)
Thank You
XmdvTool Homepage: http://davis.wpi.edu/~xmdv@cs.wpi.edu
Code is free for research and education.
Contact author: rundenst@cs.wpi.edu