VizDB: A Tool to Support Exploration of Large Databases
- Uses the human visual system
- Analyzes mid-size to large data
Data Mining Techniques
- Implements several data mining techniques
  - Pixel-oriented techniques (Spiral, Axes, and Grouping)
  - Parallel Coordinates
  - Stick Figures
- Exploration of up to a million data values
Concept
- The basic idea for visualizing the data is to map the distances (how far each data value is from satisfying the query) to colors and represent each data value by one or more colored pixels.
- Interactivity is the key!
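A minimal sketch of the distance-to-color idea, assuming a linear blend along the Yellow-Green-Blue-Red-Black order given on the Basic Technique slide. The RGB values, the function name `distance_to_color`, and the linear interpolation are illustrative assumptions, not the tool's actual color scale.

```python
# Hypothetical palette: the slides give only the color order, not RGB values.
PALETTE = [
    (255, 255, 0),  # yellow: distance 0, most relevant
    (0, 255, 0),    # green
    (0, 0, 255),    # blue
    (255, 0, 0),    # red
    (0, 0, 0),      # black: largest distance, least relevant
]


def distance_to_color(distance, max_distance):
    """Blend linearly along PALETTE: 0 maps to yellow, max_distance to black."""
    if max_distance <= 0:
        return PALETTE[0]
    # Normalize to [0, 1] and clamp out-of-range distances.
    t = min(max(distance / max_distance, 0.0), 1.0)
    pos = t * (len(PALETTE) - 1)
    i = min(int(pos), len(PALETTE) - 2)   # index of the lower palette stop
    frac = pos - i                        # how far toward the next stop
    lo, hi = PALETTE[i], PALETTE[i + 1]
    return tuple(round(a + (b - a) * frac) for a, b in zip(lo, hi))
```

Each data value would then be drawn as one pixel (or a small block of pixels) in the returned color.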
Requirements
- Feedback is required when a query returns unexpected results
- Interactivity allows immediate feedback from a modified query
- A configurable tool that allows various data visualization techniques
- Uses the human visual system for pattern recognition
Basic Technique
- Sort the query data w.r.t. relevance and map relevance factors to colors
- Highest relevance factor in the center
- Yellow-Green-Blue-Red-Black in decreasing order of relevance
- Separate window for each selection predicate in the query
- Multiple windows make multi-dimensional visualization possible
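The sort-then-place step can be sketched as follows. This is an illustration under assumptions: the helper names `spiral_positions` and `layout_by_relevance`, the counter-clockwise turning direction, and the use of a plain distance value as the sort key are all hypothetical, and the slides do not specify how the tool walks the spiral.

```python
def spiral_positions(n):
    """Yield n (x, y) grid offsets spiralling outward from the center (0, 0)."""
    x = y = 0
    dx, dy = 1, 0        # start by moving right
    run_length = 1       # length of the current straight run
    produced = 0
    while produced < n:
        for _ in range(2):               # two runs share each run length
            for _ in range(run_length):
                if produced >= n:
                    return
                yield (x, y)
                produced += 1
                x, y = x + dx, y + dy
            dx, dy = -dy, dx             # turn 90 degrees
        run_length += 1


def layout_by_relevance(items):
    """items: list of (label, distance); smaller distance = more relevant.

    Returns {(x, y): label} with the most relevant item at the center.
    """
    ranked = sorted(items, key=lambda item: item[1])
    cells = spiral_positions(len(ranked))
    return {cell: label for cell, (label, _) in zip(cells, ranked)}
```

Combining this with a relevance-to-color mapping gives one window; repeating it per selection predicate, with the same item order, gives the multi-window view.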
Mapping 2-D to the Axes
- Visualization of inherently 2D or 3D data is not dealt with in VizDB
- Where no inherent 2D semantics of the data exist, VizDB is a valuable tool
- Use of two axes for two dimensions
- Positive as well as negative values are displayed
- Some space may be wasted. (Why?)
Grouping
- Each area is arranged in a rectangular spiral shape according to relevance factors
- Coloring is similar to the previous method
- Grouping allows data similar in one dimension to be grouped together
- Data in multiple dimensions are represented as clusters of pixels
- Good for larger dimensionality
Interactive Data Exploration
- Dynamic query modification techniques
- Feedback on the results
  - Change in color means change in values that are "relevant"
  - Change in structure means overall distribution of data has changed
- Sliders for discrete as well as continuous values
- Initial query is SQL or "Gradi"
Calibrations
- Calculation of the "relevance" factor can be calibrated by the user
- Starting and ending values for various numeric data
  - E.g., blood sample counts
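One way such a calibration could work, sketched under assumptions: the slides do not give the formula, so the function `relevance`, its parameters, and the choice to normalize by the user-set range are hypothetical.

```python
def relevance(value, target, start, end):
    """Return a relevance factor in [0, 1]; 1.0 means value equals target.

    start and end are the user-chosen calibration bounds (assumed here to
    set the scale over which distance is normalized).
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    distance = abs(value - target)
    normalized = min(distance / (end - start), 1.0)   # clamp far-away values
    return 1.0 - normalized
```

Under this sketch, narrowing the range from 0..8 to 4..6 drops the relevance of the value 4 against a target of 5 from 0.875 to 0.5, so the same data would be drawn in less "relevant" colors.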
What about complex queries?
- Multiple layers of windows for complex queries using nested AND and OR operators
- Data that satisfies ALL joins is yellow. The rest is colored according to the number of criteria met
- Works well with relational databases
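Coloring by the number of criteria met can be sketched like this. The predicate functions, the five named colors, and the rounding used to pick a color are illustrative assumptions; the slides state only that rows meeting every criterion are yellow and the rest are graded by how many criteria they meet.

```python
# Least to most relevant, following the deck's color order.
COLORS = ["black", "red", "blue", "green", "yellow"]


def criteria_met(row, predicates):
    """Count how many predicate functions return True for this row."""
    return sum(1 for predicate in predicates if predicate(row))


def color_for(row, predicates):
    """Yellow when every criterion holds; darker as fewer criteria hold."""
    if not predicates:
        return COLORS[-1]
    fraction = criteria_met(row, predicates) / len(predicates)
    return COLORS[round(fraction * (len(COLORS) - 1))]


# Hypothetical query: age > 30 AND city = 'Blacksburg'
predicates = [
    lambda row: row["age"] > 30,
    lambda row: row["city"] == "Blacksburg",
]
```

With the two hypothetical predicates above, a row meeting both is yellow, a row meeting one is blue, and a row meeting neither is black.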
Implementations
- C++ with Motif using X Windows on HP 7xx
- Currently being ported to Linux (I couldn't get this working!)
Adding new techniques
- More InfoViz techniques can be integrated with the system
- Latest version supports Parallel Coordinates, Stick Figures, and Pan and Zoom techniques (New stuff!!)
Applications
- Molecular biology: finding possible docking regions by identifying sets of surface points with distinct characteristics
- Databases of geographical data
- Environmental data
- NASA Earth observation data
Future Work
- Automatic generation of queries that correspond to data in specific regions (select some data, and the SQL query that matches that data will be generated). Cool!!
- Time series visualization
Thank You
- The presentation slides are available at http://filebox.vt.edu/users/adatey/research/VizDB.ppt
- A small color picture that shows the different techniques: http://filebox.vt.edu/users/adatey/research/VisDBHandout.eps