The BigDAWG Polystore System
Database Challenges
• Enterprises encounter many databases and data models.
• Specialized systems provide performance, but add complexity.
• BigDAWG goals:
  - Provide as much location (database) transparency as possible
  - Support a single query notation and interface with limited extensions
BigDAWG Design
• Many "Sizes": Support for heterogeneous storage and database engines
• Low Latency: Support for real-time streaming databases for the Internet of Things
• Location Transparency: Allow users to operate on data without explicit knowledge of location
• Semantic Completeness: Support the widest number of database operations with efficient connectors
Semantic Islands as the Trade-off
• Islands are the trade-off between functionality and location transparency.
• User specifies the Island:
  - RELATIONAL(select avg(temp) from device)
  - ARRAY(multiply(A, B))
• Islands have:
  - A Data Model
  - A Language or Set of Operators
  - A Set of Candidate Database Engines
* Islands do Intersection of engines
* BigDAWG does Union of Islands
* Islands are logical
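The island scoping and the intersection/union rule above can be sketched in a few lines of Python. This is a minimal illustration, not BigDAWG's implementation: the engine names (`engine_a` to `engine_d`), the operator sets, and the helper functions are all hypothetical. It reads "Islands do Intersection of engines" as "an island offers the operators common to all of its candidate engines", which is an assumption.

```python
import re

# Hypothetical island definitions: for each island, the operators supported
# by each candidate engine. Engine and operator names are illustrative only.
ISLANDS = {
    "RELATIONAL": {
        "engine_a": {"select", "avg", "join"},
        "engine_b": {"select", "avg"},
    },
    "ARRAY": {
        "engine_c": {"multiply", "transpose"},
        "engine_d": {"multiply"},
    },
}

# Matches ISLAND(body); the greedy body keeps nested parentheses intact.
SCOPE = re.compile(r"^\s*([A-Z]+)\((.*)\)\s*$", re.DOTALL)


def parse_scoped_query(text):
    """Split 'ISLAND(body)' into (island, body)."""
    match = SCOPE.match(text)
    if match is None:
        raise ValueError(f"not an island-scoped query: {text!r}")
    island, body = match.group(1), match.group(2)
    if island not in ISLANDS:
        raise ValueError(f"unknown island: {island}")
    return island, body


def island_operators(island):
    """Operators an island offers: the intersection across its engines."""
    return set.intersection(*ISLANDS[island].values())


def system_operators():
    """Operators the whole system offers: the union across all islands."""
    return set().union(*(island_operators(name) for name in ISLANDS))
```

For example, `parse_scoped_query("ARRAY(multiply(A, B))")` separates the island name from the query body, so the body can be handed to whichever engine in that island is chosen to run it.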
Hackathon to Prototype BigDAWG
• BigDAWG Goal: Harness the power of advanced database engines through a unified interface
• BigDAWG is the vision of the ISTC for Big Data to develop future technologies and interfaces that support knowledge extraction from big data
• Recent Hackathon at MIT Beaver Works produced a BigDAWG prototype
Using BigDAWG Polystore for Medical Big Data
• Data Explorer
• Tell Me Something Interesting
• Text Analytics
• Heavy Analytics
• Streaming Analytics
BigDAWG Prototype
[Architecture diagram, by layer: Client, BigDAWG API Server, Islands, Engines]
• Clients:
  - Text Analytics: D4M
  - Explorer: ScalaR
  - Tell Me Something: SeeDB, Searchlight
  - Heavy Analytics: Myria
  - Streaming: S-Store, S-PI
  - Watch: Wearables, S-PI
• Island Types:
  - D4M Associative Arrays
  - Data Model Islands (e.g. ARRAY, TEXT)
  - Myria (Iterative)
  - Streams
• Engines:
  - PostgreSQL: Tabular Clinical Data
  - SciDB: Historical Waveform Data
  - Accumulo: Text Clinical Data (e.g. chart notes)
  - MyriaX: Intermediate results
  - S-Store: Streaming Waveform Data
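To make the engine layer concrete, here is a minimal sketch of how a catalog could hide data location from the user. The engine-to-data pairing follows the prototype slide for the four engines whose data is stated clearly. The object names in `CATALOG` and the `locate` helper are hypothetical, invented for illustration, and are not part of BigDAWG.

```python
# Engine -> kind of data it holds, as listed for the prototype.
ENGINE_DATA = {
    "PostgreSQL": "tabular clinical data",
    "SciDB": "historical waveform data",
    "Accumulo": "text clinical data (chart notes)",
    "S-Store": "streaming waveform data",
}

# Hypothetical catalog: these object names are invented for illustration.
CATALOG = {
    "patients": "PostgreSQL",
    "waveform_history": "SciDB",
    "chart_notes": "Accumulo",
    "waveform_stream": "S-Store",
}


def locate(object_name):
    """Return (engine, data kind) for an object, so callers never name an engine."""
    engine = CATALOG.get(object_name)
    if engine is None:
        raise KeyError(f"object not in catalog: {object_name}")
    return engine, ENGINE_DATA[engine]
```

A caller asks only for `locate("chart_notes")` and the lookup supplies the engine, which is the sense of location transparency used in the design goals.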