EECS 262 a Advanced Topics in Computer Systems

  • Slides: 43
Download presentation
EECS 262 a Advanced Topics in Computer Systems Lecture 17 C-Store / DB Cracking

EECS 262 a Advanced Topics in Computer Systems Lecture 17 C-Store / DB Cracking March 28 th, 2016 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http: //www. eecs. berkeley. edu/~kubitron/cs 262

Today’s Papers • C-Store: A Column-oriented DBMS* Mike Stonebraker, Daniel J. Abadi, Adam Batkin,

Today’s Papers • C-Store: A Column-oriented DBMS* Mike Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O’Neil, Pat O’Neil, Alex Rasin, Nga Tran, Stan Zdonik. Appears in Proceedings of the ACM Conference on Very Large Databases(VLDB), 2005 • Database Cracking+ Stratos Idreos, Martin L. Kersten, and Stefan Manegold. Appears in the 3 rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7 -10, 2007 • Wednesday: Comparison of PDBMS, CS, MR • Thoughts? *Some slides based on slides from Jianlin Feng, School of Software, Sun Yat-Sen University +Some slides based on slides from Stratos Idreos, CWI Amsterdam, The Netherlands 3/28/16 cs 262 a-S 16 Lecture-17 2

Relational Data: Logical View Name Age Department Salary Bob 25 Math 10 K Bill

Relational Data: Logical View Name Age Department Salary Bob 25 Math 10 K Bill 27 EECS 50 K Jill 24 Biology 80 K 3/28/16 cs 262 a-S 16 Lecture-17 3

C-Store: A Column-oriented DBMS • Physical layout: Row store or Column store Record 1

C-Store: A Column-oriented DBMS • Physical layout: Row store or Column store Record 1 Record 2 Column 1 2 3 Record 3 Relation or Tables 3/28/16 cs 262 a-S 16 Lecture-17 4

Row Stores • On-Line Transaction Processing (OLTP) – ATM, POS in supermarkets • Characteristics

Row Stores • On-Line Transaction Processing (OLTP) – ATM, POS in supermarkets • Characteristics of OLTP applications: – – Transactions that involve small numbers of records (or tuples) Frequent updates (including queries) Many users Fast response times • OLTP needs a write-optimized row store – Insert and delete a record in one physical write • Easy to add new record, but might read unnecessary data (wasted memory and I/O bandwidth) 3/28/16 cs 262 a-S 16 Lecture-17 5

Row Store: Columns Stored Together Data Rid = (i, N) Page i Rid =

Row Store: Columns Stored Together Data Rid = (i, N) Page i Rid = (i, 2) Rid = (i, 1) Slot Array 20 N . . . 16 2 SLOT DIRECTORY 24 1 N # slots Pointer to start of free space Record id = <page id, slot #> 3/28/16 cs 262 a-S 16 Lecture-17 6

Current DBMS Gold Standard • Store columns from one record contiguously on disk •

Current DBMS Gold Standard • Store columns from one record contiguously on disk • Use B-tree indexing • Use small (e. g. 4 K) disk blocks • Align fields on byte or word boundaries • Conventional (row-oriented) query optimizer and executor (technology from 1979) • Aries-style transactions 3/28/16 cs 262 a-S 16 Lecture-17 7

OLAP and Data Warehouses • On-Line Analytical Processing (OLAP) – Flexible reporting for Business

OLAP and Data Warehouses • On-Line Analytical Processing (OLAP) – Flexible reporting for Business Intelligence • Characteristics of OLAP applications – – Transactions that involve large numbers of records Frequent Ad-hoc queries and infrequent updates A few decision-making users Fast response times • Data Warehouses facilitate reporting and analysis – Read-Mostly • Other read-mostly applications – Customer Relationship Management (Siebel, Oracle) – Catalog Search in E-Commerce (Amazon. com, Bestbuy. com) 3/28/16 cs 262 a-S 16 Lecture-17 8

Column Stores • Logical data model: Relational Model • Key Intuition: Only read relevant

Column Stores • Logical data model: Relational Model • Key Intuition: Only read relevant columns – Example: Ad-hoc queries read 2 columns out of 20 • Multiple prior column store implementations – – Sybase IQ (early ’ 90 s, bitmap index) Addamark (i. e. , Sen. Sage, for Event Log data warehouse) KDB (Column-stores for financial services companies) Monet. DB (Hyper-Pipelining Query Execution, CIDR’ 05) • Only read necessary data, but might need multiple seeks 3/28/16 cs 262 a-S 16 Lecture-17 9

Read-Optimized Databases SQL Server DB 2 Oracle 1 Joe 45 2 Sue 37 ……

Read-Optimized Databases SQL Server DB 2 Oracle 1 Joe 45 2 Sue 37 …… … row stores Sybase IQ Monet. DB KDB C-Store 1 2 … Joe Sue … 45 37 … column stores • Effect of column-orientation on performance? – Read-less, seek more so depends on prefetching, query selectivity, tuple width, competing query traffic 3/28/16 cs 262 a-S 16 Lecture-17 11

Rows versus Columns row stores column stores Joe 45 reconstruct project 1 Joe 45

Rows versus Columns row stores column stores Joe 45 reconstruct project 1 Joe 45 2 Sue 37 …… … single file 3 files Joe seek 1 Joe 45 2 Sue … 37 … 3/28/16 cs 262 a-S 16 Lecture-17 45 … 12

C-Store Technical Ideas • Column Store with some “novel” ideas (below) • Only materialized

C-Store Technical Ideas • Column Store with some “novel” ideas (below) • Only materialized views on each relation (perhaps many) • Active data compression • Column-oriented query executor and optimizer Writeable Store (WS) Tuple Mover Read-optimized Store (RS) • Shared-nothing architecture • Replication-based concurrency control and recovery 3/28/16 cs 262 a-S 16 Lecture-17 13

Architecture of Vertica C-Store 3/28/16 cs 262 a-S 16 Lecture-17 14

Architecture of Vertica C-Store 3/28/16 cs 262 a-S 16 Lecture-17 14

Basic Concepts • A logical table is physically represented as a set of projections

Basic Concepts • A logical table is physically represented as a set of projections • Each projection consists of a set of columns – Columns are stored separately, along with a common sort order defined by SORT KEY • Each column appears in at least one projection • A column can have different sort orders if it is stored in multiple projections 3/28/16 cs 262 a-S 16 Lecture-17 15

Example C-Store Projection • LINEITEM(shipdate, quantity, retflag, suppkey | shipdate, quantity, retflag) – First

Example C-Store Projection • LINEITEM(shipdate, quantity, retflag, suppkey | shipdate, quantity, retflag) – First sorted by shipdate – Second sorted by quantity – Third sorted by retflag • Sorting increases locality of data – Favors compression techniques such as Run. Length Encoding (see Elephant paper) 3/28/16 cs 262 a-S 16 Lecture-17 16

C-Store Operators • Selection – Produce bitmaps that can be efficiently combined • Mask

C-Store Operators • Selection – Produce bitmaps that can be efficiently combined • Mask – Materialize a set of values from a column and a bitmap • Permute – Reorder a column using a join index • Projection – Free operation! – Two columns in the same order can be concatenated for free • Join – Produces positions rather than values 3/28/16 cs 262 a-S 16 Lecture-17 17

Example: Join over Two Columns 3/28/16 cs 262 a-S 16 Lecture-17 18

Example: Join over Two Columns 3/28/16 cs 262 a-S 16 Lecture-17 18

Column Store has High Compressibility • Each attribute is stored in a separate column

Column Store has High Compressibility • Each attribute is stored in a separate column – Related values are compressible (versus values of separate attributes) • Compression benefits – Reduces the data sizes – Improves disk (and memory) I/O performance by: » reducing seek times (related data stored nearer together) » reducing transfer times (less data to read/write) » increasing buffer hit rate (buffer can hold larger fraction of data) 3/28/16 cs 262 a-S 16 Lecture-17 19

Compression Methods • Dictionary … ‘low’ … … 00 … … ‘high’ … …

Compression Methods • Dictionary … ‘low’ … … 00 … … ‘high’ … … 10 … … ‘low’ … … 00 … … ‘normal’ … … 01 … • Bit-pack – Pack several attributes inside a 4 -byte word – Use as many bits as max-value • Delta – Base value per page – Arithmetic differences • No Run-Length Encoding (unlike Elephant paper) 3/28/16 cs 262 a-S 16 Lecture-17 20

C-Store Use of Snapshot Isolation • Snapshot Isolation for Fast OLAP/Data Warehousing – Allows

C-Store Use of Snapshot Isolation • Snapshot Isolation for Fast OLAP/Data Warehousing – Allows very fast transactions without locks – Can read large consistent snapshot of database • Divide into RS and WS stores – Read Store is Optimized for Fast Read-Only Transactions – Write Store is Optimized for Transactional Updates • Low Water Mark (LWM) – Represents earliest epoch at which read-only transactions can run – RS contains tuples added before LWM • High Water Mark (HWM) – Represents latest epoch at which read-only transactions can run 3/28/16 cs 262 a-S 16 Lecture-17 21

Other Ideas • K-safety: Can handle up to K-1 failures – Every piece of

Other Ideas • K-safety: Can handle up to K-1 failures – Every piece of data replicated K times – Different projections sorted in different ways • Join Tables – Construct original tuples given covering projects – Vertica gave up on Join Tables – too expensive, require super-projections 3/28/16 cs 262 a-S 16 Lecture-17 22

Evaluation? • Series of 7 queries against C-Store vs two commercial DBMS – C-Store

Evaluation? • Series of 7 queries against C-Store vs two commercial DBMS – C-Store faster In all cases, sometimes significantly • Why so much faster? – – 3/28/16 Column Representation – avoid extra reads Overlapping projections – multiple orderings of column as appropriate Better data compression Query operators operating on compressed data cs 262 a-S 16 Lecture-17 23

Summary • Columns outperform rows in listed workloads – Reasons: » Column representation avoids

Summary • Columns outperform rows in listed workloads – Reasons: » Column representation avoids reads of unused attributes » Query operators operate on compressed representation, mitigating storage barrier problem » Avoids memory-bandwidth bottleneck in rows • Storing overlapping projections, rather than the whole table allows storage of multiple orderings of a column as appropriate • Better compression of data allows more orderings in the same space • Results from other papers: – Prefetching is key to columns outperforming rows – Systems with competing traffic favor columns 3/28/16 cs 262 a-S 16 Lecture-17 24

Is this a good paper? • What were the authors’ goals? • What about

Is this a good paper? • What were the authors’ goals? • What about the evaluation/metrics? • Did they convince you that this was a good system/approach? • Were there any red-flags? • What mistakes did they make? • Does the system/approach meet the “Test of Time” challenge? • How would you review this paper today? 3/28/16 cs 262 a-S 16 Lecture-17 25

BREAK

BREAK

DB Physical Organization Problem • Many DBMS perform well and efficiently only after being

DB Physical Organization Problem • Many DBMS perform well and efficiently only after being tuned by a DBA • DBA decides: – Which indexes to build? – On which data parts? – and when to build them? 3/28/16 cs 262 a-S 16 Lecture-17 27

Timeline • Sample workload • Analyze performance • Prepare estimated physical design • Queries

Timeline • Sample workload • Analyze performance • Prepare estimated physical design • Queries Very complex and time consuming process! What about: • Dynamic, changing workloads? • Very Large Databases? 3/28/16 cs 262 a-S 16 Lecture-17 28

Database Cracking • Solve challenges of dynamic environments: – Remove all tuning and physical

Database Cracking • Solve challenges of dynamic environments: – Remove all tuning and physical design steps, but still get similar performance as a fully tuned system • How? • Design new auto-tuning kernels – DBA with cracking (operators, plans, structures, etc. ) 3/28/16 cs 262 a-S 16 Lecture-17 29

Database Cracking • • • No monitoring No preparation No external tools No full

Database Cracking • • • No monitoring No preparation No external tools No full indexes No human involvement • Continuous on-the-fly physical reorganization – Partial, incremental, adaptive indexing • Designed for modern column-stores 3/28/16 cs 262 a-S 16 Lecture-17 30

Cracking Example • Each query is treated as an advice on how data should

Cracking Example • Each query is treated as an advice on how data should be stored – Triggers physical re-organization of the database Column A 13 16 4 Q 1: select * from R where R. A > 10 and R. A < 14 9 2 12 7 1 19 3 14 11 8 6 3/28/16 cs 262 a-S 16 Lecture-17 31

Cracking Design • The first time a range query is posed on an attribute

Cracking Design • The first time a range query is posed on an attribute A, a cracking DBMS makes a copy of column A, called the cracker column of A • A cracker column is continuously physically reorganized based on queries that need to touch attribute such as the result is in a contiguous space • For each cracker column, there is a cracker index 3/28/16 cs 262 a-S 16 Lecture-17 32

Cracking Algorithms • Two types of cracking algorithms based on select’s where clause where

Cracking Algorithms • Two types of cracking algorithms based on select’s where clause where X < # Split a piece into two new pieces 3/28/16 where # < X < # Split a piece into three new pieces cs 262 a-S 16 Lecture-17 33

Cracker Select Operator • Traditional select operator – Scans the column – Returns a

Cracker Select Operator • Traditional select operator – Scans the column – Returns a new column that contains the qualifying values • The cracker select operator – – Searches the cracker index Physically re-organizes the pieces found Updates the cracker index Returns a slice of the cracker column as the result • More steps but faster because analyzes less data 3/28/16 cs 262 a-S 16 Lecture-17 34

Cracking Example • Each query is treated as advice on how data should be

Cracking Example • Each query is treated as advice on how data should be stored – Physically reorganize based on the selection predicate Q 1: select * from R where R. A > 10 and R. A < 14 3/28/16 Column A Cracker Column A 13 4 16 9 4 2 9 7 2 1 12 3 7 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Piece 1: A ≤ 10 Piece 2: 10 < A < 14 Piece 3: 14 ≤ A 35

Cracking Example • Each query is treated as advice on how data should be

Cracking Example • Each query is treated as advice on how data should be stored – Physically reorganize based on the selection predicate Q 1: select * from R where R. A > 10 and R. A < 14 3/28/16 Column A Cracker Column A 13 4 16 9 4 2 9 7 2 1 12 3 7 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Piece 1: A ≤ 10 Piece 2: 10 < A < 14 Piece 3: 14 ≤ A 36

Cracking Example • Each query is treated as advice Improve data for on howaccess

Cracking Example • Each query is treated as advice Improve data for on howaccess data should future queries be stored – Physically reorganize based on the selection predicate Q 1: select * from R where R. A > 10 and R. A < 14 3/28/16 Column A Cracker Column A 13 4 16 9 4 2 Gain knowledge on 9 how to organize data 2 on-the-fly within 12 the select-operator 7 1 A ≤ 10 3 7 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Piece 1: Piece 2: Result tu 10 < A < 14 Piece 3: 14 ≤ A 37

Cracking Example • Each query is treated as advice on how data should be

Cracking Example • Each query is treated as advice on how data should be stored – Physically reorganize based on the selection predicate Column A Cracker Column A Q 1: select * from R 13 4 where R. A > 10 16 9 4 2 9 7 2 1 and R. A < 14 12 7 Q 2: select * from R where R. A > 7 and R. A ≤ 16 3/28/16 Q 1 Piece 1: A ≤ 10 3 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Piece 2: 10 < A < 14 Piece 3: 14 ≤ A 38

Cracking Example • Each query is treated as advice on how data should be

Cracking Example • Each query is treated as advice on how data should be stored – Physically reorganize based on the selection predicate Column A Cracker Column A Q 1: select * from R 13 4 4 where R. A > 10 16 9 2 4 2 9 7 2 1 and R. A < 14 12 7 Q 2: select * from R where R. A > 7 and R. A ≤ 16 3/28/16 Q 1 3 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Piece 1: A ≤ 10 Q 2 Piece 2: 10 < A < 14 Piece 3: 14 ≤ A 1 3 6 Piece 1: A≤ 7 7 9 Piece 2: 8 7 < A ≤ 10 13 12 11 Piece 3: 10<A<14 14 Piece 4: 16 14≤A≤ 16 19 Piece 5: 16 < A 39

Cracking Example • Each query is treated as advice on how data should be

Cracking Example • Each query is treated as advice on how data should be stored – Physically reorganize based on the selection predicate Column A Cracker Column A Q 1: select * from R 13 4 4 where R. A > 10 16 9 2 4 2 9 7 2 1 and R. A < 14 12 7 Q 2: select * from R where R. A > 7 and R. A ≤ 16 3/28/16 Q 1 3 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Piece 1: A ≤ 10 Q 2 Piece 2: 10 < A < 14 Piece 3: 14 ≤ A 1 3 6 Piece 1: A≤ 7 7 9 Piece 2: 8 7 < A ≤ 10 13 12 11 Piece 3: 10<A<14 14 Piece 4: 16 14≤A≤ 16 19 Piece 5: 16 < A 40

Cracking Example • Each query is treated as advice The more cracking, on how

Cracking Example • Each query is treated as advice The more cracking, on how should the data more learned – Physically reorganize based on the selection predicate be stored Column A Cracker Column A Q 1: select * from R 13 4 4 where R. A > 10 16 9 2 4 2 9 7 2 1 and R. A < 14 12 7 Q 2: select * from R where R. A > 7 and R. A ≤ 16 3/28/16 Q 1 3 8 1 6 19 13 3 12 14 11 11 16 8 19 6 14 cs 262 a-S 16 Lecture-17 Result Piece 1: A ≤ 10 tuples Q 2 Piece 2: 10 < A < 14 Piece 3: 14 ≤ A 1 3 6 Piece 1: A≤ 7 7 9 Piece 2: 8 7 < A ≤ 10 13 12 11 Piece 3: 10<A<14 14 Piece 4: 16 14≤A≤ 16 19 Piece 5: 16 < A 41

Self-Organizing Behavior (Count(*) range query) 3/28/16 cs 262 a-S 16 Lecture-17 42

Self-Organizing Behavior (Count(*) range query) 3/28/16 cs 262 a-S 16 Lecture-17 42

Self-Organizing Behavior (TPC-H Query 6) • TPC-H is an ad-hoc, decision support benchmark –

Self-Organizing Behavior (TPC-H Query 6) • TPC-H is an ad-hoc, decision support benchmark – Business oriented ad-hoc queries, concurrent data modifications • Example: – Tell me the amount of revenue increase that would have resulted from eliminating certain company-wide discounts in a given percentage range in a given year • Workload: – Database load – Execution of 22 read-only queries in both single and multi-user mode – Execution of 2 refresh functions 3/28/16 cs 262 a-S 16 Lecture-17 43

Is this a good paper? • What were the authors’ goals? • What about

Is this a good paper? • What were the authors’ goals? • What about the evaluation/metrics? • Did they convince you that this was a good system/approach? • Were there any red-flags? • What mistakes did they make? • Does the system/approach meet the “Test of Time” challenge? • How would you review this paper today? 3/28/16 cs 262 a-S 16 Lecture-17 44