ColumnStores vs RowStores How Different are they Really

  • Slides: 64
Download presentation
Column-Stores vs. Row-Stores How Different are they Really? Daniel J. Abadi, Samuel Madden, and

Column-Stores vs. Row-Stores How Different are they Really? Daniel J. Abadi, Samuel Madden, and Nabil Hachem, SIGMOD 2008 Presented By, Paresh Modak(09305060) Souman 1 Mandal(09305066)

Contents Column-store introduction Column-store data model Emulation of Column Store in Row store Column

Contents Column-store introduction Column-store data model Emulation of Column Store in Row store Column store optimization Experiment and Results Conclusion 2

Row Store and Column Store Figure taken form [2] In row store data are

Row Store and Column Store Figure taken form [2] In row store data are stored in the disk tuple by tuple. Where in column store data are stored in the disk column by column 3

Row Store and Column Store Most of the queries does not process all the

Row Store and Column Store Most of the queries does not process all the attributes of a particular relation. For example the query Select c. name and c. address From CUSTOMES as c Where c. region=Mumbai; Only process three attributes of the relation CUSTOMER. But the customer relation can have more than three attributes. Column-stores are more I/O efficient for read-only queries as they read, only those attributes which are accessed by 4 a query.

Row Store and Column Store Row Store Column Store (+) Easy to add/modify a

Row Store and Column Store Row Store Column Store (+) Easy to add/modify a record (+) Only need to read in relevant data (-) Might read in unnecessary data (-) Tuple writes require multiple accesses So column stores are suitable for read-mostly, readintensive, large data repositories 5

Why Column Stores? Can be significantly faster than row stores for some applications Fetch

Why Column Stores? Can be significantly faster than row stores for some applications Fetch only required columns for a query Better cache effects Better compression (similar attribute values within a column) But can be slower for other applications OLTP with many row inserts, . . Long war between the column store and row store camps : -) 6 This paper tries to give a balanced picture of advantages and disadvantages, after adding/ subtracting a number of optimizations for each approach

Column Stores - Data Model Standard relational logical data model EMP(name, age, salary, dept)

Column Stores - Data Model Standard relational logical data model EMP(name, age, salary, dept) DEPT(dname, floor) Table – collection of projections Projection – set of columns Horizontally partitioned into segments with segment identifier 7

Column Stores - Data Model To answer queries, projections are joined using Storage keys

Column Stores - Data Model To answer queries, projections are joined using Storage keys and join indexes Storage Keys: 8 Within a segment, every data value of every column is associated with a unique Skey Values from different columns with matching Skey belong to the same logical row

Column Stores – Data Model Join Indexes T 1 and T 2 are projections

Column Stores – Data Model Join Indexes T 1 and T 2 are projections on T M segments in T 1 and N segments in T 2 Join Index from T 1 to T 2 is a table of the form: 9 (s: Segment ID in T 2, k: Storage key in Segment s) Each row in join index matches corresponding row in T 1 Join indexes are built such that T could be efficiently reconstructed from T 1 and T 2

Column Stores – Data Model Construct EMP(name, age, salary) from EMP 1 and EMP

Column Stores – Data Model Construct EMP(name, age, salary) from EMP 1 and EMP 3 using join index on EMP 3 10

Compression Trades I/O for CPU Increased column-store opportunities: Higher data value locality in column

Compression Trades I/O for CPU Increased column-store opportunities: Higher data value locality in column stores Techniques such as run length encoding far more useful Schemes 11 Null Suppression Dictionary encoding Run Length encoding Bit-Vector encoding Heavyweight schemes

Query Execution - Operators Select: Same as relational algebra, but produces a bit string

Query Execution - Operators Select: Same as relational algebra, but produces a bit string Project: Same as relational algebra Join: Joins projections according to predicates Aggregation: SQL like aggregates Sort: Sort all columns of a projection 12

Query Execution - Operators Decompress: Converts compressed column to uncompressed representation Mask(Bitstring B, Projection

Query Execution - Operators Decompress: Converts compressed column to uncompressed representation Mask(Bitstring B, Projection Cs) => emit only those values whose corresponding bits are 1 Concat: Combines one or more projections sorted in the same order into a single projection Permute: Permutes a projection according to the ordering defined by a join index Bitstring operators: Band – Bitwise AND, Bor – Bitwise OR, Bnot – complement 13

Row Store Vs Column Store How much of the buzz around column-stores is marketing

Row Store Vs Column Store How much of the buzz around column-stores is marketing hype? Do you really need to buy Sybase IQ or Vertica? How far will your current row-store take you? Can you get column-store performance from a rowstore? Can you simulate a column-store in a row-store? 14

Row Store Vs Column Store Now the simplistic view about the difference in storage

Row Store Vs Column Store Now the simplistic view about the difference in storage layout leads to that one can obtain the performance benefits of a column-store using a rowstore by making some changes to the physical structure of the row store. This changes can be 15 Vertically partitioning Using index-only plans Using materialized views

Vertical Partitioning Process: Full Vertical partitioning of each relation Each column =1 Physical table

Vertical Partitioning Process: Full Vertical partitioning of each relation Each column =1 Physical table This can be achieved by adding integer position column to every table Adding integer position is better than adding primary key Join on Position for multi column fetch Problems: “Position” - Space and disk bandwidth Header for every tuple – further space wastage 16 e. g. 24 byte overhead in Postgre. SQL

Vertical Partitioning: Example 17

Vertical Partitioning: Example 17

Index-only plans Process: Add B+Tree index for every Table. column Plans never access the

Index-only plans Process: Add B+Tree index for every Table. column Plans never access the actual tuples on disk Headers are not stored, so per tuple overhead is less Problem: Separate indices may require full index scan, which is slower Eg: SELECT AVG(salary) FROM emp WHERE age > 40 18 Composite index with (age, salary) key helps.

Index-only plans: Example 19

Index-only plans: Example 19

Materialized Views Process: Create ‘optimal' set of MVs for given query workload Objective: Provide

Materialized Views Process: Create ‘optimal' set of MVs for given query workload Objective: Provide just the required data Avoid overheads Performs better Expected to perform better than other two approach Problems: 20 Practical only in limited situation Require knowledge of query workloads in advance

Materialized Views: Example Select F. cust. ID from Facts as F where F. price>20

Materialized Views: Example Select F. cust. ID from Facts as F where F. price>20 21

Optimizing Column oriented Execution Different optimization for column oriented database 22 Compression Late Materialization

Optimizing Column oriented Execution Different optimization for column oriented database 22 Compression Late Materialization Block Iteration Invisible Join

Compression Low information entropy (high data value locality) leads to High compression ratio Advantage

Compression Low information entropy (high data value locality) leads to High compression ratio Advantage Disk Space is saved Less I/O CPU cost decrease if we can perform operation without decompressing Light weight compression schemes do better 23

Compression If data is sorted on one column that column will be super-compressible in

Compression If data is sorted on one column that column will be super-compressible in row store eg. Run length encoding Figure taken form [2] 24

Late Materialization Most query results entity-at-a-time not column-at-atime So at some point of time

Late Materialization Most query results entity-at-a-time not column-at-atime So at some point of time multiple column must be combined One simple approach is to join the columns relevant for a particular query But further performance can be improve using latematerialization 25

Late Materialization Delay Tuple Construction Might avoid constructing it altogether Intermediate position lists might

Late Materialization Delay Tuple Construction Might avoid constructing it altogether Intermediate position lists might need to be constructed Eg: SELECT R. a FROM R WHERE R. c = 5 AND R. b = 10 26 Output of each predicate is a bit string Perform Bitwise AND Use final position list to extract R. a

Late Materialization Advantages 27 Unnecessary construction of tuple is avoided Direct operation on compressed

Late Materialization Advantages 27 Unnecessary construction of tuple is avoided Direct operation on compressed data Cache performance is improved (PAX)

N-ary storage model (NSM) 28

N-ary storage model (NSM) 28

Decomposition Storage Model(DSM) High degree of spatial locality for sequential access of single attribute

Decomposition Storage Model(DSM) High degree of spatial locality for sequential access of single attribute Performance deteriorates significantly for queries that involve multiple attributes 29

PAX - Partition Attributes Across* Cache utilization and performance is very important In-page data

PAX - Partition Attributes Across* Cache utilization and performance is very important In-page data placement is the key to high cache performance PAX groups together all values of each attribute within each page Only affects layout inside the pages, incurs no storage penalty and does not affect I/O behavior Exhibits superior cache performance and utilization over traditional methods * A. Ailamaki, D. J. De. Witt, et. al. “Weaving relations for cache performance” VLDB, 2001. 30

PAX Model Maximizes interrecord spatial locality within each column in the page Incurs minimal

PAX Model Maximizes interrecord spatial locality within each column in the page Incurs minimal record reconstruction cost Orthogonal to other design decisions because it only affects the layout of data stored on a 31 single page

Block Iteration Operators operate on blocks of tuples at once Iterate over blocks rather

Block Iteration Operators operate on blocks of tuples at once Iterate over blocks rather than tuples Like batch processing If column is fixed width, it can be operated as an array Minimizes per-tuple overhead Exploits potential for parallelism Can be applied even in Row stores – IBM DB 2 implements it 32

Star Schema Benchmark SSBM is a data warehousing benchmark derived from TPC-H It consist

Star Schema Benchmark SSBM is a data warehousing benchmark derived from TPC-H It consist of a single fact table LINE-ORDER There are four dimension table. CUSTOMER PART SUPPLIER DATE LINEORDER table consist of 60, 000 tuples SSBM consist of thirteen queries divided into four category 33

Star Schema Benchmark 34 Figure taken form [1]

Star Schema Benchmark 34 Figure taken form [1]

Invisible Join Queries over data warehouse (particularly modeled with star schema) often have following

Invisible Join Queries over data warehouse (particularly modeled with star schema) often have following structure Restrict set of tuple in the fact table using selection predicates on dimension table Perform some aggregation on the restricted fact table Often grouping by other dimension table attribute For each selection predicate and for each aggregate grouping join between fact table and dimension table is required 35

Invisible Join 36 Find Total revenue from Asian customers who purchase a product supplied

Invisible Join 36 Find Total revenue from Asian customers who purchase a product supplied by an Asian supplier between 1992 and 1997 grouped by nation of the customer, supplier and year of transaction

Invisible Join Traditional plan for this type of query is to pipeline join in

Invisible Join Traditional plan for this type of query is to pipeline join in order of predicate selectivity Alternate plan is late materialized join technique But both have disadvantages 37 Traditional plan lacks all the advantages described previously of late materialization In the late materialized join technique group by columns need to be extracted in out-of-position order

Invisible Join Invisible join is a late materialized join but minimize the values that

Invisible Join Invisible join is a late materialized join but minimize the values that need to be extracted out of order Invisible join 38 Rewrite joins into predicates on the foreign key columns in the fact table These predicates evaluated either by hash-lookup Or by between-predicate rewriting

Invisible Join Phase 1 39 Figure taken form [1]

Invisible Join Phase 1 39 Figure taken form [1]

Invisible Join 40 Phase 2 Figure taken form [1]

Invisible Join 40 Phase 2 Figure taken form [1]

Invisible Join 41 Phase 3 Figure taken form [1]

Invisible Join 41 Phase 3 Figure taken form [1]

Invisible Join Between-Predicate rewriting 42 Use of range predicates instead of hash lookup in

Invisible Join Between-Predicate rewriting 42 Use of range predicates instead of hash lookup in phase 1 Useful if contiguous set of keys are valid after applying a predicate Dictionary encoding for key reassignment if not contiguous Query optimizer is not altered. Predicate is rewritten at runtime

Between-predicate rewriting Figure taken form [2] 43

Between-predicate rewriting Figure taken form [2] 43

Experiments Goal 44 Comparison of attempts to emulate a column store in a row-store

Experiments Goal 44 Comparison of attempts to emulate a column store in a row-store with baseline performance of C-Store Is it possible for an unmodified row-store to obtain the benefits of column oriented design Effect of different optimization technique in column-store

Experiment setup Environment 2. 8 GHz Dual Core Pentium(R) workstation 3 GB RAM RHEL

Experiment setup Environment 2. 8 GHz Dual Core Pentium(R) workstation 3 GB RAM RHEL 5 4 disk array mapped as a single logical volume Reported numbers are average of several runs Warm buffer (30% improvement for both systems) 45 Data read exceeds the size of buffer pool

C-Store Vs Commercial row oriented DB Figure taken form [1] RS: Base System X

C-Store Vs Commercial row oriented DB Figure taken form [1] RS: Base System X CS: Base C-Store case RS (MV): System X with optimal collection of MVs System X= Commercial CS (Row-MV): Column store constructed from RS(MV) row oriented database 46

Results and Analysis From the graph we can see C-Store out performs System X

Results and Analysis From the graph we can see C-Store out performs System X by a Factor of six in the base case Factor of three when System x use materialized view However CS (Row-MV) perform worse than RS (MV) System X provide advance performance feature C-Store has multiple known performance bottleneck 47 C-Store doesn't support Partitioning, multithreading

Column Store simulation in Row Store Partitioning improves the performance of row store if

Column Store simulation in Row Store Partitioning improves the performance of row store if done on a predicate of the query Authors found it improve the speed by a factor of two System X implement star join Optimizer will bloom filters if it feels necessary Other configuration parameter 48 32 KB disk pages 1. 5 GB maximum memory for sort joins, intermediate result 500 MB buffer pool

Different configuration of System X Experimented with five different configuration 1. 2. 3. 4.

Different configuration of System X Experimented with five different configuration 1. 2. 3. 4. 5. 49 Traditional row oriented representation with bitmap and bloom filter Traditional (bitmap): Biased to use bitmaps; might be inferior sometimes Vertical Partitioning: Each column is a relation Index-Only: B+Tree on each column Materialized Views: Optimal set of views for every query

Different configuration of System X Figure taken form [1] 50

Different configuration of System X Figure taken form [1] 50

Different configuration of System X Figure taken form [1] 51

Different configuration of System X Figure taken form [1] 51

Different configuration of System X T – Traditional T(B) – Traditional(bitmap) MV – materialized

Different configuration of System X T – Traditional T(B) – Traditional(bitmap) MV – materialized views VP – vertical partitioning AI – All indexes • Better performance of traditional system is because of partitioning. • Partitioning on orderdate Figure taken form [1] 52

Different configuration of System X Materialized view performs best Index only plans are the

Different configuration of System X Materialized view performs best Index only plans are the worst Expensive column joins on fact table 53 System X use hash join by default Nested loop join, merge join also doesn’t help

Column Store simulation in Row Store: Analysis Tuple overheads: Line. Order Table – 60

Column Store simulation in Row Store: Analysis Tuple overheads: Line. Order Table – 60 million tuples, 17 columns Compressed data 8 bytes of over head per row 4 bytes of record-id 1 column Whole table RS 0. 7 -1. 1 GB 4 GB CS 240 MB 2. 3 GB For SSBM scale 10 lineorder table 54

Column Store simulation in Row Store: Analysis Query selectivity is 8. 0 x 10

Column Store simulation in Row Store: Analysis Query selectivity is 8. 0 x 10 -3 55 Method Time Traditional 43 Vertical partitioning 65 Index-only plans 360

Column Store simulation in Row Store: Analysis Traditional Vertical Partitioning Scans the entire lineorder

Column Store simulation in Row Store: Analysis Traditional Vertical Partitioning Scans the entire lineorder table Hash joins with dwdate, part and supplier Hash-joins partkey column with the filtered part table Hash-joins suppkey column with filtered supplier table Hash join the result of the two above join Index-Only plans 56 Access all columns through unclustred B+Tree indexes

Column Store Performence Column Store perform better than the best case of row store

Column Store Performence Column Store perform better than the best case of row store (4. 0 sec Vs 10. 2 sec) Though they access the same amount of I/O is similar 57

Tuple overhead and Join costs Row Store Column Store the record-id explicitly Don’t explicitly

Tuple overhead and Join costs Row Store Column Store the record-id explicitly Don’t explicitly Store the record-id Headers are stored with each column Header are stored in separate column Use index-based merge join Use merge join This differences are not fundamental 58

Breakdown of Column-Store Advantages Block processing improves the performance by a factor of 5%

Breakdown of Column-Store Advantages Block processing improves the performance by a factor of 5% to 50% Compression improves the performance by almost a factor of two on avg Late materialization improves performance by almost a factor of three Invisible join improves the performance by 50 -75% 59

Breakdown of Column-Store Advantages Figure taken form [1] T=tuple-at-a-time processing, ; t=block processing; I=invisible

Breakdown of Column-Store Advantages Figure taken form [1] T=tuple-at-a-time processing, ; t=block processing; I=invisible join enabled; i=disabled; C=compression enabled, c=disabled; L=late materialization enabled; l=disabled; 60

Conclusion To emulate column store in row store, techniques like Vertical portioning Index only

Conclusion To emulate column store in row store, techniques like Vertical portioning Index only plan does not yield good performance High per-tuple overheads, high tuple reconstruction cost are the reason Where in column store Late materialization Compression Block iteration Invisible join are the reason for good performance 61

Conclusion Successful emulation column store in row store require 62 Virtual record-ids Reduced tuple

Conclusion Successful emulation column store in row store require 62 Virtual record-ids Reduced tuple over head Fast merge join Run length encoding across multiple tuples Operating directly on compressed data Block processing Invisible join Late materialization

References 1. “Column-stores vs. row-stores: how different are they really? ” Daniel J. Abadi,

References 1. “Column-stores vs. row-stores: how different are they really? ” Daniel J. Abadi, Samuel Madden, Nabil Hachem: SIGMOD Conference 2008: 967 -980 2. Column-Oriented Database Systems, VLDB 2009 Tutorial; Stavros Harizopoulos, Daniel Abadi, Peter Boncz 63

Thank You! 64

Thank You! 64