Columnstore Indexes Questions and Answers Matan Yungman A

  • Slides: 34
Download presentation
Columnstore Indexes Questions and Answers Matan Yungman

Columnstore Indexes Questions and Answers Matan Yungman

A Few Words about Me… @Matan. Yungman 2 Madeira. Data. com SQLServer. Radio. com

A Few Words about Me… @Matan. Yungman 2 Madeira. Data. com SQLServer. Radio. com Image courtesy of Mister GC / Free. Digital. Photos. net

Agenda • What is Columnstore? • How is it built? • What is it

Agenda • What is Columnstore? • How is it built? • What is it good for? • How to make your benchmark successful? • The future

Traditional Approach • Row Store … • Data is stored row-wise • Each page

Traditional Approach • Row Store … • Data is stored row-wise • Each page contains data from 1 or more rows

Columnstore C 1 • Each page contains data of a single column • Data

Columnstore C 1 • Each page contains data of a single column • Data is highly compressed • Because of repeating values • Additional tricks • More data fits in memory • Each column fetched independently • Only needed columns are fetched • Can dramatically decrease IO C 2 C 3 C 4

Columnstore • Row Group – 1 million “rows” • Column Segments • • •

Columnstore • Row Group – 1 million “rows” • Column Segments • • • Contains values from 1 column Each segment contains 100 k-~1 M rows Row Compressed Group Each segment is stored on separate LOB Segment is the unit of transfer from disk Segment C 1 C 2 C 3 C 4 C 5 C 6

Batch-Mode Processing • Data processed in vectors of 1000 rows • Manipulated in the

Batch-Mode Processing • Data processed in vectors of 1000 rows • Manipulated in the CPU L 1 cache • Much more efficient than row-by-row • Better parallelism • Better algorithms for processing data

SQL Server 2012 Gotchas • Only non-clustered columnstore • Table is not updateable •

SQL Server 2012 Gotchas • Only non-clustered columnstore • Table is not updateable • Can be bypassed with partitioning (kinda) • Can be bypassed with Union All (kinda) • Hard to get batch mode processing Image source: https: //upload. wikimedia. org/wikipedia/commons/thumb/7/79/Oops_Stop_Sign_icon. svg/480 px-Oops_Stop_Sign_icon. svg. png

SQL Server 2014 • New Clustered Columnstore Index • “Updateable” • Easier to get

SQL Server 2014 • New Clustered Columnstore Index • “Updateable” • Easier to get Batch Mode • Significant space savings • And even more with archival compression • No other indexes allowed • No constraints allowed CREATE CLUSTERED COLUMNSTORE INDEX CCI ON My. Table

How is a Columnstore Index Built?

How is a Columnstore Index Built?

Separate Horizontally (Row Groups)

Separate Horizontally (Row Groups)

Separate Vertically (Segments)

Separate Vertically (Segments)

Compress Image source: https: //www. flickr. com/photos/peasap/4684467836

Compress Image source: https: //www. flickr. com/photos/peasap/4684467836

Compression – Value Scale Height 180 30 150 0 175 25 183 33 200

Compression – Value Scale Height 180 30 150 0 175 25 183 33 200 50 190 40 181 41

Compression – Bit Array Name Michael Guy Maria Matan Michael 1 0 0 0

Compression – Bit Array Name Michael Guy Maria Matan Michael 1 0 0 0 Guy 0 1 0 0 Maria 0 0 1 0 Matan 0 0 0 1 Michael 1 0 0 0

Compression – Run-Length Name Michael: 3 Guy: 1 Maria: 1 Matan: 2 Matan Michael

Compression – Run-Length Name Michael: 3 Guy: 1 Maria: 1 Matan: 2 Matan Michael

Compression – Dictionary Encoding Name ID Name Michael Guy Maria Matan Name 1 Michael

Compression – Dictionary Encoding Name ID Name Michael Guy Maria Matan Name 1 Michael 1 Guy 2 Maria 3 Matan 4 2 3 4 4 Michael 1

And More. .

And More. .

How is Insert Implemented? Tuple Mover Delta Store Column Store

How is Insert Implemented? Tuple Mover Delta Store Column Store

How is Delete Implemented? Column Store Delete Bitmap

How is Delete Implemented? Column Store Delete Bitmap

How is Update Implemented? Delete Bitmap Delta Store

How is Update Implemented? Delete Bitmap Delta Store

How is it Rebuilt? • ALTER INDEX. . REBUILD • Recreates the Clustered Columnstore

How is it Rebuilt? • ALTER INDEX. . REBUILD • Recreates the Clustered Columnstore index • Use MAXDOP 1 to improve segment elimination • Build on top of an existing Clustered Index for further optimization • ALTER INDEX. . REORGANIZE • Compresses closed row groups • Use COMPRESS_ALL_ROW_GROUPS=ON to compress all row groups • Use when Tuple-Mover can’t handle the load Image sourcehttps: //farm 4. staticflickr. com/3776/11690157976_7020609 c 77_o_d. jpg

Why is it So Fast? • Compression • Batch Mode • Only needed columns

Why is it So Fast? • Compression • Batch Mode • Only needed columns are fetched • Segment Elimination • Improved IO operations • Aggressive read-ahead

What is it good for? • Big aggregation/fact tables • Wide tables when there’s

What is it good for? • Big aggregation/fact tables • Wide tables when there’s no option to cover all queries • Tables with many repetitive values and low change rate

Demo

Demo

Columnstore is not a magic solution for bad design or bad code!!

Columnstore is not a magic solution for bad design or bad code!!

How to make your benchmark successful? • Don’t neglect database design • Beware of

How to make your benchmark successful? • Don’t neglect database design • Beware of textual columns • Don’t rely on Tuple Mover • Take care of rebuild and reorganize if needed • Load data using • Bulk Insert • Partition Switch • • Delete data using partition switch Partitioning key helps in segment elimination Building on top of Clustered Index helps ordering New Cardinality Estimator can be a factor

Hybrid scenarios? • Nonclustered Columnstore • Indexed View on top of Clustered Columnstore •

Hybrid scenarios? • Nonclustered Columnstore • Indexed View on top of Clustered Columnstore • Regular table + Clustered Columnstore • Duplication? We’ve been doing this for years

SQL Server 2016 • Many performance improvements • Regular indexes on top of Clustered

SQL Server 2016 • Many performance improvements • Regular indexes on top of Clustered Columnstore • Constrains on Clustered Columnstore • Updateable Non-Clustered Columnstore • Columnstore on top of In-Memory OLTP

The Future? • Rebuild with order on column • Probably next version • Segments

The Future? • Rebuild with order on column • Probably next version • Segments maintenance – merge and deletes • Next version • Segment Elimination info in Execution Plan • Already up on Azure SQL Database

The Future? • Batch Mode for row store • Indexes on Delta Stores •

The Future? • Batch Mode for row store • Indexes on Delta Stores • More than one Columnstore Index on table • For better segment elimination • Not so lazy tuple mover(s) • Rebuild with order on column

Resources • Niko Neugebar’s Columnstore Series • http: //www. nikoport. com/columnstore/ • And his

Resources • Niko Neugebar’s Columnstore Series • http: //www. nikoport. com/columnstore/ • And his SQLBits Presentations • 24 Hours of PASS SQL Server 2014 Launch Edition • http: //www. sqlpass. org/24 hours/2014/ss 2014 launch/Sessions. aspx • Sunil Agarwal’s Columnstore Session • Jimmy May’s Session

Questions? Image courtesy of Master isolated images / Free. Digital. Photos. net

Questions? Image courtesy of Master isolated images / Free. Digital. Photos. net

Image courtesy of David Castillo Dominici / Free. Digital. Photos. net

Image courtesy of David Castillo Dominici / Free. Digital. Photos. net