Positional Update Handling in Column Stores Sndor Hman

  • Slides: 29
Download presentation
Positional Update Handling in Column Stores Sándor Héman Marcin Zukowski Niels Nes Lefteris Sidirourgos

Positional Update Handling in Column Stores Sándor Héman Marcin Zukowski Niels Nes Lefteris Sidirourgos Peter Boncz

UPDATE IN PLACE: A Poison Apple? Jim Gray, 1981 “. . for performance reasons,

UPDATE IN PLACE: A Poison Apple? Jim Gray, 1981 “. . for performance reasons, most disc-based systems have been seduced into updating the data in place. ” 30 years of hardware improvements in sequential/throughput beating random/latency…. in-place less feasible every year. alternative: differential approach. In column stores, in-place updating is by now clearly infeasible

Problem: Column Store Updates • I/O proportional to number of attributes – I/O blocks

Problem: Column Store Updates • I/O proportional to number of attributes – I/O blocks large and compressed – Sometimes even replicated – Read-Optimized Update-Unfriendly • Table often kept ordered on sort-key (SK) attributes – Uniform update load scattered write access

Solution: Differential Structure • Maintain updates (INS/DEL/MOD) in a differential structure – Merge with

Solution: Differential Structure • Maintain updates (INS/DEL/MOD) in a differential structure – Merge with base table during scan

Solution: Differential Structure • Maintain updates (INS/DEL/MOD) in a differential structure – Merge with

Solution: Differential Structure • Maintain updates (INS/DEL/MOD) in a differential structure – Merge with base table during scan • Challenges: – Efficiently maintainable data-structure – Minimize Merge impact for read-only queries

Naïve Approach: Delta Tables • For each table, maintain two update friendly row-store tables:

Naïve Approach: Delta Tables • For each table, maintain two update friendly row-store tables: – INS(C 1. . Cn) – DEL(SK 1. . SKm) – MOD = DEL + INS Base table: inventory Sort-Key (SK): [store, prod] store prod new qty Berlin chair Y 5 Berlin cloth Y 20 qty London stool N 10 London table N 20 Paris rug N 1 Paris stool N 5 Inserts table: INS store new Deletes table: DEL store Paris prod rug

Naïve Approach: Delta Tables • Rewrite table scans: Merge. Union[store, prod](Scan(INS), Merge. Diff[store, prod](

Naïve Approach: Delta Tables • Rewrite table scans: Merge. Union[store, prod](Scan(INS), Merge. Diff[store, prod]( Scan(Inventory), Scan(DEL))) Base table: inventory Sort-Key (SK): [store, prod] store prod new qty Berlin chair Y 5 Berlin cloth Y 20 qty London stool N 10 London table N 20 Paris rug N 1 Paris stool N 5 Inserts table: INS store new Deletes table: DEL store Paris prod rug

Naïve Approach: Delta Tables • Rewrite table scans: Merge. Union[store, prod](Scan(INS), Merge. Diff[store, prod](

Naïve Approach: Delta Tables • Rewrite table scans: Merge. Union[store, prod](Scan(INS), Merge. Diff[store, prod]( Scan(Inventory), Scan(DEL))) for up-to-date image • Expensive! Actual table: inventory Sort-Key (SK): [store, prod] store prod new qty Berlin chair Y 5 Berlin cloth Y 20 London stool N 10 – I/O to scan SK ‘merge’ London table N columns; also if query Paris stool N does not need SK cols – Each query pays CPU effort to locate the same change positions over and over again 20 5

The Idea: Positional Updates • Remember the position of an update rather than its

The Idea: Positional Updates • Remember the position of an update rather than its SK values – Merge once at write Read-Optimized approach – No need to scan SK columns – Scan skip less CPU overhead Notation: • TABLEx state of TABLE at time x • SID(t): Stable. ID – Position of tuple t in immutable base TABLE 0 Stable • RIDx(t): Row. ID – Position of visible tuple t at time x VOLATILE! – SID(t) = RID 0(t)

SID/RID Example SID STORE PROD NEW QTY RID 0 London chair N 30 0

SID/RID Example SID STORE PROD NEW QTY RID 0 London chair N 30 0 1 London stool N 10 1 2 London table N 20 2 3 Paris rug N 1 3 4 Paris stool N 5 4 INSERT INTO inventory VALUES(‘Berlin’, ‘table’, Y, 10) INSERT INTO inventory VALUES(‘Berlin’, ‘cloth’, Y, 20) INSERT INTO inventory VALUES(‘Berlin’, ‘chair’, Y, 5) TABLE 0 SID STORE PROD NEW QTY 0 Berlin chair Y 5 0 0 Berlin cloth Y 20 1 0 Berlin table Y 10 2 0 London chair N 30 3 1 London stool N 10 4 2 London table N 20 5 3 Paris rug N 1 6 4 Paris stool N 5 7 TABLE 1 RID

SIDs and RIDs • RID(t) • ∆(t) = SID(t) + ∆(t) = #inserts before

SIDs and RIDs • RID(t) • ∆(t) = SID(t) + ∆(t) = #inserts before t – #deletes before t = RID(t) – SID(t) • SID and RID are monotonically increasing – organize positional updates on SID in a counting B-Tree that keeps track cumulative deltas (∆) • Positional Delta Tree (PDT) – SIDs are stable – Only need to maintain cumulative ∆ on path root leaf

PDT Example SID STORE PROD NEW QTY RID 0 London chair N 30 0

PDT Example SID STORE PROD NEW QTY RID 0 London chair N 30 0 1 London stool N 10 1 2 London table N 20 2 3 Paris rug N 1 3 4 Paris stool N 5 4 SID 0 INSERT INTO inventory VALUES(‘Berlin’, ‘table’, Y, 10) INSERT INTO inventory VALUES(‘Berlin’, ‘cloth’, Y, 20) INSERT INTO inventory VALUES(‘Berlin’, ‘chair’, Y, 5) TABLE 0 ∆ 2 Insert Value Table 1 STORE SID 0 0 SID 0 ins i 0 type ins type value i 2 i 1 value PROD NEW QTY i 0 Berlin table Y 10 i 1 Berlin cloth Y 20 i 2 Berlin chair Y 5

PDT Example SID STORE PROD NEW QTY RID 0 Berlin chair Y 5 0

PDT Example SID STORE PROD NEW QTY RID 0 Berlin chair Y 5 0 0 Berlin cloth Y 20 1 0 Berlin table Y 10 2 0 London chair N 30 3 1 London stool N 10 4 2 London table N 20 5 3 Paris rug N 1 6 4 Paris stool N 5 7 SID 0 DELETE FROM inventory WHERE store = ‘Berlin’ AND prod = ‘table’ DELETE FROM inventory WHERE store = ‘Paris’ AND prod = ‘rug’ TABLE 1 ∆ 2 Insert Value Table -1 STORE SID 0 0 SID 3 del d 0 type ins type value i 2 i 1 value PROD NEW QTY i 0 Berlin table Y 10 i 1 Berlin cloth Y 20 i 2 Berlin chair Y 5

PDT Example SID STORE PROD NEW QTY RID 0 Berlin chair Y 5 0

PDT Example SID STORE PROD NEW QTY RID 0 Berlin chair Y 5 0 0 Berlin cloth Y 20 1 0 London chair N 30 2 1 London stool N 10 3 2 London table N 20 4 4 Paris stool N 5 5 SID 0 INSERT INTO inventory VALUES (‘Paris’, ‘rack’, Y, 4) Insert at RID = 5 TABLE 2 ∆ 2 Insert Value Table -1 STORE RID 5 > 0 + 2 SID 0 0 SID 3 3 ins del di 00 d 0 type ins type value i 2 i 1 value PROD NEW QTY i 0 Berlin Paris table rack Y 20 4 i 1 Berlin cloth Y 5 20 i 2 Berlin chair Y 10 5

PDT Example INSERT INTO inventory VALUES (‘London’, ‘rack’, Y, 4) INSERT INTO inventory VALUES

PDT Example INSERT INTO inventory VALUES (‘London’, ‘rack’, Y, 4) INSERT INTO inventory VALUES (‘Berlin’, ‘rack’, Y, 4) SID ∆ 1 3 Separator SIDs Subtree ∆ 1 ∆ 3 Running ∆ RID 4 Separator RIDs SID 0 2 1 ∆ 3 1 0 ∆ 2 ∆ 4 RID 2 RID 7 SID 0 0 SID 1 SID 3 3 type ins type ins del value i 2 i 1 value i 4 value i 3 value i 0 d 0 ∆ 0 1 ∆ 2 ∆ 3 ∆ 4 5 RID 0 1 RID 2 RID 4 RID 7 8

Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain

Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain of child PDT = SID domain of parent PDT generalization: lo • PDThi contains all differences in time [lo, hi] PDT t 2 t 3 t 0 vs PDT t 1 are consecutive t 2=t 1 PDT PDT t 2 PDT t 3 t 1 PDT PDT t 2 PDT PDT t 1 Table t 0

Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain

Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain of child PDT = SID domain of parent PDT generalization: lo • PDThi contains all differences in time [lo, hi] PDT t 2 t 3 t 0 vs PDT t 1 are consecutive aligned t 2=t 1 t 2=t 0 “same base” PDT Table

Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain

Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain of child PDT = SID domain of parent PDT generalization: lo • PDThi contains all differences in time [lo, hi] PDT t 2 t 3 t 0 vs PDT t 1 are consecutive aligned t 2=t 1 PDT t 2=t 0 “same base” overlapping [t 2, t 3] overlaps [t 0, t 1] PDT “uncomparable” / “incompatible” Table

Stacking for Isolation • ‘lock’ PDT down for further updates – Immutable read-PDT BIG:

Stacking for Isolation • ‘lock’ PDT down for further updates – Immutable read-PDT BIG: main memory resident • ‘stack’ empty PDT on top – Updateable write-PDT SMALL: L 2 cache resident – Note: PDTs are consecutive • once in a while changes are propagated – Propagate() operation • Requires consecutive PDTs TABLEx Write-PDT Propagate() Read-PDT Stable Table

Snapshot Isolation Transaction State Propagate() Trans PDT Copy Write-PDT TABLEx Write-PDT Read-PDT Stable Table

Snapshot Isolation Transaction State Propagate() Trans PDT Copy Write-PDT TABLEx Write-PDT Read-PDT Stable Table • Transaction creates snapshot copy of write -PDT • Updates go into trans. PDT • On commit, Propagate() trans-PDT into write-PDT

Optimistic Concurrency Control • Two concurrent transactions Trans A Trans PDT Trans B Trans

Optimistic Concurrency Control • Two concurrent transactions Trans A Trans PDT Trans B Trans PDT Copy Write-PDT TABLEx Write-PDT Read-PDT Stable Table

Optimistic Concurrency Control • Two concurrent transactions • A commits before B Trans A

Optimistic Concurrency Control • Two concurrent transactions • A commits before B Trans A Trans PDT Pr op Copy ag Write-PDT ate Trans B Trans PDT () TABLEx Write-PDT Read-PDT Stable Table Copy Write-PDT

Optimistic Concurrency Control Trans A Trans PDT Trans B Trans ) PDT e( t

Optimistic Concurrency Control Trans A Trans PDT Trans B Trans ) PDT e( t a ag Copy p o Pr Write-PDT TABLEx Write-PDT Read-PDT Stable Table • Two concurrent transactions • A commits before B • Can not commit B into modified write-PDT! – A changed RID enumeration

Optimistic Concurrency Control Trans PDT Trans A Trans PDT Consecutive! Trans B Trans PDT

Optimistic Concurrency Control Trans PDT Trans A Trans PDT Consecutive! Trans B Trans PDT Serialize() • Two concurrent transactions • A commits before B • Can not commit B into modified write-PDT! – A changed RID enumeration • Serialize(A, B) TABLEx Write-PDT Read-PDT Stable Table – Makes aligned PDTs consecutive – MAY FAIL!! trans abort = succeeds if no conflict = write set intersection

Optimistic Concurrency Control Trans B Trans PDT Serialize() Pr o pa ga te ()

Optimistic Concurrency Control Trans B Trans PDT Serialize() Pr o pa ga te () Trans A Trans PDT TABLEx Write-PDT Read-PDT Stable Table • Two concurrent transactions • A commits before B • Can not commit B into modified write-PDT! – A changed RID enumeration • Serialize(A, B) – Makes aligned PDTs consecutive – MAY FAIL!! trans abort = succeeds if no conflict = write set intersection Extend to any number of concurrent transactions by serializing against all PDTs of transactions that committed during its lifetime (a. k. a. backward looking OCC)

Concluding. . • PDTs speed-up differential update merging – Reduced I/O volume – Reduced

Concluding. . • PDTs speed-up differential update merging – Reduced I/O volume – Reduced CPU merge overhead • Tree structure – logarithmic lookup & maintenance of volatile RIDs – main operations: Merge(), Propagate(), Serialize() • PDTs are stackable, and capture Write-Set – Great structure for Snapshot Isolation • Formal definitions, algorithms and benchmarks in paper

Thank you!

Thank you!

Microbenchmarks

Microbenchmarks

TPCH-30

TPCH-30