Positional Update Handling in Column Stores Sndor Hman
- Slides: 29
Positional Update Handling in Column Stores Sándor Héman Marcin Zukowski Niels Nes Lefteris Sidirourgos Peter Boncz
UPDATE IN PLACE: A Poison Apple? Jim Gray, 1981 “. . for performance reasons, most disc-based systems have been seduced into updating the data in place. ” 30 years of hardware improvements in sequential/throughput beating random/latency…. in-place less feasible every year. alternative: differential approach. In column stores, in-place updating is by now clearly infeasible
Problem: Column Store Updates • I/O proportional to number of attributes – I/O blocks large and compressed – Sometimes even replicated – Read-Optimized Update-Unfriendly • Table often kept ordered on sort-key (SK) attributes – Uniform update load scattered write access
Solution: Differential Structure • Maintain updates (INS/DEL/MOD) in a differential structure – Merge with base table during scan
Solution: Differential Structure • Maintain updates (INS/DEL/MOD) in a differential structure – Merge with base table during scan • Challenges: – Efficiently maintainable data-structure – Minimize Merge impact for read-only queries
Naïve Approach: Delta Tables • For each table, maintain two update friendly row-store tables: – INS(C 1. . Cn) – DEL(SK 1. . SKm) – MOD = DEL + INS Base table: inventory Sort-Key (SK): [store, prod] store prod new qty Berlin chair Y 5 Berlin cloth Y 20 qty London stool N 10 London table N 20 Paris rug N 1 Paris stool N 5 Inserts table: INS store new Deletes table: DEL store Paris prod rug
Naïve Approach: Delta Tables • Rewrite table scans: Merge. Union[store, prod](Scan(INS), Merge. Diff[store, prod]( Scan(Inventory), Scan(DEL))) Base table: inventory Sort-Key (SK): [store, prod] store prod new qty Berlin chair Y 5 Berlin cloth Y 20 qty London stool N 10 London table N 20 Paris rug N 1 Paris stool N 5 Inserts table: INS store new Deletes table: DEL store Paris prod rug
Naïve Approach: Delta Tables • Rewrite table scans: Merge. Union[store, prod](Scan(INS), Merge. Diff[store, prod]( Scan(Inventory), Scan(DEL))) for up-to-date image • Expensive! Actual table: inventory Sort-Key (SK): [store, prod] store prod new qty Berlin chair Y 5 Berlin cloth Y 20 London stool N 10 – I/O to scan SK ‘merge’ London table N columns; also if query Paris stool N does not need SK cols – Each query pays CPU effort to locate the same change positions over and over again 20 5
The Idea: Positional Updates • Remember the position of an update rather than its SK values – Merge once at write Read-Optimized approach – No need to scan SK columns – Scan skip less CPU overhead Notation: • TABLEx state of TABLE at time x • SID(t): Stable. ID – Position of tuple t in immutable base TABLE 0 Stable • RIDx(t): Row. ID – Position of visible tuple t at time x VOLATILE! – SID(t) = RID 0(t)
SID/RID Example SID STORE PROD NEW QTY RID 0 London chair N 30 0 1 London stool N 10 1 2 London table N 20 2 3 Paris rug N 1 3 4 Paris stool N 5 4 INSERT INTO inventory VALUES(‘Berlin’, ‘table’, Y, 10) INSERT INTO inventory VALUES(‘Berlin’, ‘cloth’, Y, 20) INSERT INTO inventory VALUES(‘Berlin’, ‘chair’, Y, 5) TABLE 0 SID STORE PROD NEW QTY 0 Berlin chair Y 5 0 0 Berlin cloth Y 20 1 0 Berlin table Y 10 2 0 London chair N 30 3 1 London stool N 10 4 2 London table N 20 5 3 Paris rug N 1 6 4 Paris stool N 5 7 TABLE 1 RID
SIDs and RIDs • RID(t) • ∆(t) = SID(t) + ∆(t) = #inserts before t – #deletes before t = RID(t) – SID(t) • SID and RID are monotonically increasing – organize positional updates on SID in a counting B-Tree that keeps track cumulative deltas (∆) • Positional Delta Tree (PDT) – SIDs are stable – Only need to maintain cumulative ∆ on path root leaf
PDT Example SID STORE PROD NEW QTY RID 0 London chair N 30 0 1 London stool N 10 1 2 London table N 20 2 3 Paris rug N 1 3 4 Paris stool N 5 4 SID 0 INSERT INTO inventory VALUES(‘Berlin’, ‘table’, Y, 10) INSERT INTO inventory VALUES(‘Berlin’, ‘cloth’, Y, 20) INSERT INTO inventory VALUES(‘Berlin’, ‘chair’, Y, 5) TABLE 0 ∆ 2 Insert Value Table 1 STORE SID 0 0 SID 0 ins i 0 type ins type value i 2 i 1 value PROD NEW QTY i 0 Berlin table Y 10 i 1 Berlin cloth Y 20 i 2 Berlin chair Y 5
PDT Example SID STORE PROD NEW QTY RID 0 Berlin chair Y 5 0 0 Berlin cloth Y 20 1 0 Berlin table Y 10 2 0 London chair N 30 3 1 London stool N 10 4 2 London table N 20 5 3 Paris rug N 1 6 4 Paris stool N 5 7 SID 0 DELETE FROM inventory WHERE store = ‘Berlin’ AND prod = ‘table’ DELETE FROM inventory WHERE store = ‘Paris’ AND prod = ‘rug’ TABLE 1 ∆ 2 Insert Value Table -1 STORE SID 0 0 SID 3 del d 0 type ins type value i 2 i 1 value PROD NEW QTY i 0 Berlin table Y 10 i 1 Berlin cloth Y 20 i 2 Berlin chair Y 5
PDT Example SID STORE PROD NEW QTY RID 0 Berlin chair Y 5 0 0 Berlin cloth Y 20 1 0 London chair N 30 2 1 London stool N 10 3 2 London table N 20 4 4 Paris stool N 5 5 SID 0 INSERT INTO inventory VALUES (‘Paris’, ‘rack’, Y, 4) Insert at RID = 5 TABLE 2 ∆ 2 Insert Value Table -1 STORE RID 5 > 0 + 2 SID 0 0 SID 3 3 ins del di 00 d 0 type ins type value i 2 i 1 value PROD NEW QTY i 0 Berlin Paris table rack Y 20 4 i 1 Berlin cloth Y 5 20 i 2 Berlin chair Y 10 5
PDT Example INSERT INTO inventory VALUES (‘London’, ‘rack’, Y, 4) INSERT INTO inventory VALUES (‘Berlin’, ‘rack’, Y, 4) SID ∆ 1 3 Separator SIDs Subtree ∆ 1 ∆ 3 Running ∆ RID 4 Separator RIDs SID 0 2 1 ∆ 3 1 0 ∆ 2 ∆ 4 RID 2 RID 7 SID 0 0 SID 1 SID 3 3 type ins type ins del value i 2 i 1 value i 4 value i 3 value i 0 d 0 ∆ 0 1 ∆ 2 ∆ 3 ∆ 4 5 RID 0 1 RID 2 RID 4 RID 7 8
Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain of child PDT = SID domain of parent PDT generalization: lo • PDThi contains all differences in time [lo, hi] PDT t 2 t 3 t 0 vs PDT t 1 are consecutive t 2=t 1 PDT PDT t 2 PDT t 3 t 1 PDT PDT t 2 PDT PDT t 1 Table t 0
Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain of child PDT = SID domain of parent PDT generalization: lo • PDThi contains all differences in time [lo, hi] PDT t 2 t 3 t 0 vs PDT t 1 are consecutive aligned t 2=t 1 t 2=t 0 “same base” PDT Table
Stacking PDTs • Arbitrary number of layers: “deltas on. . ” – RID domain of child PDT = SID domain of parent PDT generalization: lo • PDThi contains all differences in time [lo, hi] PDT t 2 t 3 t 0 vs PDT t 1 are consecutive aligned t 2=t 1 PDT t 2=t 0 “same base” overlapping [t 2, t 3] overlaps [t 0, t 1] PDT “uncomparable” / “incompatible” Table
Stacking for Isolation • ‘lock’ PDT down for further updates – Immutable read-PDT BIG: main memory resident • ‘stack’ empty PDT on top – Updateable write-PDT SMALL: L 2 cache resident – Note: PDTs are consecutive • once in a while changes are propagated – Propagate() operation • Requires consecutive PDTs TABLEx Write-PDT Propagate() Read-PDT Stable Table
Snapshot Isolation Transaction State Propagate() Trans PDT Copy Write-PDT TABLEx Write-PDT Read-PDT Stable Table • Transaction creates snapshot copy of write -PDT • Updates go into trans. PDT • On commit, Propagate() trans-PDT into write-PDT
Optimistic Concurrency Control • Two concurrent transactions Trans A Trans PDT Trans B Trans PDT Copy Write-PDT TABLEx Write-PDT Read-PDT Stable Table
Optimistic Concurrency Control • Two concurrent transactions • A commits before B Trans A Trans PDT Pr op Copy ag Write-PDT ate Trans B Trans PDT () TABLEx Write-PDT Read-PDT Stable Table Copy Write-PDT
Optimistic Concurrency Control Trans A Trans PDT Trans B Trans ) PDT e( t a ag Copy p o Pr Write-PDT TABLEx Write-PDT Read-PDT Stable Table • Two concurrent transactions • A commits before B • Can not commit B into modified write-PDT! – A changed RID enumeration
Optimistic Concurrency Control Trans PDT Trans A Trans PDT Consecutive! Trans B Trans PDT Serialize() • Two concurrent transactions • A commits before B • Can not commit B into modified write-PDT! – A changed RID enumeration • Serialize(A, B) TABLEx Write-PDT Read-PDT Stable Table – Makes aligned PDTs consecutive – MAY FAIL!! trans abort = succeeds if no conflict = write set intersection
Optimistic Concurrency Control Trans B Trans PDT Serialize() Pr o pa ga te () Trans A Trans PDT TABLEx Write-PDT Read-PDT Stable Table • Two concurrent transactions • A commits before B • Can not commit B into modified write-PDT! – A changed RID enumeration • Serialize(A, B) – Makes aligned PDTs consecutive – MAY FAIL!! trans abort = succeeds if no conflict = write set intersection Extend to any number of concurrent transactions by serializing against all PDTs of transactions that committed during its lifetime (a. k. a. backward looking OCC)
Concluding. . • PDTs speed-up differential update merging – Reduced I/O volume – Reduced CPU merge overhead • Tree structure – logarithmic lookup & maintenance of volatile RIDs – main operations: Merge(), Propagate(), Serialize() • PDTs are stackable, and capture Write-Set – Great structure for Snapshot Isolation • Formal definitions, algorithms and benchmarks in paper
Thank you!
Microbenchmarks
TPCH-30
- Management of stores
- Is an alternative of log based recovery
- Long column vs short column
- Binary table 1 to 100
- Positional and directional terms
- Simple interest per annum formula
- Which excess notation is used to represent zero as 10000
- For counting the number of positional parameters
- Benign paroxysmal positional vertigo
- Asprs accuracy standards
- Python inverted index
- Skip pointers in information retrieval
- Positional number system
- Head halter traction definition
- Evolution of the contemporary political pattern
- Positional neutralization
- Attack hit fault hand signal in volleyball
- Positional hypermetropia
- Positional notation examples
- Positional cloning
- Positional combining form meaning lateral
- Positional and directional terms
- Homicidal
- Latent hyperopia
- Anatomical position body planes
- Terra 53 cjg
- Hviewer
- Stores minerals and anchors muscles
- What stores information in a cell?
- Inputs and outputs of drainage basin