Making Fast Databases FAS TER andypavlo Fast Cheap
Making Fast Databases FAS TER @andy_pavlo
Fast + Cheap
Legacy Systems TPC-C New. Order 12. 3% 29. 6% CPU Cycles 10. 2% 18. 7% 21. 1% Real Work Buffer Pool Latching 8. 1% OLTP Through the Looking Gla and What We Found There SIGMOD 2008
OLTP Transactions Fast Repetitive Small
Memory • Parallel • Shared-N Transaction Processing H-Store: A High-Performance, Distribu Main Memory Transaction Processing VLDB vol. 1, issue 2, 2008
Procedure Stored Transaction Name Procedure Input Execution Result Parameters Client Application Database Cluster
txn/s TPC-C New. Order No Distributed Txns Partitions
Optimization #1: Partition database to reduce the number of Skew-Aware Automatic Database Partit in Shared-Nothing, Parallel OLTP Syste SIGMOD 2012
c_id CUSTOMER ITEM ORDERS c_w_i c_last i_id …c_w_i i_na c_last i_pric o_id c_id … d me e d 1001 5 1002 3 1003 12 1004 5 1005 6 1006 7 CUSTO MER ORDER S ITEM RZA 60351 1001 - XXX 5 4 GZA 1002 - 3 26792 Raekw 1003 - XXX 12 3 on 47538 Deck 1004 - XXX 5 6 Killah 1005 - 6 57894 XXX ODB 1006 7 5 47634 8 78428 5 XXX 23. 99 RZA o_c_i o_w_i d d … 78703 - 1004 5 - GZA 78704 19. 99 78705 -Raekw on 78706 14. 99 -Deck 78707 Killah 78708 9. 98 ODB - 1002 3 - 1006 7 - 1005 6 - 1003 12 - 103. 49 CUSTO XXX 69. 99 MER ORDER S ITEM - CUSTO MER ORDER S ITEM
? ? ? New. Order(5, “Method Man”, 1234) Client Application ? CUSTO MER ORDER S ITEM
DDL D DL Sche CUSTOMER -----ma ------ ITEM Workl oad ORDERS CUSTO MER ORDER S ITEM e d r O SELECT * FROM w e N WAREHOUSE SELECT * FROM r SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM 10; SELECT *WHERE FROM DISTRICT INSERT INTO DISTRICT D_W_ID INSERT = 10 INTO AND ORDERS = 10 AND D_ID =9; D_ID ORDERS =9; (O_W_ID, O_D_ID, INSERT INTO O_C_ID) VALUES ORDERS (10, 9, 12345); ORDERS CUSTO (O_W_ID, (10, 9, CUSTO 12345); O_D_ID, MER (O_W_ID, O_D_ID, ⋮ ORDER O_C_ID) VALUES S O_C_ID, …) S ⋮ ITEM (10, 9, 12345); ITEM VALUES Large-Neighorhood Search Algorithm CUSTO MER ORDER S ITEM …
D DL Sche ma Large. Neighborhood Search Restart -------- Workl oad nitial Design. Relaxation. Local Search
Throughput Horticulture State-of-the-Art (txn/s) TATP +88% TPC-C Ske +16% +183%
% Single-Partitioned Transactions Search Times TATP SEATS TPC-C Auction. Mark TPC-C Skewed TPC-E
? Undo Log Client Application ? ? Database Cluster ?
Optimization #2: Predict what txns will do before they execute. On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP VLDB, vol 5. issue 2, October 2011
» Partitions Touched? » Undo Log? » Done with Partitions? Client Application Database Cluster
Current State: Input Parameters: ? ? w_id=0 i_w_ids=[0, 1] i_ids=[1001, 1002] Get. Warehouse: SELECT * FROM WAREHOUSE WHERE W_ID = ?
Estimated Execution Path Input Parameters: w_id=0 i_w_ids=[0, 1] i_ids=[1001, 1002] Transaction Estimate Confidence Coefficient: Best Partition: Partitions Accessed: Use Undo Logging: 0. 96 0 {0} Yes
Throughput (txn/s) Houdini. Assume Single-Partitioned TATP +57% TPC-C Auction. M +126% +117%
Prediction Overhead TATP TPC-C Auction. Ma
Conclusion: Achieving fast performa more than just using on Future Work: Reduce distributed txn o through creative schedu
hhstore. cs. bro wn. edu store github. com/apavlo/h-stor
Help is Ava +1 -212 -939 -706 Graduate Student Abuse Hotline Available 24/7
- Slides: 29