To Share or Not to Share Ryan Johnson
- Slides: 28
To Share or Not to Share?
Ryan Johnson, Nikos Hardavellas, Ippokratis Pandis, Naju Mancheril, Stavros Harizopoulos**, Kivanc Sabirli, Anastasia Ailamaki, Babak Falsafi
PARALLEL DATA LABORATORY, Carnegie Mellon University
**HP LABS
Motivation For Work Sharing
• Query: What is the average GPA in the ECE dept.?
• Query: What is the highest undergraduate GPA?
[Figure: query plan with operators scan Student, scan Dept, join, aggregate, output]
http://www.pdl.cmu.edu/ Ryan Johnson ©
Motivation For Work Sharing
• Many queries in system
  • Similar requests
  • Redundant work
• Work Sharing
  • Detect redundant work
  • Compute results once and share
• Big win for I/O, uniprocessors
  • 2x speedup for TPC-H queries [hariz 05]
[Figure: query plan with operators scan Student, scan Dept, join, aggregate, output]
Work Sharing on Modern Hardware
[Figure: speedup due to work sharing (y-axis, 0.0 to 2.0) vs. shared queries (x-axis, 0 to 45), for 1 CPU and 8 CPU; inset: multicore diagram with cores, L1 and L2 caches, and memory]
• Work sharing can hurt performance!
Contributions
• Observation
  • Work sharing can hurt performance on parallel hardware
• Analysis
  • Develop intuitive analytical model of work sharing
  • Identify trade-off between total work, critical path
• Application
  • Model-based policy outperforms static ones by up to 6x
Outline
• Introduction
• Part I: Intuition and Model
• Part II: Analysis and Experiments
Challenges of Exploiting Work Sharing
• Independent execution only?
  • Load reduction from work sharing can be useful
• Work sharing only?
  • Indiscriminate application can hurt performance
• To share or not to share?
  • System and workload dependent
  • Adapt decisions at runtime
• Must understand work sharing to exploit it fully
Work Sharing vs. Parallelism
[Figure: independent execution of Query 1 and Query 2 (scan, join, aggregate stages), marking each query's critical path and response time; P = 4.33]
Work Sharing vs. Parallelism
[Figure: shared execution of Query 1 and Query 2 (scan, join, aggregate stages); P = 4.33 under independent execution vs. P = 2.75 under shared execution; the critical path is now longer, adding a response-time penalty]
• Total work and critical path both important
Understanding Work Sharing
• Performance depends on two factors: total work and critical path
• Work sharing presents a trade-off
  • Reduces total work
  • Potentially lengthens critical path
• Balance both factors or performance suffers
Basis for a Model
• “Closed” system
  • Consistent high load
  • Throughput computing
  • Assumed in most benchmarks
  • Fixed number of clients
• Little’s Law governs throughput
  • Higher response time = lower throughput
  • Total work not a direct factor!
• Load reduction secondary to response time
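The link between response time and throughput in a closed system can be sketched with Little's Law (clients = throughput × cycle time). This is a minimal illustration rather than code from the talk: the function name, the optional think-time parameter, and the sample numbers are all assumptions made for the example.

```python
def closed_system_throughput(clients, response_time, think_time=0.0):
    """Little's Law for a closed system: clients = throughput * (response_time + think_time)."""
    cycle = response_time + think_time
    if clients <= 0 or cycle <= 0:
        raise ValueError("clients and cycle time must be positive")
    return clients / cycle


# Hypothetical workload: 20 clients, first with 4 s queries, then with 8 s queries.
fast = closed_system_throughput(20, 4.0)
slow = closed_system_throughput(20, 8.0)
print(fast, slow)
```

With the client count fixed, doubling the response time halves the throughput, which is why the model focuses on response time rather than on total work.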
Predicting Response Time
• Case 1: Compute-bound
• Case 2: Critical path-bound
• Larger bottleneck determines response time
• Model provides u and pmax
An Analytical Model of Work Sharing
• Model gives throughput X for m queries and n processors
  • u = requested utilization (improved by work sharing)
  • pmax = longest pipe stage (potentially worsened by work sharing)
• Sharing helpful when Xshared > Xalone
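The transcript does not preserve the model's equation, so the following is only a sketch of one simplified form that is consistent with the two quantities named on the slide. It assumes one thread per pipeline stage, a peak rate of 1/p_max, and a proportional slowdown when the requested utilization u exceeds the available processors. The function names, stage costs, and the per-consumer cost added to the shared stage are hypothetical.

```python
def group_rate(stage_costs, cpus):
    """Passes per second for one pipelined group that is given `cpus` processors.

    stage_costs: CPU-seconds each stage needs per pass (assumed one thread per stage).
    """
    p_max = max(stage_costs)          # longest pipe stage bounds the peak rate
    u = sum(stage_costs) / p_max      # processors requested to run at the peak rate
    return (1.0 / p_max) * min(1.0, cpus / u)


def throughput_alone(query_stages, m, n):
    # Assumption: m independent copies of the query split n processors evenly.
    return m * group_rate(query_stages, n / m)


def throughput_shared(shared_stages, m, n):
    # Assumption: one shared group; each pass completes all m queries.
    return m * group_rate(shared_stages, n)


m = 4
query_stages = [6.0, 3.0, 1.0]        # hypothetical scan, join, aggregate costs
# Shared plan: one scan that pays 1.0 extra per consumer, plus per-query join/aggregate.
shared_stages = [6.0 + 1.0 * m] + [3.0] * m + [1.0] * m

for n in (1, 8):
    helpful = throughput_shared(shared_stages, m, n) > throughput_alone(query_stages, m, n)
    print(n, "CPUs: sharing helpful =", helpful)
```

Under these assumed numbers sharing wins on one processor, where total work dominates, and loses on eight, where the longer shared stage becomes the bottleneck.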
Outline
• Introduction
• Part I: Intuition and Model
• Part II: Analysis and Experiments
Experimental Setup
• Hardware
  • Sun T2000 “Niagara” with 16 GB RAM
  • 8 cores (32 threads)
  • Solaris processor sets vary effective CPU count
• Cordoba
  • Staged DBMS
  • Naturally exposes work sharing
  • Flexible work sharing policies
• 1 GB TPC-H dataset
• Fixed client and CPU counts per run
Model Validation: TPC-H Q1
[Figure: Predicted vs. Measured Performance; speedup due to work sharing (y-axis, 0 to 1.4) vs. shared queries (x-axis, 0 to 45), with model curves for 1, 2, 8, and 32 CPUs]
• Avg/max error: 5.7% / 22%
Model Validation: TPC-H Q4
• Behavior varies with both system and workload
Exploring WS vs. Parallelism
• Work sharing splits query into three parts
  • Independent work
    – Per-query, parallel
    – Total work
  • Serial work
    – Per-query, serial
    – Critical path
  • Shared work
    – Computed once
    – “Free” after first query
• Example: Independent 37%, Serial 4%, Shared 59%
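The three-way split above can be turned into a rough speedup estimate. The sketch below uses the example fractions from the slide (37% independent, 4% serial, 59% shared) but the timing formula is an assumption, not the talk's model: a batch of m queries on n processors is taken to finish in max(total work / n, critical path), with the serial work of every sharer added to the shared critical path.

```python
def potential_speedup(m, n, independent=0.37, serial=0.04, shared=0.59):
    """Estimated speedup of sharing m queries on n CPUs (one query's work normalized to 1)."""
    # Independent execution: every query repeats all three parts.
    alone_work = m * (independent + serial + shared)
    alone_path = shared + serial
    # Shared execution: shared part runs once, serial part once per sharer, back to back.
    shared_work = m * (independent + serial) + shared
    shared_path = shared + m * serial
    time_alone = max(alone_work / n, alone_path)
    time_shared = max(shared_work / n, shared_path)
    return time_alone / time_shared


for n in (1, 8, 32):
    print(n, [round(potential_speedup(m, n), 2) for m in (1, 8, 16, 32)])
```

In this sketch a single processor approaches a speedup of 1 / (0.37 + 0.04), while with 32 processors the accumulated serial work on the critical path pushes the estimate below 1 as more queries share.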
Exploring WS vs. Parallelism
[Figure: Benefit from Work Sharing; potential speedup (y-axis, 0 to 2.5) vs. shared queries (x-axis, 0 to 32), one curve per CPU count, up to 32 CPUs]
• Behavior matches previously published results
Exploring WS vs. Parallelism
[Figure: Benefit from Work Sharing; potential speedup (y-axis, 0 to 2.5) vs. shared queries (x-axis, 0 to 32), one curve per CPU count, up to 8 CPUs; saturated region marked]
Exploring WS vs. Parallelism
[Figure: Benefit from Work Sharing; potential speedup (y-axis, 0 to 2.5) vs. shared queries (x-axis, 0 to 32), one curve per CPU count, up to 32 CPUs; saturated region marked]
Exploring WS vs. Parallelism
[Figure: Benefit from Work Sharing; potential speedup (y-axis, 0 to 2.5) vs. shared queries (x-axis, 0 to 32), one curve per CPU count, up to 32 CPUs; saturated region marked]
• More processors shift bottleneck to critical path
Performance Impact of Serial Work (32 CPU)
[Figure: benefit from work sharing (y-axis, 0 to 2.5) vs. shared queries (x-axis, 0 to 40), with curves for serial work of 0%, 1%, 2%, and 7%]
• Critical path quickly becomes major bottleneck
Model-guided Work Sharing
• Integrate predictive model into Cordoba
  • Predict benefit of work sharing for each new query
  • Consider multiple groups of queries at once
    – Shorter critical path, increased parallelism
• Experimental setup
  • Profile run with 2 clients, 2 CPUs
  • Extract model parameters with profiling tools
  • 20 clients submit mix of TPC-H Q1 and Q4
  • Compare against always-, never-share policies
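The idea of considering multiple sharing groups can be sketched as a search over the number of groups. This is a hypothetical policy written for illustration, not Cordoba's implementation: it reuses the example work split from the earlier slide (37% independent, 4% serial, 59% shared) and assumes a batch finishes in max(total work / n, critical path). One group corresponds to always-share and m groups to never-share.

```python
import math


def batch_time(m, groups, n, independent=0.37, serial=0.04, shared=0.59):
    """Estimated time to finish m queries split into `groups` sharing groups on n CPUs."""
    largest = math.ceil(m / groups)               # size of the biggest group
    total_work = m * (independent + serial) + groups * shared
    critical_path = shared + largest * serial     # serial work of each sharer adds up
    return max(total_work / n, critical_path)


def best_group_count(m, n):
    # Try every group count from always-share (1) to never-share (m).
    return min(range(1, m + 1), key=lambda g: batch_time(m, g, n))


for n in (1, 8, 32):
    print(n, "CPUs ->", best_group_count(20, n), "group(s) for 20 queries")
```

Under these assumptions the preferred grouping moves from a single shared group on one processor toward smaller groups as processors are added, trading extra total work for a shorter critical path.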
Comparison of Work Sharing Strategies
[Figure: throughput (queries/min) of the always share, model guided, and never share policies for query ratios All Q1, 50/50, and All Q4; panels: 2 CPU (y-axis, 0 to 250) and 32 CPU (y-axis, 0 to 200)]
• Model-based policy balances critical path and load
Related Work
• Many existing work sharing schemes
  • Identification occurs at different stages in the query’s lifetime
  • All allow pipelined query execution
• Schemes, ordered from early to late identification
  • Materialized Views [rouss 82]: schema design
  • Multiple Query Optimization [roy 00]: query compilation
  • Staged DBMS [hariz 05]: query execution
  • Synchronized Scanning [lang 07]: buffer pool access
• Model describes all types of work sharing
Conclusions
• Work sharing can hurt performance
  • Highly parallel, memory resident machines
• Intuitive analytical model captures behavior
  • Trade-off between load reduction and critical path
• Model-guided work sharing highly effective
  • Outperforms static policies by up to 6x
http://www.cs.cmu.edu/~StagedDB/
References
• [hariz 05] S. Harizopoulos, V. Shkapenyuk, and A. Ailamaki. “QPipe: A Simultaneously Pipelined Relational Query Engine.” In Proc. SIGMOD, 2005.
• [lang 07] C. Lang, B. Bhattacharjee, T. Malkemus, S. Padmanabhan, and K. Wong. “Increasing Buffer-Locality for Multiple Relational Table Scans through Grouping and Throttling.” In Proc. ICDE, 2007.
• [rouss 82] N. Roussopoulos. “View Indexing in Relational Databases.” In ACM TODS, 7(2): 258-290, 1982.
• [roy 00] P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. “Efficient and Extensible Algorithms for Multi Query Optimization.” In Proc. SIGMOD, 2000.