CSCE 513 Computer Architecture Lecture 18 Warehouse Scale
- Slides: 17
CSCE 513 Computer Architecture
Lecture 18: Warehouse Scale Computing
Topics
- NUMA
- Distributed vs Multiprocessors
- Cache Coherency, Snoopy, MPI, MP
Readings:
- Chapter 5
November 29, 2017
Computer Architecture: A Quantitative Approach, Fifth Edition
Chapter 6: Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
Copyright © 2012, Elsevier Inc. All rights reserved.
Introduction
Warehouse-scale computer (WSC)
- Provides Internet services
  - Search, social networking, online maps, video sharing, online shopping, email, cloud computing, etc.
- Differences with HPC “clusters”:
  - Clusters have higher-performance processors and networks
  - Clusters emphasize thread-level parallelism; WSCs emphasize request-level parallelism
- Differences with datacenters:
  - Datacenters consolidate different machines and software into one location
  - Datacenters emphasize virtual machines and hardware heterogeneity in order to serve varied customers
CSCE 513 Fall 2016
Introduction
Important design factors for WSC:
- Cost-performance
  - Small savings add up
- Energy efficiency
  - Affects power distribution and cooling
  - Work per joule
- Dependability via redundancy
- Network I/O
- Interactive and batch processing workloads
- Ample computational parallelism is not a concern
  - Most jobs are totally independent
  - “Request-level parallelism”
- Operational costs count
  - Power consumption is a primary, not secondary, constraint when designing the system
- Scale and its opportunities and problems
  - Can afford to build customized systems, since WSCs require volume purchases
Programming Models and Workloads for WSCs
Batch processing framework: MapReduce
- Map: applies a programmer-supplied function to each logical input record
  - Runs on thousands of computers
  - Provides a new set of key-value pairs as intermediate values
- Reduce: collapses values using another programmer-supplied function
Programming Models and Workloads for WSCs
Example:
    map(String key, String value):
        // key: document name
        // value: document contents
        for each word w in value:
            EmitIntermediate(w, "1");  // Produce list of all words

    reduce(String key, Iterator values):
        // key: a word
        // values: a list of counts
        int result = 0;
        for each v in values:
            result += ParseInt(v);  // get integer from key-value pair
        Emit(AsString(result));
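The word-count pseudocode above can be tried on a single machine with a short Python sketch. This is only an illustration of the map, group-by-key, reduce flow. The names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and the in-memory dictionary stands in for the shuffle step that a real MapReduce runtime performs across many nodes.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: document name; value: document contents."""
    for word in value.split():
        # Emit one intermediate (word, "1") pair per occurrence.
        yield word, "1"


def reduce_fn(key, values):
    """key: a word; values: an iterable of counts stored as strings."""
    return str(sum(int(v) for v in values))


def run_mapreduce(documents):
    """Run map over every document, group by key, then reduce each group."""
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for word, count in map_fn(name, contents):
            intermediate[word].append(count)
    return {word: reduce_fn(word, counts) for word, counts in intermediate.items()}


# Two made-up documents used only for illustration.
counts = run_mapreduce({
    "a.txt": "the quick fox",
    "b.txt": "the lazy dog the end",
})
```

Because each call to `map_fn` depends only on its own document, the map phase could run on separate machines without coordination, which is the request-level and data-level independence the slides emphasize.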
Programming Models and Workloads for WSCs
MapReduce runtime environment schedules map and reduce tasks to WSC nodes
Availability:
- Use replicas of data across different servers
- Use relaxed consistency:
  - No need for all replicas to always agree
Workload demands
- Often vary considerably
Computer Architecture of WSC
WSCs often use a hierarchy of networks for interconnection
Each 19” rack holds 48 1U servers connected to a rack switch
Rack switches are uplinked to a switch higher in the hierarchy
- Uplink has 48 / n times lower bandwidth, where n = # of uplink ports
  - “Oversubscription”
- Goal is to maximize locality of communication relative to the rack
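The 48 / n oversubscription figure can be checked with a few lines of Python. This sketch assumes every server port and every uplink port runs at the same link speed, which the slide's ratio implies. The function names and the 1 Gbit/s figure are illustrative assumptions, not values from the slides.

```python
def oversubscription_ratio(servers_per_rack, uplink_ports):
    """Ratio of in-rack bandwidth to uplink bandwidth (48 / n on the slide)."""
    if uplink_ports <= 0:
        raise ValueError("uplink_ports must be positive")
    return servers_per_rack / uplink_ports


def per_server_uplink_gbps(link_gbps, servers_per_rack, uplink_ports):
    """Share of uplink bandwidth per server if all servers leave the rack at once."""
    return link_gbps * uplink_ports / servers_per_rack


# Hypothetical rack: 48 servers, 8 uplink ports, 1 Gbit/s links everywhere.
ratio = oversubscription_ratio(48, 8)
share = per_server_uplink_gbps(1.0, 48, 8)
```

With these assumed numbers, traffic that leaves the rack sees one sixth of the bandwidth available inside it, which is why software tries to keep communication within a rack.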
Computer Architecture of WSC
Storage options:
- Use disks inside the servers, or
- Network attached storage through InfiniBand
WSCs generally rely on local disks
Google File System (GFS) uses local disks and maintains at least three replicas
Computer Architecture of WSC
Array Switch
Switch that connects an array of racks
- Array switch should have 10X the bisection bandwidth of a rack switch
- Cost of an n-port switch grows as n^2
- Often utilize content addressable memory chips and FPGAs
Computer Architecture of WSC
Memory Hierarchy
Servers can access DRAM and disks on other servers using a NUMA-style interface
Physical Infrastructure and Costs of WSC
Location of WSC
- Proximity to Internet backbones, electricity cost, property tax rates, low risk from earthquakes, floods, and hurricanes
Power distribution
Physical Infrastructure and Costs of WSC
Cooling
- Air conditioning used to cool server room
- 64°F to 71°F
  - Keep temperature higher (closer to 71°F)
- Cooling towers can also be used
  - Minimum temperature is “wet bulb temperature”
Physical Infrastructure and Costs of WSC
Cooling system also uses water (evaporation and spills)
- E.g., 70,000 to 200,000 gallons per day for an 8 MW facility
Power cost breakdown:
- Chillers: 30-50% of the power used by the IT equipment
- Air conditioning: 10-20% of the IT power, mostly due to fans
How many servers can a WSC support?
- Each server:
  - “Nameplate power rating” gives maximum power consumption
  - To get actual, measure power under actual workloads
- Oversubscribe cumulative server power by 40%, but monitor power closely
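A rough sketch of the server-count arithmetic follows. It assumes one particular reading of the 40% figure: the operator installs 40% more servers than a strict per-server power budget would allow, relying on the fact that servers rarely all peak at once. The function name, the 1 MW budget, and the 500 W rating are made-up illustration values. Integer watts are used so the result is exact.

```python
def servers_supported(it_power_budget_w, per_server_w, oversubscription_pct=0):
    """Number of servers that fit in an IT power budget.

    per_server_w is the power charged to each server (for example the
    nameplate rating). oversubscription_pct is the extra fraction of
    servers installed beyond that strict budget, as a whole percent.
    """
    if per_server_w <= 0:
        raise ValueError("per_server_w must be positive")
    # Integer arithmetic: budget * (100 + pct) / (per_server * 100), rounded down.
    return (it_power_budget_w * (100 + oversubscription_pct)) // (per_server_w * 100)


# Hypothetical numbers: 1 MW of IT power, 500 W charged per server.
strict = servers_supported(1_000_000, 500)
oversubscribed = servers_supported(1_000_000, 500, oversubscription_pct=40)
```

Oversubscribing is only safe with the close power monitoring the slide calls for, since the installed servers could in principle draw more than the budget.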
Physical Infrastructure and Costs of WSC
Measuring Efficiency of a WSC
Power Utilization Effectiveness (PUE)
- = Total facility power / IT equipment power
- Median PUE in a 2006 study was 1.69
Performance
- Latency is an important metric because it is seen by users
- Bing study: users will use search less as response time increases
- Service Level Objectives (SLOs)/Service Level Agreements (SLAs)
  - E.g., 99% of requests be below 100 ms
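Both metrics on this slide reduce to simple ratios, sketched below in Python. The function names, the power figures, and the latency samples are assumptions made for illustration. Only the PUE definition and the "99% below 100 ms" objective come from the slide.

```python
def pue(total_facility_power_kw, it_equipment_power_kw):
    """PUE = total facility power / IT equipment power (always >= 1 in practice)."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_kw / it_equipment_power_kw


def meets_slo(latencies_ms, threshold_ms=100.0, required_percent=99):
    """True if at least required_percent of requests finished below threshold_ms."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    within = sum(1 for t in latencies_ms if t < threshold_ms)
    # Compare as integers to avoid floating-point edge cases at the boundary.
    return within * 100 >= required_percent * len(latencies_ms)


# Hypothetical facility drawing 1690 kW in total for 1000 kW of IT load.
example_pue = pue(1690.0, 1000.0)

# Hypothetical samples: 99 fast requests and 1 slow one meet the objective.
ok = meets_slo([50.0] * 99 + [250.0])
not_ok = meets_slo([50.0] * 98 + [250.0] * 2)
```

A PUE of 1.69 means that for every watt delivered to IT equipment, roughly another 0.69 W goes to cooling, power distribution, and other overhead.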
Physical Infrastructure and Costs of WSC
Cost of a WSC
Capital expenditures (CAPEX)
- Cost to build a WSC
Operational expenditures (OPEX)
- Cost to operate a WSC
Cloud Computing
WSCs offer economies of scale that cannot be achieved with a datacenter:
- 5.7 times reduction in storage costs
- 7.1 times reduction in administrative costs
- 7.3 times reduction in networking costs
This has given rise to cloud services such as Amazon Web Services
- “Utility Computing”
- Based on using open source virtual machine and operating system software