Cloud Computing CS 15-319
Dryad and GraphLab
Lecture 11, Feb 22, 2012
Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Today…
• Last session
  • Pregel
• Today’s session
  • Dryad and GraphLab
• Announcement:
  • Project Phases I-A and I-B are due today
© Carnegie Mellon University in Qatar

Objectives
Discussion on Programming Models
• Why parallelism?
• Parallel computer architectures
• Traditional models of parallel programming
• Examples of parallel processing
• Message Passing Interface (MPI)
• MapReduce
• Pregel, Dryad and GraphLab
[Slide label: Last 3 Sessions]

Dryad

Dryad
• In this part, the following concepts of Dryad will be described:
  • Dryad Model
  • Dryad Organization
  • Dryad Description Language and An Example Program
  • Fault Tolerance in Dryad

Dryad
• Dryad is a general-purpose, high-performance, distributed computation engine
• Dryad is designed for:
  • High throughput
  • Data-parallel computation
  • Use in a private datacenter
• Computation is expressed as a directed acyclic graph (DAG)
  • Vertices represent programs
  • Edges represent data channels between vertices
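The DAG model can be illustrated with a minimal Python sketch. It is not Dryad's API: the function `run_dag`, the vertex names and the use of in-memory lists as channels are all assumptions made for illustration. Each vertex is a small program (here a function), each edge is a data channel, and a vertex runs only after every producer it depends on has finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(vertices, edges, sources):
    """Run each vertex once all of its inputs are ready.

    vertices: name -> function taking a list of input values (hypothetical)
    edges:    list of (producer, consumer) pairs, i.e. data channels
    sources:  name -> initial input for vertices that have no producer
    """
    preds = {name: [] for name in vertices}
    for producer, consumer in edges:
        preds[consumer].append(producer)

    outputs = {}
    # static_order() yields a vertex only after all of its predecessors.
    for name in TopologicalSorter(preds).static_order():
        inputs = [outputs[p] for p in preds[name]] or [sources[name]]
        outputs[name] = vertices[name](inputs)
    return outputs


vertices = {
    "read_a": lambda ins: ins[0],
    "read_b": lambda ins: ins[0],
    "merge": lambda ins: sorted(ins[0] + ins[1]),
    "count": lambda ins: len(ins[0]),
}
edges = [("read_a", "merge"), ("read_b", "merge"), ("merge", "count")]
result = run_dag(vertices, edges, {"read_a": [3, 1], "read_b": [2]})
```

Unlike a Unix pipeline, the `merge` vertex here has two inputs, which is the kind of structure a DAG allows and a linear pipe does not.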

Unix Pipes vs. Dryad DAG

Dryad Job Structure
grep 1000 | sed 500 | sort 1000 | awk 500 | perl 50
[Figure: job graph in which input files feed stages of grep, sed, sort, awk and perl vertices (processes), connected by channels and ending in output files]

Dryad System Organization
• There are 3 roles for machines in Dryad
  • Job Manager (JM)
  • Name Server (NS)
  • Daemon (D)

Program Execution (1)
• The Job Manager (JM):
  • Creates the job communication graph (job schedule)
  • Contacts the NS to determine the number of Ds and the topology
  • Assigns vertices (Vs) to each D for execution, using a simple task scheduler (not described here)
  • Coordinates data flow through the data plane
• Data is distributed using a distributed storage system that shares some properties with the Google File System (e.g., data is split into chunks and replicated across machines)
• Dryad also supports the use of NTFS for accessing files locally

Program Execution (2)
1. Build
2. Send .exe
3. Start JM
4. Query cluster resources
5. Generate graph
6. Initialize vertices
7. Serialize vertices
8. Monitor vertex execution
[Figure labels: JM code, vertex code, cluster services]

Data Channels in Dryad
• Data items can be shuffled between vertices through data channels
• Data channels can be:
  • Shared-memory FIFOs (intra-machine)
  • TCP streams (inter-machine)
  • SMB/NTFS local files (temporary)
  • Distributed file system (persistent)
• The performance and fault tolerance of these mechanisms vary
• Data channels are abstracted for maximum flexibility

Dryad Graph Description Language
• Here are some operators in the Dryad graph description language:
  • AS = A^n (Cloning)
  • AS >= BS (Pointwise Composition)
  • AS >> BS (Bipartite Composition)
  • (B>=C) || (B>=D) (Merge)
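The four operators can be modelled with a small Python class. This is a simplified sketch and not Dryad's C++ implementation: the `Graph` class, its fields and the round-robin rule for unequal sizes are assumptions for illustration. Python has no `||` operator, so merge is written `|` here, and because Python chains comparison operators, compositions should be parenthesized rather than written as `a >= b >= c`.

```python
class Graph:
    """Toy model of a Dryad graph: vertex names plus directed edges."""

    def __init__(self, vertices, edges=(), inputs=None, outputs=None):
        self.vertices = list(vertices)
        self.edges = list(edges)
        # Vertices still open for incoming / outgoing connections.
        self.inputs = list(vertices) if inputs is None else list(inputs)
        self.outputs = list(vertices) if outputs is None else list(outputs)

    def __xor__(self, n):
        """A ^ n (cloning): n independent copies of each vertex."""
        return Graph([f"{v}[{i}]" for v in self.vertices for i in range(n)])

    def _compose(self, other, pairs):
        return Graph(self.vertices + other.vertices,
                     self.edges + other.edges + pairs,
                     self.inputs, other.outputs)

    def __ge__(self, other):
        """A >= B (pointwise): connect outputs to inputs one by one."""
        outs, ins = self.outputs, other.inputs
        k = max(len(outs), len(ins))
        pairs = [(outs[i % len(outs)], ins[i % len(ins)]) for i in range(k)]
        return self._compose(other, pairs)

    def __rshift__(self, other):
        """A >> B (bipartite): connect every output to every input."""
        pairs = [(o, i) for o in self.outputs for i in other.inputs]
        return self._compose(other, pairs)

    def __or__(self, other):
        """A | B (merge): union of vertices and edges, duplicates dropped."""
        def union(xs, ys):
            return list(dict.fromkeys(xs + ys))
        return Graph(union(self.vertices, other.vertices),
                     union(self.edges, other.edges),
                     union(self.inputs, other.inputs),
                     union(self.outputs, other.outputs))


A = Graph(["A"]) ^ 2
B = Graph(["B"]) ^ 2
pointwise = A >= B
bipartite = A >> B
fork = (Graph(["B"]) >= Graph(["C"])) | (Graph(["B"]) >= Graph(["D"]))
```

In the `fork` example the shared vertex `B` appears once in the merged graph, with edges to both `C` and `D`, which mirrors the (B>=C) || (B>=D) case on the slide.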

Example Program in Dryad (1)
• SkyServer SQL Query (Q18):
  • Find all the objects in the database that have neighboring objects within 30 arc seconds such that at least one of the neighbors has a color similar to the primary object’s color
• There are two tables involved
  • photoObjAll, which has 354,254,163 records
  • Neighbors, which has 2,803,165,372 records
• For the equivalent Dryad computation, the Dryad authors extracted the columns of interest into two binary files, “ugriz.bin” and “neighbors.bin”
[Figure: Dryad query graph with vertex stages U, N, X, D, M, S, Y, L and H]

Example Program in Dryad (2)
• Took SQL plan
• Manually coded in Dryad
• Manually partitioned data

Query plan, in execution order:
  u: objid, color
  n: objid, neighborobjid
  [partition by objid]
  select u.color, n.neighborobjid from u join n where u.objid = n.objid
  (u.color, n.neighborobjid)
  [re-partition by n.neighborobjid]
  [order by n.neighborobjid]
  select u.objid from u join <temp> where u.objid = <temp>.neighborobjid and |u.color - <temp>.color| < d
  [distinct]
  [merge outputs]

Example Program in Dryad (3)
• Here is the corresponding Dryad code:

    GraphBuilder XSet = moduleX^N;
    GraphBuilder DSet = moduleD^N;
    GraphBuilder MSet = moduleM^(N*4);
    GraphBuilder SSet = moduleS^(N*4);
    GraphBuilder YSet = moduleY^N;
    GraphBuilder HSet = moduleH^1;

    GraphBuilder XInputs = (ugriz1 >= XSet) || (neighbor >= XSet);
    GraphBuilder YInputs = ugriz2 >= YSet;

    GraphBuilder XToY = XSet >= DSet >> MSet >= SSet;
    for (i = 0; i < N*4; ++i)
    {
        XToY = XToY || (SSet.GetVertex(i) >= YSet.GetVertex(i/4));
    }

    GraphBuilder YToH = YSet >= HSet;
    GraphBuilder HOutputs = HSet >= output;

    GraphBuilder final = XInputs || YInputs || XToY || YToH || HOutputs;

Fault Tolerance in Dryad (1)
• Dryad is designed to handle two types of failures:
  • Vertex failures
  • Channel failures
• Vertex failures are handled by the JM: the failed vertex is re-executed on another machine
• Channel failures cause the preceding vertex to be re-executed
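Vertex re-execution can be sketched as a retry loop. This is an illustration only, not the Job Manager's actual logic: `run_with_retries`, `flaky_vertex`, the machine names and the choice of `RuntimeError` as the failure signal are all hypothetical.

```python
def run_with_retries(vertex, machines, max_attempts=3):
    """Try a vertex on successive machines until one attempt succeeds.

    Returns the vertex result together with the list of failed attempts.
    """
    failures = []
    for attempt in range(max_attempts):
        machine = machines[attempt % len(machines)]
        try:
            return vertex(machine), failures
        except RuntimeError as err:
            # Record the failure and move on to the next machine.
            failures.append((machine, str(err)))
    raise RuntimeError(f"vertex failed {max_attempts} times: {failures}")


def flaky_vertex(machine):
    """Hypothetical vertex program that always fails on machine m0."""
    if machine == "m0":
        raise RuntimeError("disk error")
    return f"ok on {machine}"


result, failures = run_with_retries(flaky_vertex, ["m0", "m1", "m2"])
```

Re-running a vertex is safe in this model because a vertex only reads its input channels and writes its own outputs, so a second attempt produces the same data as the first would have.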

Fault Tolerance in Dryad (2)
• X[0], X[1], X[3]: completed vertices
• X[2]: slow vertex
• X’[2]: duplicate vertex
• Duplication Policy = f(running times, data volumes)
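The slide only says that the duplication policy is a function of running times and data volumes. One possible policy, shown purely as an assumption, flags a running vertex when its time per byte is well above the median of the vertices that have already completed in the same stage. The function name `pick_duplicates` and the `slowdown` threshold are hypothetical.

```python
from statistics import median


def pick_duplicates(completed, running, slowdown=2.0):
    """Return names of running vertices that look like stragglers.

    completed: list of (seconds, bytes) for finished vertices of the stage
    running:   name -> (seconds_so_far, bytes) for unfinished vertices
    slowdown:  how many times slower than typical counts as a straggler
    """
    if not completed:
        return []  # nothing to compare against yet
    typical = median(sec / size for sec, size in completed)
    return [name for name, (sec, size) in running.items()
            if sec / size > slowdown * typical]


completed = [(10.0, 100), (12.0, 100), (11.0, 100)]  # X[0], X[1], X[3]
to_duplicate = pick_duplicates(completed, {"X[2]": (40.0, 100)})
not_yet = pick_duplicates(completed, {"X[2]": (15.0, 100)})
```

Normalizing by data volume keeps a vertex with a large input from being flagged merely because it has more work to do.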

GraphLab

GraphLab
• In this part, the following concepts of GraphLab will be described:
  • Motivation for GraphLab
  • Data Model and Update Mechanisms
  • Scheduling in GraphLab
  • Consistency Models in GraphLab
  • PageRank in GraphLab

Motivation for GraphLab
• Shortcomings of MapReduce
  • Interdependent data computation is difficult to perform
  • Overheads of running jobs iteratively: disk access and startup overhead
  • Communication pattern is not user-definable or flexible
• Shortcomings of Pregel
  • BSP model requires synchronous computation
  • One slow machine can slow down the entire computation considerably
• Shortcomings of Dryad
  • Very flexible, but the programming model has a steep learning curve

GraphLab
• GraphLab is a framework for parallel machine learning
[Figure: GraphLab components: Data Graph, Shared Data Table, Scheduling, Update Functions and Scopes]

Data Graph
• A graph in GraphLab is associated with data at every vertex and edge
• Arbitrary blocks of data can be assigned to vertices and edges

Update Functions
• The data graph is modified using update functions
• The update function can modify a vertex v and its neighborhood, defined as the scope of v (Sv)
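A data graph and the scope of a vertex can be sketched in a few lines of Python. The class `DataGraph`, its method names and the `average_update` function are hypothetical and are not GraphLab's API; the sketch only shows that an update function on v touches v, its adjacent edges and its neighboring vertices.

```python
class DataGraph:
    """Undirected graph with a data value on every vertex and edge."""

    def __init__(self):
        self.vertex_data = {}
        self.edge_data = {}
        self.neighbors = {}

    def add_vertex(self, v, data):
        self.vertex_data[v] = data
        self.neighbors.setdefault(v, set())

    def add_edge(self, u, v, data):
        self.edge_data[(u, v)] = data
        self.neighbors[u].add(v)
        self.neighbors[v].add(u)

    def scope(self, v):
        """Vertices and edges an update function on v may touch (Sv)."""
        verts = {v} | self.neighbors[v]
        edges = {e for e in self.edge_data if v in e}
        return verts, edges


def average_update(graph, v):
    """Example update function: replace v's data by the mean over its scope."""
    verts, _ = graph.scope(v)
    total = sum(graph.vertex_data[u] for u in verts)
    graph.vertex_data[v] = total / len(verts)


g = DataGraph()
for name, value in [("a", 0.0), ("b", 3.0), ("c", 6.0)]:
    g.add_vertex(name, value)
g.add_edge("a", "b", 1.0)
g.add_edge("b", "c", 1.0)
scope_of_a = g.scope("a")
average_update(g, "a")  # mean of a and b: (0.0 + 3.0) / 2
```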

Shared Data Table
• Certain algorithms require global information that is shared among all vertices (algorithm parameters, statistics, etc.)
• GraphLab exposes a Shared Data Table (SDT)
• The SDT is an associative map between keys and arbitrary blocks of data
  • T[Key] → Value
• The shared data table is updated using the sync mechanism

Sync Mechanism
• Similar to Reduce in MapReduce
• The user can define fold, merge and apply functions that are triggered during the global sync mechanism
• The fold function allows the user to sequentially aggregate information across all vertices
• The merge function optionally allows the user to perform a parallel tree reduction on the aggregated data collected during the fold operation
• The apply function allows the user to finalize the resulting value from the fold/merge operations (for example, by normalizing it)
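The fold, merge and apply steps can be sketched as follows. The `sync` function, its parameters and the chunking scheme are assumptions for illustration rather than GraphLab's interface, and the partial results are merged sequentially here where a real system could merge them in parallel as a tree. The example computes the mean of all vertex values and stores it in a dictionary standing in for the shared data table.

```python
from functools import reduce


def sync(vertex_data, fold, merge, apply, initial, num_chunks=2):
    """Aggregate over all vertices with fold, merge and apply.

    fold:  (accumulator, vertex_value) -> accumulator, run per chunk
    merge: (accumulator, accumulator) -> accumulator, combines chunks
    apply: accumulator -> final value written to the shared data table
    """
    values = list(vertex_data.values())
    # Split the vertices into chunks that could be folded independently.
    chunks = [values[i::num_chunks] for i in range(num_chunks)]
    partials = [reduce(fold, chunk, initial) for chunk in chunks]
    return apply(reduce(merge, partials))


vertex_data = {"a": 1.0, "b": 2.0, "c": 6.0}
shared_data_table = {}
shared_data_table["mean"] = sync(
    vertex_data,
    fold=lambda acc, x: (acc[0] + x, acc[1] + 1),      # running (sum, count)
    merge=lambda p, q: (p[0] + q[0], p[1] + q[1]),     # combine two partials
    apply=lambda acc: acc[0] / acc[1],                 # finalize: sum / count
    initial=(0.0, 0),
)
```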

Scheduling in GraphLab (1)
• The scheduler determines the order in which vertices are updated
• The process repeats until the scheduler is empty
[Figure: CPU 1 and CPU 2 pull vertices (a to k) from the scheduler and update them]

Scheduling in GraphLab (2)
• An update schedule defines the order in which update functions are applied to vertices
• A parallel data structure called the scheduler represents an abstract list of tasks to be executed in GraphLab
• Base (vertex) schedulers in GraphLab
  • Synchronous scheduler
  • Round-robin scheduler
• Job schedulers in GraphLab
  • FIFO scheduler
  • Priority scheduler
• Custom schedulers can be defined by the set scheduler
• Termination assessment
  • If the scheduler has no remaining tasks
  • Or, a termination function can be defined to check for convergence in the data
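A FIFO scheduler with the "no remaining tasks" termination rule can be sketched as a simple loop. This single-threaded version is an assumption for illustration; GraphLab's schedulers are parallel data structures, and `run_fifo` and `countdown_update` are hypothetical names.

```python
from collections import deque


def run_fifo(initial_tasks, update):
    """Apply `update` to vertices in FIFO order until no tasks remain.

    `update(v)` returns the vertices it wants scheduled next.
    Returns the order in which vertices were updated.
    """
    queue = deque(initial_tasks)
    queued = set(queue)  # avoids scheduling a vertex twice at once
    order = []
    while queue:         # termination: the scheduler is empty
        v = queue.popleft()
        queued.discard(v)
        order.append(v)
        for w in update(v):
            if w not in queued:
                queue.append(w)
                queued.add(w)
    return order


values = {"a": 2, "b": 0}


def countdown_update(v):
    """Decrement v's value and ask to run again until it reaches zero."""
    if values[v] > 0:
        values[v] -= 1
        return [v]
    return []


order = run_fifo(["a", "b"], countdown_update)
```

Replacing the `deque` with a priority queue keyed on, say, the size of the last change would turn this into a priority scheduler without changing the loop.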

Need for Consistency Models
• How much can computation overlap?

Consistency Models in GraphLab
• GraphLab guarantees sequential consistency
  • Guaranteed to give the same result as a sequential execution of the computational steps
• User-defined consistency models
  • Full Consistency
  • Vertex Consistency
  • Edge Consistency
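The slide lists the three models without defining them. The sketch below encodes the definitions as I understand them from the GraphLab paper, so treat them as an assumption: under full consistency two updates may overlap only if their scopes are disjoint, under edge consistency only if the vertices are not adjacent, and under vertex consistency whenever the vertices differ. The function name and the adjacency-dictionary format are hypothetical.

```python
def can_run_concurrently(neighbors, u, v, model):
    """Decide whether updates on u and v may overlap under a model.

    neighbors: vertex -> set of adjacent vertices (undirected graph)
    model:     "vertex", "edge" or "full"
    """
    if u == v:
        return False  # never run two updates on the same vertex at once
    if model == "vertex":
        return True
    adjacent = v in neighbors[u]
    if model == "edge":
        return not adjacent
    if model == "full":
        # Scopes overlap if u and v are adjacent or share a neighbor.
        return not adjacent and not (neighbors[u] & neighbors[v])
    raise ValueError(f"unknown consistency model: {model}")


# Path graph a - b - c - d
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

On this path graph, a and c can overlap under edge consistency but not under full consistency, because both scopes contain b. Weaker models allow more parallelism at the cost of weaker guarantees about what a running update can observe.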

PageRank (1)
• PageRank is a link analysis algorithm
• The rank value indicates the importance of a particular web page
• A hyperlink to a page counts as a vote of support
• A page that is linked to by many pages with high PageRank receives a high rank itself
• A PageRank of 0.5 means there is a 50% chance that a person clicking on a random link will be directed to the document with the 0.5 PageRank

PageRank (2)
• Iterate:
• Where:
  • α is the random reset probability
  • L[j] is the number of links on page j
[Figure: example graph with pages 1 to 6]
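The update rule that follows "Iterate:" is not reproduced in the text of this slide. A commonly used form that is consistent with the two symbols defined above is given here as an assumption. R[i] denotes the rank of page i, the sum runs over the pages j that link to page i, and N (a symbol not used on the slide) is the total number of pages.

```latex
R[i] \;=\; \frac{\alpha}{N} \;+\; (1 - \alpha) \sum_{j \,:\, j \to i} \frac{R[j]}{L[j]}
```

With the factor 1/N the ranks can be read as probabilities, which matches the interpretation on the previous slide. Some presentations write the first term as α alone; that variant yields ranks that are N times larger and leaves the ordering of pages unchanged.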

PageRank Example in GraphLab
• The PageRank algorithm is defined as a per-vertex operation working on the scope of the vertex

    pagerank(i, scope) {
      // Get neighborhood data
      (R[i], Wij, R[j]) ← scope;
      // Update the vertex data
      // Reschedule neighbors if needed
      if R[i] changes then
        reschedule_neighbors_of(i);
    }

• Dynamic computation
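The pseudocode above can be turned into a runnable single-threaded sketch. Several things here are assumptions: the function name `dynamic_pagerank`, the tolerance `tol` used to decide whether R[i] "changes", the α/N form of the update, and the choice to reschedule only the pages that page i links to (those are the pages whose rank depends on R[i]). It also assumes every page has at least one outgoing link.

```python
from collections import deque


def dynamic_pagerank(out_links, alpha=0.15, tol=1e-6):
    """PageRank with dynamic scheduling: re-run a page only when needed.

    out_links: page -> list of pages it links to
    """
    pages = list(out_links)
    n = len(pages)
    in_links = {p: [] for p in pages}
    for j, targets in out_links.items():
        for i in targets:
            in_links[i].append(j)

    rank = {p: 1.0 / n for p in pages}
    queue, queued = deque(pages), set(pages)
    while queue:
        i = queue.popleft()
        queued.discard(i)
        # Update the vertex data from the ranks of the pages linking to i.
        new = alpha / n + (1 - alpha) * sum(
            rank[j] / len(out_links[j]) for j in in_links[i])
        changed = abs(new - rank[i]) > tol
        rank[i] = new
        if changed:
            # Reschedule the pages whose rank depends on R[i].
            for k in out_links[i]:
                if k not in queued:
                    queue.append(k)
                    queued.add(k)
    return rank


ring = dynamic_pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
ranks = dynamic_pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In the symmetric `ring` example the initial ranks are already a fixed point, so each page is updated once and nothing is rescheduled. That is the benefit of dynamic computation: work is spent only where values are still changing.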

How Do MapReduce, Pregel, Dryad and GraphLab Compare Against Each Other?

Comparison of the Programming Models

Programming Model
  • MapReduce: Fixed functions: Map and Reduce
  • Pregel: Supersteps over a data graph with messages passed
  • Dryad: DAG with program vertices and data edges
  • GraphLab: Data graph with shared data table and update functions

Parallelism
  • MapReduce: Concurrent execution of tasks within map and reduce phases
  • Pregel: Concurrent execution of user functions over vertices during a superstep
  • Dryad: Concurrent execution of vertices within a stage
  • GraphLab: Concurrent execution of non-overlapping scopes, defined by consistency model

Data Handling
  • MapReduce and Pregel: Distributed file system
  • Dryad: Flexible data channels: memory, files, DFS, etc.
  • GraphLab: Undefined: graphs can be in memory or on disk

Task Scheduling
  • MapReduce: Fixed phases: HDFS locality-based map task assignment
  • Pregel: Partitioned graph and inputs assigned by assignment functions
  • Dryad: Job and Stage Managers assign vertices to available daemons
  • GraphLab: Pluggable schedulers to schedule update functions

Fault Tolerance
  • MapReduce: DFS replication + task reassignment / speculative execution of tasks
  • Pregel: Checkpointing and superstep re-execution
  • Dryad: Vertex and edge failure recovery
  • GraphLab: Synchronous and asynchronous snapshots

Developed by
  • MapReduce and Pregel: Google
  • Dryad: Microsoft
  • GraphLab: Carnegie Mellon

References
• This presentation has elements borrowed from various papers and presentations:
• Papers:
  • Pregel: http://kowshik.github.com/JPregel/pregel_paper.pdf
  • Dryad: http://research.microsoft.com/pubs/63785/eurosys07.pdf
  • GraphLab: http://www.select.cs.cmu.edu/publications/paperdir/uai2010-low-gonzalez-kyrola-bickson-guestrin-hellerstein.pdf
• Presentations:
  • Dryad Presentation at Berkeley by M. Budiu: http://budiu.info/work/dryad-talk-berkeley09.pptx
  • GraphLab 1 Presentation: http://graphlab.org/uai2010_graphlab.pptx
  • GraphLab 2 Presentation: http://graphlab.org/presentations/nips-biglearn-2011.pptx

Next Class
• Distributed File Systems