Naiad: A Timely Dataflow System
Derek G. Murray, Michael Isard, Frank McSherry, Paul Barham, Rebecca Isaacs, Martín Abadi
Microsoft Research

Timely dataflow combines:
  • Batch processing
  • Stream processing
  • Graph processing

Latency targets (figure: example dataflow with input In, #x, @y, query z?, and ⋈ and max operators)
  • < 1 s batch updates
  • < 1 ms iterations
  • < 100 ms interactive queries

Outline
  • Revisiting dataflow
  • How to achieve low latency
  • Evaluation

Dataflow (figure: a graph of stages joined by connectors)

Dataflow: parallelism (figure: stages B and C)
  • Each stage runs as a set of parallel vertices
  • Each connector becomes a set of edges between those vertices
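The expansion from stages to vertices can be sketched in a few lines. This is only an illustration, not Naiad code: the function names `expand_stage` and `route`, and the choice of partitioning records by an integer key modulo the number of vertices, are assumptions made for the example.

```python
from collections import defaultdict


def expand_stage(stage_name, parallelism):
    """Return vertex names for one logical stage, e.g. C -> C[0], C[1]."""
    return [f"{stage_name}[{i}]" for i in range(parallelism)]


def route(records, key, target_vertices):
    """Assign each record to one target vertex by partitioning on key(record).

    Assumption: key(record) is an integer; the edge that carries a record is
    chosen by taking that key modulo the number of target vertices.
    """
    inbox = defaultdict(list)
    for record in records:
        index = key(record) % len(target_vertices)
        inbox[target_vertices[index]].append(record)
    return dict(inbox)


# Stage C runs as two vertices; records sent to C are split between them.
c_vertices = expand_stage("C", 2)
inboxes = route([1, 2, 3, 4, 5], key=lambda r: r, target_vertices=c_vertices)
```

Here odd records land on `C[1]` and even records on `C[0]`, so records with the same key always reach the same vertex.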

Dataflow: iteration

Batching (synchronous) vs. streaming (asynchronous)
  • Batching: requires coordination, but supports aggregation
  • Streaming: no coordination needed, but aggregation is difficult

Batch iteration

Streaming iteration

Timely dataflow
  • Messages carry a timestamp
  • Supports asynchronous and fine-grained synchronous execution

How to achieve low latency
  • Programming model
  • Distributed progress tracking protocol
  • System performance engineering

Programming model (figure: vertices B, C, D)
  • 2× C.OPERATION(x, y, z): operations that vertex C invokes on the system
  • 2× C.ONCALLBACK(u, v): callbacks that the system invokes on vertex C

Messages (figure: vertices B, C, D)
  • B.SENDBY(edge, message, time)
  • C.ONRECV(edge, message, time)
  • Messages are delivered asynchronously

Notifications (figure: vertices B, C, D)
  • C.SENDBY(_, _, time)
  • D.NOTIFYAT(time)
  • D.ONRECV(_, _, time)
  • D.ONNOTIFY(time): no more messages at time or earlier
  • Notifications support batching
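The interplay of the two operations and two callbacks can be sketched with a toy, single-threaded runtime. Everything here is hypothetical scaffolding rather than the Naiad API: `ToyRuntime`, `CountPerTime`, and `Collect` are invented names, the edge argument is dropped for brevity, and the runtime assumes all input has been sent before `run()` is called, so an empty queue means no more messages at any pending time.

```python
from collections import defaultdict, deque


class ToyRuntime:
    """Hypothetical single-threaded stand-in for a timely dataflow runtime."""

    def __init__(self):
        self.queue = deque()   # pending (target vertex, message, time)
        self.requests = []     # pending (time, vertex) notification requests

    def send_by(self, target, message, time):      # plays the role of SENDBY
        self.queue.append((target, message, time))

    def notify_at(self, vertex, time):              # plays the role of NOTIFYAT
        self.requests.append((time, vertex))

    def run(self):
        while self.queue or self.requests:
            if self.queue:
                # Deliver messages first, in whatever order they were queued.
                target, message, time = self.queue.popleft()
                target.on_recv(message, time)
            else:
                # No messages remain, so the earliest requested time is complete.
                request = min(self.requests, key=lambda r: r[0])
                self.requests.remove(request)
                time, vertex = request
                vertex.on_notify(time)


class CountPerTime:
    """Buffers a count per timestamp and emits it when notified (batching)."""

    def __init__(self, runtime, downstream):
        self.runtime = runtime
        self.downstream = downstream
        self.counts = defaultdict(int)

    def on_recv(self, message, time):
        if time not in self.counts:
            self.runtime.notify_at(self, time)   # ask to be told when time is done
        self.counts[time] += 1

    def on_notify(self, time):
        # No more messages at this time or earlier: the count is final.
        self.runtime.send_by(self.downstream, self.counts.pop(time), time)


class Collect:
    """Sink vertex that records every (time, message) it receives."""

    def __init__(self):
        self.received = []

    def on_recv(self, message, time):
        self.received.append((time, message))

    def on_notify(self, time):
        pass


runtime = ToyRuntime()
sink = Collect()
counter = CountPerTime(runtime, sink)
for word, time in [("a", 1), ("b", 2), ("c", 1), ("d", 1)]:
    runtime.send_by(counter, word, time)
runtime.run()
```

After `run()`, `sink.received` holds one final count per timestamp: three messages at time 1 and one at time 2. The counting vertex never emits a partial count, which is the batching behaviour that notifications make possible.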

Programming frameworks

    input.SelectMany(x => x.Split())
         .Where(x => x.StartsWith("#"))
         .Count(x => x);

Layers, top to bottom:
  • Frameworks: LINQ, GraphLINQ, AllReduce, Differential dataflow, BLOOM, BSP (Pregel)
  • Timely dataflow API
  • Distributed runtime
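For readers who do not use LINQ, the query on this slide has a rough plain-Python analogue. This is a single-machine illustration only, not a Naiad program; it assumes that `Count(x => x)` groups by the word itself and counts each group, and the function name `count_hashtags` is invented for the example.

```python
from collections import Counter


def count_hashtags(tweets):
    """Count how often each hashtag occurs across a collection of tweets."""
    # SelectMany(x => x.Split()): flatten every tweet into its words.
    words = (word for tweet in tweets for word in tweet.split())
    # Where(x => x.StartsWith("#")): keep only the hashtags.
    hashtags = (word for word in words if word.startswith("#"))
    # Count(x => x): count occurrences, keyed by the hashtag itself.
    return Counter(hashtags)


counts = count_hashtags(["#naiad is #timely", "more #naiad"])
```

The result maps each hashtag to its number of occurrences, so `counts["#naiad"]` is 2 for the two sample tweets.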

How to achieve low latency
  • Programming model: asynchronous and fine-grained synchronous execution
  • Distributed progress tracking protocol
  • System performance engineering

Progress tracking (figure: vertices A, B, C, D, E)
  • E.NOTIFYAT(t): E is notified once epoch t is complete
  • C.ONRECV(_, _, t) may call C.SENDBY(_, _, tʹ) only with tʹ ≥ t

Progress tracking (figure: vertices A, B, C, D, E)
  • C.NOTIFYAT(t)
  • Problem: C depends on its own output

Solution: structured timestamps in loops (figure: vertices A, B, C, D, E, F)
  • A.SENDBY(_, _, 1)
  • B.SENDBY(_, _, (1, 7))
  • C.NOTIFYAT((1, 6))
  • D.SENDBY(1, 6)
  • E.NOTIFYAT(1)
  • Figure labels: "Advances timestamp", "Advances loop counter"
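A structured timestamp such as (1, 6) pairs an input epoch with a loop counter. The sketch below models that idea under stated assumptions: the class and method names are invented for illustration, and the ordering used (epochs compared numerically, loop counters compared lexicographically) follows the definition in the Naiad paper rather than anything shown on the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Timestamp:
    """An input epoch plus one loop counter per enclosing loop."""

    epoch: int
    counters: Tuple[int, ...] = ()

    def enter_loop(self):
        """Entering a loop appends a fresh counter starting at 0."""
        return Timestamp(self.epoch, self.counters + (0,))

    def next_iteration(self):
        """Going around the loop increments the innermost counter."""
        inner = self.counters[-1] + 1
        return Timestamp(self.epoch, self.counters[:-1] + (inner,))

    def leave_loop(self):
        """Leaving a loop drops the innermost counter."""
        return Timestamp(self.epoch, self.counters[:-1])

    def less_equal(self, other):
        """Order on timestamps: epoch first, counters lexicographically."""
        return self.epoch <= other.epoch and self.counters <= other.counters


# Epoch 1 enters the loop and goes around six times: timestamp (1, 6).
in_loop = Timestamp(1).enter_loop()
for _ in range(6):
    in_loop = in_loop.next_iteration()
after_feedback = in_loop.next_iteration()   # timestamp (1, 7)
outside = in_loop.leave_loop()              # back to plain epoch 1
```

Because every trip around the loop strictly increases the counter, a message that a vertex sends back to itself always carries a later timestamp, which is what lets a notification for (1, 6) be delivered even though the vertex sits on a cycle.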

Graph structure leads to an order on events (figure: vertices A to F annotated with pending timestamps 1, (1, 5), (1, 6), and ⊤)
  1. Maintain the set of outstanding events
  2. Sort events by could-result-in (partial) order
  3. Deliver notifications in the frontier of the set
ONNOTIFY(t) is called after all calls to ONRECV(_, _, t)
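The three steps above can be sketched as a small occurrence-counting tracker. This is a deliberately simplified model, not the Naiad protocol: it orders events by timestamp alone and ignores where in the graph each event sits, whereas the real could-result-in relation also depends on graph structure. The names `ProgressTracker`, `event_created`, and `event_retired` are invented for the example.

```python
from collections import Counter


def less_equal(a, b):
    """Simplified could-result-in: epoch first, loop counters lexicographic."""
    return a[0] <= b[0] and a[1:] <= b[1:]


class ProgressTracker:
    def __init__(self):
        # Step 1: the set of outstanding events, with an occurrence count each.
        self.outstanding = Counter()

    def event_created(self, time):
        """Call when a message is sent or a notification is requested."""
        self.outstanding[time] += 1

    def event_retired(self, time):
        """Call when the matching callback has finished."""
        self.outstanding[time] -= 1
        if self.outstanding[time] == 0:
            del self.outstanding[time]

    def frontier(self):
        """Steps 2 and 3: timestamps that no other outstanding event precedes."""
        times = list(self.outstanding)
        return sorted(
            t for t in times
            if not any(o != t and less_equal(o, t) for o in times)
        )


tracker = ProgressTracker()
for time in [(1, 5), (1, 6), (1, 6)]:
    tracker.event_created(time)
before = tracker.frontier()    # only (1, 5) is in the frontier
tracker.event_retired((1, 5))
after = tracker.frontier()     # now (1, 6) is in the frontier
```

A notification for a timestamp is safe to deliver only while that timestamp is in the frontier. In the example, (1, 6) becomes deliverable only after the outstanding event at (1, 5) has been retired.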

Example event sequence:
  • E.NOTIFYAT(1)
  • C.SENDBY(_, _, (1, 5))
  • D.ONRECV(_, _, (1, 5))
  • E.ONNOTIFY(1)
Optimizations make doing this practical

How to achieve low latency
  • Programming model: asynchronous and fine-grained synchronous execution
  • Distributed progress tracking protocol: enables processes to deliver notifications promptly
  • System performance engineering

Performance engineering
Microstragglers are the primary challenge:
  • Garbage collection: O(1–10 s)
  • TCP timeouts: O(10–100 ms)
  • Data structure contention: O(1 ms)
For detail on how we handled these, see paper (Sec. 3)

How to achieve low latency
  • Programming model: asynchronous and fine-grained synchronous execution
  • Distributed progress tracking protocol: enables processes to deliver notifications promptly
  • System performance engineering: mitigates the effect of microstragglers

Outline
  • Revisiting dataflow
  • How to achieve low latency
  • Evaluation

System design (figure labels: Data, Control, Progress tracker, S)
Cluster: 64 × 8-core 2.1 GHz AMD Opteron servers, 16 GB RAM per server, Gigabit Ethernet
Limitation: Fault tolerance via checkpointing/logging (see paper)

Iteration latency (figure: iteration latency in ms versus number of computers, 0 to 70)
  • Median: 750 μs
  • 95th percentile: 2.2 ms
Cluster: 64 × 8-core 2.1 GHz AMD Opteron servers, 16 GB RAM per server, Gigabit Ethernet

Layers, top to bottom:
  • Applications: PageRank, Word count, Iterative machine learning, Interactive graph analysis
  • Frameworks: LINQ, GraphLINQ, AllReduce, Differential dataflow, BLOOM, BSP (Pregel)
  • Timely dataflow API
  • Distributed runtime

PageRank on the Twitter graph (42 million nodes, 1.5 billion edges)
Figure: iteration length (s) versus number of computers (0 to 70); series: Pregel (Naiad), GraphLINQ, GAS (PowerGraph), GAS (Naiad)
Cluster: 64 × 8-core 2.1 GHz AMD Opteron servers, 16 GB RAM per server, Gigabit Ethernet

Interactive graph analysis (figure: dataflow with input In, #x, @y, query z?, and ⋈ and max operators)
  • 32 K tweets/s
  • 10 queries/s

Query latency (figure: query latency in ms versus experiment time, 30 to 50 s)
  • Max: 140 ms
  • 99th percentile: 70 ms
  • Median: 5.2 ms
Cluster: 32 × 8-core 2.1 GHz AMD Opteron servers, 16 GB RAM per server, Gigabit Ethernet

Conclusions
Low-latency distributed computation enables Naiad to:
  • achieve the performance of specialized frameworks
  • provide the flexibility of a generic framework
The timely dataflow API enables parallel innovation
Now available for download: http://github.com/MicrosoftResearchSVC/naiad/

For more information
Visit the project website and blog:
  • http://research.microsoft.com/naiad/
  • http://bigdataatsvc.wordpress.com/
Now available for download: http://github.com/MicrosoftResearchSVC/naiad/

Naiad
Now available for download: http://github.com/MicrosoftResearchSVC/naiad