EECS 582 Final Review
Mosharaf Chowdhury
EECS 582 – F16

Stats on the 11 Papers We’ve Reviewed
[Bar chart: score per paper, y-axis from 0 to 3.5. Papers: MapReduce, Spark, Borg, Mesos, Omega, DRF, GFS, PACMan, EC-Cache, Fat-Tree, Varys]

Programming Models
• MapReduce
  • Offers scalability and fault tolerance to users with little programming experience
  • Doesn’t work well for iterative algorithms
• Spark
  • RDDs suit iterative workloads well
  • Lineage for fault tolerance allows avoiding checkpointing
  • Ease of use
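
The MapReduce model above can be sketched as a single-process word count. This is only an illustration of the map, shuffle, and reduce steps: all function names are hypothetical, and nothing here is distributed or fault tolerant.

```python
from collections import defaultdict


def map_phase(documents, map_fn):
    """Apply the user's map function to every input record."""
    pairs = []
    for doc in documents:
        pairs.extend(map_fn(doc))
    return pairs


def shuffle(pairs):
    """Group intermediate values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    """Apply the user's reduce function once per key."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# The two functions below are the only code a user of the model writes.
def word_count_map(doc):
    return [(word, 1) for word in doc.split()]


def word_count_reduce(word, counts):
    return sum(counts)


def run_job(documents):
    pairs = map_phase(documents, word_count_map)
    return reduce_phase(shuffle(pairs), word_count_reduce)
```

An iterative algorithm would have to call something like `run_job` once per iteration, re-reading its input each time, which is the weakness the Spark bullets address.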

Operating Systems
• Borg
  • Classifies jobs into long and short with different priorities, preempting if required
  • Hides details of allocation and failures from programmers
  • Centralized schedulers can be scalable
• Mesos
  • Two-level scheduling with resource offers
  • Frameworks can choose to accept or reject offers
  • Failure handling is left to the apps
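
A toy sketch of two-level scheduling with resource offers follows. The `Offer` and `Framework` classes and the `offer_round` function are hypothetical names, not the Mesos API, and offering to frameworks in a fixed order is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    node: str
    cpus: int
    mem_gb: int


class Framework:
    """Second level: the framework decides what to run on an offer."""

    def __init__(self, name, task_cpus, task_mem_gb, pending):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_gb = task_mem_gb
        self.pending = pending

    def consider(self, offer):
        """Return how many tasks to launch on this offer; 0 rejects it."""
        fit = min(offer.cpus // self.task_cpus, offer.mem_gb // self.task_mem_gb)
        launched = min(fit, self.pending)
        self.pending -= launched
        return launched


def offer_round(offers, frameworks):
    """First level: the master offers each node's free resources in turn."""
    placements = []
    for offer in offers:
        for fw in frameworks:
            launched = fw.consider(offer)
            if launched > 0:
                placements.append((fw.name, offer.node, launched))
                # Whatever the framework did not use is offered to the next one.
                offer = Offer(
                    offer.node,
                    offer.cpus - launched * fw.task_cpus,
                    offer.mem_gb - launched * fw.task_mem_gb,
                )
    return placements
```

The master never inspects task requirements here; it only hands out resources, and each framework's `consider` method accepts or rejects them.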

Resource Allocation
• Omega
  • Schedules a mix of batch and interactive jobs with good placement
  • Optimistic concurrency control targeted toward larger clusters
  • Shared-state scheduler
• DRF
  • Generalization of max-min allocation to multiple resources and heterogeneous clusters
  • Many properties to maximize utilization and fairness without cheating
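
DRF's generalization of max-min fairness can be sketched as progressive filling: repeatedly give one task to the user with the smallest dominant share. The function names and the example cluster are illustrative assumptions, and dropping a user whose next task no longer fits (instead of stopping outright) is a simplification chosen for this sketch.

```python
def dominant_share(allocated, capacity):
    """A user's dominant share is their largest share of any single resource."""
    return max(allocated[r] / capacity[r] for r in capacity)


def drf_allocate(capacity, demands):
    """Return the number of tasks each user receives.

    capacity: resource name -> total amount in the cluster
    demands:  user -> (resource name -> amount needed per task)
    """
    for user, demand in demands.items():
        if not any(demand.get(r, 0) > 0 for r in capacity):
            raise ValueError(f"user {user!r} must demand some resource")

    used = {r: 0 for r in capacity}
    allocated = {u: {r: 0 for r in capacity} for u in demands}
    tasks = {u: 0 for u in demands}
    active = sorted(demands)  # sorted so ties break deterministically

    while active:
        user = min(active, key=lambda u: dominant_share(allocated[u], capacity))
        demand = demands[user]
        if all(used[r] + demand.get(r, 0) <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += demand.get(r, 0)
                allocated[user][r] += demand.get(r, 0)
            tasks[user] += 1
        else:
            # Simplification: this user's next task does not fit, so drop them.
            active.remove(user)
    return tasks


if __name__ == "__main__":
    cluster = {"cpu": 9, "mem_gb": 18}
    per_task = {"A": {"cpu": 1, "mem_gb": 4}, "B": {"cpu": 3, "mem_gb": 1}}
    print(drf_allocate(cluster, per_task))  # {'A': 3, 'B': 2}
```

In this example user A is memory-bound and user B is CPU-bound; both end with a dominant share of 2/3 (A holds 12 of 18 GB, B holds 6 of 9 CPUs).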

File System
• GFS
  • Workload-guided design: appends and large reads with a small number of huge files
  • Centralized design with replication for fault tolerance
• FDS
  • Data and compute are NOT collocated
  • Exploits full bisection bandwidth networks
  • Stores everything as blobs to maximize sequential I/O

Memory Management
• PACMan
  • Coordinated caching for DFSes
  • All-or-nothing property dictates two eviction policies
  • Prefers small jobs
• EC-Cache
  • Alternative to replication that uses erasure coding instead
  • Improves performance and tail latency by exploiting parallel I/O, and improves load balancing by splitting individual objects
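
The idea of splitting an object and coding it instead of replicating it can be sketched with a single XOR parity split. This is a deliberately weak stand-in for the stronger erasure codes a real cache would use, and every name below is hypothetical: the object is cut into `k` splits plus one parity split, and any `k` of the `k + 1` units are enough to rebuild it.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data, k):
    """Split data into k equal-size splits (zero-padded) plus one XOR parity split."""
    split_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(split_len * k, b"\0")
    splits = [padded[i * split_len:(i + 1) * split_len] for i in range(k)]
    parity = bytes(split_len)
    for s in splits:
        parity = xor_bytes(parity, s)
    return splits + [parity]  # index k holds the parity split


def reconstruct(units, k, original_len):
    """Rebuild the object from any k of the k + 1 units.

    units: unit index -> bytes, where index k is the parity split.
    """
    missing = [i for i in range(k) if i not in units]
    if len(missing) > 1 or (missing and k not in units):
        raise ValueError("need at least k of the k + 1 units")
    units = dict(units)
    if missing:
        # XOR of the surviving data splits and the parity yields the lost split.
        rebuilt = bytes(len(units[k]))
        for unit in units.values():
            rebuilt = xor_bytes(rebuilt, unit)
        units[missing[0]] = rebuilt
    return b"".join(units[i] for i in range(k))[:original_len]
```

Because each unit is only a fraction of the object, a reader can fetch the units from different servers in parallel, and no single server has to serve the whole object.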

Networking
• Fat Tree
  • DC network topology to provide full bisection bandwidth by arranging commodity switches into multiple stages
  • Approximates Clos topology
  • Global scheduling to minimize congestion (Hedera)
• Varys
  • Coflow abstraction to exploit application-level algorithm
  • Heuristics to improve order and allocate rates using all-or-nothing
  • Introduced the concurrent open shop scheduling problem with coupled resources
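
To make the multi-stage arrangement concrete, the sketch below computes the size of a three-tier fat tree built entirely from identical k-port switches. The function name and dictionary keys are illustrative; the counts follow the standard k-ary fat-tree construction (k pods, each with k/2 edge and k/2 aggregation switches).

```python
def fat_tree_size(k):
    """Return element counts for a k-ary fat tree built from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number of ports, at least 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 per pod
        "aggregation_switches": k * half,  # k/2 per pod
        "core_switches": half * half,      # (k/2)^2
        "hosts": k * half * half,          # k^3 / 4
    }


if __name__ == "__main__":
    print(fat_tree_size(4))
    print(fat_tree_size(48)["hosts"])  # 27648
```

With 48-port commodity switches, the same construction supports 27,648 hosts.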

Final Poster and Paper
• Posters are a good way to interact with others and get feedback
  • Mileage may vary, but it’s important to be able to talk about what you do
• Research paper
  • The key part
  • Should be written similar to the papers you’ve read
  • As if you’d submit it to a workshop with ~3 more months of work or to a conference after ~6 more months of work
  • How to Write a Great Research Paper by Simon Peyton Jones
9/7/16

Rough Outline [8 Pages w/o References]
• Abstract
• Introduction (Highlight the importance and give intuition of solution)
• Motivation (Use data and simple examples)
• Overview (Summarize your overall solution so that readers can follow later)
• Core Idea (Main contribution w/ challenges and how you address them)
• Implementation (Discuss non-obvious parts of your implementation)
• Evaluation (Convince readers that it works and when it fails)
• Related Work (Let readers know that you know your competition!)
• Discussion (Know your limitations and possible workarounds)
• Conclusion (Summarize and point out future work)