From CIEL to Firmament & DIOS: a heavenly tale

From CIEL to Firmament & DIOS: a heavenly tale of not just clouds
Joint work with: Steven Hand, Anil Madhavapeddy, Chris Smowton, Steven Smith, Derek Murray (MSR SVC)

Disclaimer

[NSDI 2011] Recap: CIEL

[Figure: task graph with nodes A, B, M, M, M, R, R, R, G]

Dynamic task graphs
  • Allow tasks to spawn more tasks
[Figure: task graph with nodes a, b, T, x, M, R, G]
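To make "tasks spawning more tasks" concrete, here is a minimal single-process sketch. It is not CIEL's API: the names `Spawn`, `run` and `fib` are hypothetical and chosen only for illustration. A task either returns a plain value or returns a `Spawn` describing child tasks plus a function that combines their results, so the shape of the graph depends on the data.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Spawn:
    """Returned by a task that wants to delegate to child tasks (hypothetical type).

    children: (function, args) pairs to run as new tasks.
    combine:  function applied to the children's results, in order.
    """
    children: List[Tuple[Callable[..., Any], tuple]]
    combine: Callable[..., Any]


def run(fn: Callable[..., Any], *args: Any) -> Any:
    """Run a task; while it keeps spawning, run the children and combine them."""
    result = fn(*args)
    while isinstance(result, Spawn):
        child_results = [run(child_fn, *child_args)
                         for child_fn, child_args in result.children]
        result = result.combine(*child_results)
    return result


def fib(n: int):
    """Data-dependent task: whether it spawns children depends on its input."""
    if n < 2:
        return n
    return Spawn(children=[(fib, (n - 1,)), (fib, (n - 2,))],
                 combine=lambda a, b: a + b)


if __name__ == "__main__":
    print(run(fib, 10))
```

A real engine would schedule the children on different workers and track their outputs as named objects; this sketch runs them sequentially in one process to keep the control flow visible.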

Experiment from D. Murray, A distributed execution engine supporting data-dependent control flow. Ph.D. thesis, University of Cambridge, 2011.

[interlude] polyglot CIEL

[unpublished] polyglot CIEL

Saving state – options
  • VM migration
  • BLCR (process checkpointing)
  • Serializable continuations
  • Haskell monads
[Diagram: options arranged from heavyweight (hardware / OS level) to lightweight (application level); CIEL marked on the diagram]
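As an illustration of the lightweight, application-level end of this spectrum, the sketch below saves a task's state as an explicit, serializable continuation. It is an assumption-laden toy, not how CIEL or any of the listed runtimes does it: `SumState`, `step`, `checkpoint` and `resume` are hypothetical names, and JSON stands in for whatever serialization format a real system would use.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SumState:
    """Explicit continuation: everything needed to resume the loop."""
    next_i: int   # next loop index to process
    limit: int    # exclusive upper bound of the loop
    total: int    # running sum so far


def step(state: SumState, budget: int) -> SumState:
    """Run at most `budget` iterations, then hand back the updated state."""
    i, total = state.next_i, state.total
    end = min(state.limit, i + budget)
    while i < end:
        total += i
        i += 1
    return SumState(next_i=i, limit=state.limit, total=total)


def checkpoint(state: SumState) -> bytes:
    """Serialize the continuation so it can be stored or shipped elsewhere."""
    return json.dumps(asdict(state)).encode("utf-8")


def resume(blob: bytes) -> SumState:
    """Rebuild the continuation from its serialized form."""
    return SumState(**json.loads(blob.decode("utf-8")))


if __name__ == "__main__":
    state = step(SumState(next_i=0, limit=10, total=0), budget=4)
    blob = checkpoint(state)      # could be written to disk or sent to another worker
    state = resume(blob)
    while state.next_i < state.limit:
        state = step(state, budget=4)
    print(state.total)
```

The cost of this approach is that the programmer (or the language runtime) must make the state explicit, whereas VM migration and process checkpointing capture it without the application's cooperation.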

Java, Scala, Haskell, Stackless Python, OCaml, (C with BLCR)
no need for Skywriting!

Binomial options pricing
Experiment from D. Murray, C. Smowton, M. Schwarzkopf, S. Smith, A. Madhavapeddy. A polyglot approach to cloud programming. Unpublished, 2011.

What's next?! many-core clusters heterogeneity

timespin on unmodified CIEL
[Figure: relative overhead (less is better); axis labels: seconds, number of cores; data labels: 41.6x, 5.1x, 1.3x, 1.04x]

Enter Firmament and DIOS [Data-Intensive Operating System]

[Diagram: CIEL stack, top to bottom]
User code
Programming Model: 1st class exec., 2nd Skywriting
Execution Engine
Host OS ...
Hardware ...
Master, W0, W1, ..., Wn

[Diagram: stack with Firmament and DIOS, top to bottom]
User code
CIEL Programming Model: 1st class exec., 2nd Skywriting
Firmament: Coordination Engine
DIOS
Hardware ...

Firmament: multi-scale, heterogeneity-aware

How much heterogeneity is there?

Google trace, machine platforms

Google trace, machine specs
[Figure: CPU cores (normalized) and total RAM (normalized)]

Google trace, platforms + specs

Google trace, machine attributes

Firmament
Cluster knowledge base
  • historic task resource usage
  • historic task performance info
  • machine information
Efficient runtime [Storage? Networking? Transfer management?]
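One way to picture a cluster knowledge base holding the three kinds of information listed above is the in-memory sketch below. It is purely illustrative and not Firmament's design: `Machine`, `Sample`, `KnowledgeBase` and the "pick the machine with the lowest mean runtime" query are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Machine:
    """Static machine information."""
    name: str
    cores: int
    ram_gb: float


@dataclass(frozen=True)
class Sample:
    """One historic observation of a task run."""
    runtime_s: float      # performance info
    peak_ram_gb: float    # resource usage


class KnowledgeBase:
    def __init__(self) -> None:
        self.machines: Dict[str, Machine] = {}
        # (task kind, machine name) -> observations of past runs
        self.history: Dict[Tuple[str, str], List[Sample]] = defaultdict(list)

    def add_machine(self, machine: Machine) -> None:
        self.machines[machine.name] = machine

    def record(self, task_kind: str, machine_name: str, sample: Sample) -> None:
        if machine_name not in self.machines:
            raise KeyError(f"unknown machine: {machine_name}")
        self.history[(task_kind, machine_name)].append(sample)

    def mean_runtime(self, task_kind: str, machine_name: str) -> Optional[float]:
        """Mean observed runtime, or None if this pairing was never observed."""
        samples = self.history.get((task_kind, machine_name))
        if not samples:
            return None
        return mean(s.runtime_s for s in samples)

    def fastest_machine(self, task_kind: str) -> Optional[str]:
        """Machine with the lowest mean runtime for this task kind, if any."""
        known = {}
        for name in self.machines:
            runtime = self.mean_runtime(task_kind, name)
            if runtime is not None:
                known[name] = runtime
        return min(known, key=known.get) if known else None


if __name__ == "__main__":
    kb = KnowledgeBase()
    kb.add_machine(Machine("a", cores=4, ram_gb=8.0))
    kb.add_machine(Machine("b", cores=48, ram_gb=64.0))
    kb.record("sort", "a", Sample(runtime_s=10.0, peak_ram_gb=1.0))
    kb.record("sort", "b", Sample(runtime_s=5.0, peak_ram_gb=1.2))
    print(kb.fastest_machine("sort"))
```

A heterogeneity-aware scheduler could consult such a store before placing a task; how the history is aggregated and how unseen task kinds are handled are exactly the design questions a real knowledge base has to answer.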

Firmament
It's real!
  • ~2k LOC, basic tests run
To do (aka WIP):
  • knowledge base design & impl.
  • scheduling algorithms
  • interface to CIEL

[Diagram: stack with Firmament and DIOS, top to bottom]
User code
CIEL Programming Model: 1st class exec., 2nd Skywriting
Firmament: Coordination Engine
DIOS
Hardware ...

DIOS: topology-aware, interference-aware, lightweight OS

Heterogeneity [again!] Many-core => intra-machine communication = important!

[Figure panels: Intel Core i7-2600K @ 3.40 GHz (native); 48-core AMD Opteron 6168 (native); (Xen)]
Joint work with Steven Smith, Anil Madhavapeddy, and Chris Smowton; cf. “The case for reconfigurable I/O” (RES

[Figure: Intel Xeon E5620 @ 2.40 GHz (native); series: different physical core, hyperthread]
Joint work with Steven Smith, Anil Madhavapeddy, and Chris Smowton; cf. http://fable.io

[Figure: Intel Core i7-2600K @ 3.40 GHz (native); series: different physical core, hyperthread]
Joint work with Steven Smith, Anil Madhavapeddy, and Chris Smowton; cf. http://fable.io

[Figure: AMD Opteron 6168 @ 1.9 GHz (native); series labels: same MCM, same socket, different MCM, different socket, 2-hop HyperTransport]
Joint work with Steven Smith, Anil Madhavapeddy, and Chris Smowton; cf. http://fable.io

Topology-awareness
OS responsibility? Yes.
General case = hard!
Workload-awareness helps!
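As a small illustration of what topology information looks like to software, the sketch below groups logical CPUs into physical cores from Linux's sysfs `thread_siblings_list` files, which is enough to tell "hyperthread on the same core" from "different physical core". The helper names (`parse_cpu_list`, `physical_cores`, `share_core`, `read_sysfs_siblings`) are hypothetical, the sysfs reader only returns data on Linux, and nothing here is taken from DIOS itself.

```python
import glob
import re
from typing import Dict, FrozenSet, List, Set


def parse_cpu_list(text: str) -> List[int]:
    """Parse a sysfs CPU list such as '0,4' or '0-3,8' into CPU numbers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def physical_cores(siblings: Dict[int, str]) -> Set[FrozenSet[int]]:
    """Group logical CPUs that are hardware threads of the same core."""
    return {frozenset(parse_cpu_list(s)) for s in siblings.values()}


def share_core(a: int, b: int, siblings: Dict[int, str]) -> bool:
    """True if logical CPUs a and b are hyperthreads on one physical core."""
    return b in parse_cpu_list(siblings[a])


def read_sysfs_siblings() -> Dict[int, str]:
    """Read per-CPU sibling lists from sysfs (Linux only; empty elsewhere)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    result: Dict[int, str] = {}
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)/topology", path).group(1))
        with open(path) as f:
            result[cpu] = f.read().strip()
    return result


if __name__ == "__main__":
    # Fall back to a made-up 2-core, 4-thread layout when sysfs is unavailable.
    siblings = read_sysfs_siblings() or {0: "0,2", 1: "1,3", 2: "0,2", 3: "1,3"}
    for core in sorted(physical_cores(siblings), key=min):
        print(sorted(core))
```

A placement policy could use `share_core` to co-locate two communicating tasks on sibling hyperthreads or to keep interfering tasks on separate physical cores; which choice is better depends on the workload.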

hwloc

Interference #include <results>

Lightweight
Make the OS do exactly (and just) what is needed.
Dedicate resources instead of sharing them.

  • Resource multiplexing
  • I/O mgmt
  • Isolation
  • Pre-emption
  • Multi-threading
  • Concurrency primitives
  • Locking
  • Filesystem
  • Shell
  • Standard libs
  • Process mgmt

Scheduling
[Figure: task graph with nodes a, b, T, x, G]

Scheduling
[Diagram: Firmament: Coordination Engine; DIOS; Program ...]

DIOS
Pieces exist
  • currently combining ;-)
WIP:
  • interference experiments
  • related work reading group
  • starting point? (Linux or Xen?)

BACKUP SLIDES

Binomial options pricing
[Figure: higher is better; series: 800k (EC2), 800k (MC), 400k (EC2), 400k (MC), 200k (EC2), 200k (MC)]

Redis example
Numbers and experiment by Sören Bleikertz: http://openfoo.org/blog/redis-native-xen.html