Threads vs Events 5204 Operating Systems Threads vs

  • Slides: 32
Download presentation
Threads vs. Events 5204 – Operating Systems

Threads vs. Events 5204 – Operating Systems

Threads vs. Events Forms of task management serial preemptive cooperative (interrupt) (yield) CS 5204

Threads vs. Events Forms of task management serial preemptive cooperative (interrupt) (yield) CS 5204 – Operating Systems 2

Threads vs. Events Programming Models event handler thread blocking operations events run-time/kernel Process address

Threads vs. Events Programming Models event handler thread blocking operations events run-time/kernel Process address space thread model event model CS 5204 – Operating Systems 3

Threads vs. Events Stack (really state) Management event manager A state B C state

Threads vs. Events Stack (really state) Management event manager A state B C state C B stack frame state A Run-time stack automatic manual CS 5204 – Operating Systems 4

Threads vs. Events Threads are Bad n Difficult to program Synchronizing access to shared

Threads vs. Events Threads are Bad n Difficult to program Synchronizing access to shared state ¨ Deadlock ¨ Hard to debug (race conditions, repeatability) ¨ n Break abstractions ¨ n Modules must be designed “thread safe” Difficult to achieve good performance simple locking lowers concurrency ¨ context switching costs ¨ n OS support inconsistent ¨ n semantics and tools vary across platforms/systems May not be right model ¨ Window events do not map to threads but to events CS 5204 – Operating Systems 5

Threads vs. Events are Bad- Threads are Good n Thread advantages ¨ ¨ ¨

Threads vs. Events are Bad- Threads are Good n Thread advantages ¨ ¨ ¨ n Avoids “stack ripping” to maintain application context Exception handling simpler due to history recorded in stack Exploits available hardware concurrency Events and Threads are duals Performance of well designed thread system equivalent to well designed event system (for high concurrency servers) ¨ Each can cater to the common control flow patterns (a call/return pattern is needed for the acknowledgement required to build robust systems) ¨ Each can accommodate cooperative multitasking ¨ Stack maintenance problems avoided in event systems and can be mitigated in thread systems ¨ CS 5204 – Operating Systems 6

Threads vs. Events Stack Ripping CS 5204 – Operating Systems 7

Threads vs. Events Stack Ripping CS 5204 – Operating Systems 7

Threads vs. Events Ripped Code CS 5204 – Operating Systems 8

Threads vs. Events Ripped Code CS 5204 – Operating Systems 8

Threads vs. Events Ousterhout’s conclusions CS 5204 – Operating Systems 9

Threads vs. Events Ousterhout’s conclusions CS 5204 – Operating Systems 9

Threads vs. Events Two approaches n Capriccio Each service request bound to an independent

Threads vs. Events Two approaches n Capriccio Each service request bound to an independent thread ¨ Each thread executes all stages of the computation ¨ n Seda Each thread bound to one stage of the computation ¨ Each service request proceeds through successive stages ¨ CS 5204 – Operating Systems 10

Threads vs. Events Cappricio n Philosophy Thread model is useful ¨ Improve implementation to

Threads vs. Events Cappricio n Philosophy Thread model is useful ¨ Improve implementation to remove barriers to scalability ¨ n Techniques User-level threads ¨ Linked stack management ¨ Resource aware scheduling ¨ n Tools Compiler-analysis ¨ Run-time monitoring ¨ CS 5204 – Operating Systems 11

Threads vs. Events Capriccio – user level threads yield Capriccio kernel n n n

Threads vs. Events Capriccio – user level threads yield Capriccio kernel n n n User-level threading with fast context switch Cooperative scheduling (via yielding) Thread management costs independent of number of threads (except for sleep queue) n n Capriccio polling scheduler asynch I/O scheduler kernel Intercepts and converts blocking I/O into asynchronous I/O Does polling to determine I/O completion CS 5204 – Operating Systems 12

Threads vs. Events Compiler Analysis - Checkpoints n n Call graph – each node

Threads vs. Events Compiler Analysis - Checkpoints n n Call graph – each node is a procedure annotated with maximum stack size needed to execute that procedure; each edge represents a call Maximum stack size for thread executing call graph cannot be determined statically ¨ ¨ n Insert checkpoints to allocate additional stack space (“chunk”) dynamically ¨ ¨ ¨ n Recursion (cycles in graph) Sub-optimal allocation (different paths may require substantially different stack sizes) On entry (e. g. , CO) On each back-edge (e. g. C 1) On each edge where the needed (maximum) stack space to reach a leaf node or the next checkpoints exceeds a given limit ( Max. Path) (e. g. , C 2 and C 3 if limit is 1 KB) Checkpoint code added by source-source translation CS 5204 – Operating Systems 13

Threads vs. Events Linked Stacks n main n ¨ ¨ A B Thread stack

Threads vs. Events Linked Stacks n main n ¨ ¨ A B Thread stack is collection of non-contiguous blocks (‘chunks”) Min. Chunk: smallest stack block allocated Stack blocks “linked” by saving stack pointer for “old” block in field of “new” block; frame pointer remains unchanged Two kinds of wasted memory n Two controlling parameters ¨ ¨ C n Internal (within a block) (yellow) External (in last block) (blue) Max. Path: tradeoff between amount of instrumentation and runtime overhead vs. internal memory waste Min. Chunk: tradeoff between internal memory waste and external memory waste Memory advantages ¨ ¨ Avoids pre-allocation of large stacks Improves paging behavior by (1) leveraging LIFO stack usage pattern to share chunks among threads and (2) placing multiple chunks on the same page CS 5204 – Operating Systems 14

Threads vs. Events Resource-aware scheduling n Blocking graph ¨ ¨ n Blocking graph formed

Threads vs. Events Resource-aware scheduling n Blocking graph ¨ ¨ n Blocking graph formed dynamically ¨ n Edge – exponentially weighted average resource usage Node – weighted average of its edge values (average resource usage of next edge) Resources – CPU, memory, stack, sockets Resource-aware scheduling: ¨ ¨ n Appropriate for long-running program (e. g. web servers) Scheduling annotations ¨ ¨ ¨ n Nodes are points where the program blocks Arcs connect successive blocking points Dynamically prioritize nodes/threads based on whether the thread will increase or decrease its use of each resource When a resource is scarce, schedule threads that release that resource Limitations ¨ ¨ ¨ Difficult to determine the maximum capacity of a resource Application-managed resources cannot be seen Applications that do not yield CS 5204 – Operating Systems 15

Threads vs. Events Performance comparison n Apache – standard distribution Haboob – event-based web

Threads vs. Events Performance comparison n Apache – standard distribution Haboob – event-based web server Knot – simple, threaded specially developed web server CS 5204 – Operating Systems 16

Threads vs. Events SEDA – Staged Event-Driven Architecture n Goals ¨ Massive concurrency n

Threads vs. Events SEDA – Staged Event-Driven Architecture n Goals ¨ Massive concurrency n n n ¨ Simplify constructing well-conditioned services n “well conditioned”: behaves like a simple pipeline offers graceful degradation, maintaining high throughput as load exceeds capacity provides modular architecture (defining and interconnecting “stages”) hides resource management details n ability to analyze and adapt to the request stream n thread pool sizing dynamic event scheduling n n n ¨ ¨ Introspection Self-tuning resource management n n required for heavily used web servers large spikes in load (100 x increase in demand) requires efficient, non-blocking I/O Hybrid model ¨ combines threads (within stages) and events (between stages) CS 5204 – Operating Systems 17

Threads vs. Events SEDA’s point of view Thread model and performance Event model and

Threads vs. Events SEDA’s point of view Thread model and performance Event model and performance CS 5204 – Operating Systems 18

Threads vs. Events SEDA - structure n n Event queue – holds incoming requests

Threads vs. Events SEDA - structure n n Event queue – holds incoming requests Thread pool ¨ ¨ n Event handler ¨ ¨ ¨ n takes requests from event queue and invokes event handler Limited number of threads per stage Application defined Performs application processing and possibly generates events for other stages Does not manage thread pool or event queue Controller – performs scheduling and thread management CS 5204 – Operating Systems 19

Threads vs. Events Resource Controllers Thread pool controller n Thread added (up to a

Threads vs. Events Resource Controllers Thread pool controller n Thread added (up to a maximum) when event queue exceeds threshold n Thread deleted when idle for a given period Batching controller n Adjusts batching factor: the number of event processed at a time n High batching factor improves throughput n Low batching factor improves response time n Goal: find lowest batching factor that sustains high throughput CS 5204 – Operating Systems 20

Threads vs. Events Asynchronous Socket layer n n Implemented as a set of SEDA

Threads vs. Events Asynchronous Socket layer n n Implemented as a set of SEDA stages Each asynch. Socket stage has two event queues Thread in each stage serves each queue alternately based on time-out Similar use of stages for file I/O CS 5204 – Operating Systems 21

Threads vs. Events Performance n Apache ¨ n Flash ¨ ¨ n process-per-request design

Threads vs. Events Performance n Apache ¨ n Flash ¨ ¨ n process-per-request design event-drived design one process handling most tasks Haboob ¨ SEDA-based design Fairness n n n Measure of number of requests completed per client Value of 1 indicates equal treatment of clients Value of k/N indicates k clients received equal treatment and n-k clients received no service CS 5204 – Operating Systems 22

Threads vs. Events TAME • expressive abstractions for event-based programming • implemented via source-source

Threads vs. Events TAME • expressive abstractions for event-based programming • implemented via source-source translation • avoids stack ripping • type safety and composability via templates M. Krohn, E. Kohler, M. F. Kaashoek, “Events Can Make Sense, ” USENIX Annual Technical Conference, 2007, pp. 87 -100. CS 5204 – Operating Systems 23

Threads vs. Events A typical thread programming problem c f (blocking, synchronous operation, e.

Threads vs. Events A typical thread programming problem c f (blocking, synchronous operation, e. g. , I/O) Problem: the thread becomes blocked in the called routine (f) and the caller (c) is unable to continue even if it logically is able to do so. CS 5204 – Operating Systems 24

Threads vs. Events A partial solution c f (non-blocking, asynchronous operation, e. g. ,

Threads vs. Events A partial solution c f (non-blocking, asynchronous operation, e. g. , I/O) register handler (signal + data) Issues • Synchronization: how does the caller know when the signal has occurred without busy-waiting? • Data: how does the caller know what data resulted from the operation? CS 5204 – Operating Systems 25

Threads vs. Events A “Tame” solution c f e (1) (2) (4) (5) (11)

Threads vs. Events A “Tame” solution c f e (1) (2) (4) (5) (11) (3) (non-blocking, asynchronous operation, e. g. , I/O) (10) r: <I> slot <T> a; (9) e(a): <T> handler (7) a <- data wait point <I> (8) e. trigger(data) rendezvous<I> CS 5204 – Operating Systems <T> (signal + data) (6) event<T> 26

Threads vs. Events Tame Primitives CS 5204 – Operating Systems 27

Threads vs. Events Tame Primitives CS 5204 – Operating Systems 27

Threads vs. Events An example tamed gethost_ev(dsname, event<ipaddr> e); CS 5204 – Operating Systems

Threads vs. Events An example tamed gethost_ev(dsname, event<ipaddr> e); CS 5204 – Operating Systems 28

Threads vs. Events Variations on control flow parallel control flow window/pipeline control flow CS

Threads vs. Events Variations on control flow parallel control flow window/pipeline control flow CS 5204 – Operating Systems 29

Threads vs. Events Event IDs & Composability CS 5204 – Operating Systems 30

Threads vs. Events Event IDs & Composability CS 5204 – Operating Systems 30

Threads vs. Events Closures f( …params…) { copy rendezvous<> r; tvars { …locals…}; copy

Threads vs. Events Closures f( …params…) { copy rendezvous<> r; tvars { …locals…}; copy closure twait(r); continue_here: r: } Smart pointers and reference counting insure correct deallocation of events, redezvous, and closures. CS 5204 – Operating Systems 31

Threads vs. Events Performance (relative to Capriccio) CS 5204 – Operating Systems 32

Threads vs. Events Performance (relative to Capriccio) CS 5204 – Operating Systems 32