Scalability of Linux Event-Dispatch Mechanisms
Abhishek Chandra, University of Massachusetts Amherst, Computer Science
David Mosberger, Hewlett Packard Labs, Palo Alto

Motivation
- Large volume of Web and Internet traffic
- Heavily loaded/accessed Web servers
  - cnn.com, britneyspearsfans.com, …
  - Starr Report, Napster ruling, ...
- Challenge: make Web servers scalable
[Figure: Clients, WAN, Web Server]

Server Scalability Issues
- Large number of concurrent/idle connections
  - Last-mile problem: slow end-connections
  - High-latency WAN traffic
  - HTTP/1.1 persistent connections
- Heavy request loads
  - Need for high throughput
- Pure thread-based vs. event-based servers
- Focus: scalability of event-based servers on Linux

Outline
✓ Motivation
- Event-based Servers
- Linux Event-Dispatch Mechanisms
- Evaluation: Handling concurrent connections
- RT signals and Signal-per-fd enhancement
- Evaluation: Handling request load
- Concluding Remarks

Event-Based Servers
[Figure: Server and Kernel, with Interest Set and Connections. Steps: 1. Interest Set Specification, 2. Network Event, 3. Event Notification, 4. I/O Handling]
- Server specifies the Interest Set to the Kernel
- Kernel notifies the Server of an event on a connection
- Server handles I/O on the connection

Linux Event-Dispatch Mechanisms
- select() system call
- poll() system call
- /dev/poll interface
- POSIX.4 Real-Time Signals

select() system call
[Figure: Server passes the Interest Set to the Kernel; the Kernel scans the Connections and returns the Ready Set]
- Interest Set is specified on each call
- Notification requires a scan of the Interest Set

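The two properties above can be seen in a minimal sketch using Python's `select.select()` wrapper around the system call. The socket pairs standing in for connections, the helper name `ready_sockets`, and the request bytes are all assumptions made for illustration; they are not taken from the slides or from the μ-server.

```python
import select
import socket


def ready_sockets(interest_set, timeout=0.0):
    """Return the members of interest_set that select() reports readable."""
    # The complete interest set is handed to the kernel on every call,
    # and the kernel scans all of it to build the ready set.
    readable, _, _ = select.select(interest_set, [], [], timeout)
    return readable


# Three hypothetical "connections", modelled as local socket pairs.
pairs = [socket.socketpair() for _ in range(3)]
interest_set = [server_end for server_end, _ in pairs]

# Simulate a network event on the second connection only.
pairs[1][1].sendall(b"GET / HTTP/1.0\r\n\r\n")

ready = ready_sockets(interest_set, timeout=1.0)
ready_indexes = [interest_set.index(sock) for sock in ready]
print(ready_indexes)

for server_end, client_end in pairs:
    server_end.close()
    client_end.close()
```

Only the connection with pending data appears in the ready set, but the caller still had to pass all three sockets, which is the per-call cost the slide refers to.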
poll() and /dev/poll
- Interest Set:
  - List of pollfd structures
    - Better for sparse interest sets, worse for dense sets
- Notification:
  - Requires a scan of the Interest Set
- /dev/poll:
  - Interest Set specified incrementally
  - More compact ready set

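As a rough illustration of the pollfd-style interest set, the sketch below uses Python's `select.poll()` object. It covers poll() only, not /dev/poll. The socket pairs and the `fd_to_index` map are hypothetical. Note that Python's poll object merely remembers registrations in user space; the whole list is still passed to the kernel on each `poll()` call.

```python
import select
import socket

# Three hypothetical "connections", modelled as local socket pairs.
pairs = [socket.socketpair() for _ in range(3)]

poller = select.poll()
fd_to_index = {}
for index, (server_end, _) in enumerate(pairs):
    # Each registration is one (fd, event mask) entry, like a pollfd struct.
    poller.register(server_end.fileno(), select.POLLIN)
    fd_to_index[server_end.fileno()] = index

# Simulate network events on the first and third connections.
pairs[0][1].sendall(b"a")
pairs[2][1].sendall(b"c")

# poll() returns only the descriptors with events, as (fd, mask) pairs.
events = poller.poll(1000)  # timeout in milliseconds
poll_ready = sorted(
    fd_to_index[fd] for fd, mask in events if mask & select.POLLIN
)
print(poll_ready)

for server_end, client_end in pairs:
    server_end.close()
    client_end.close()
```

The interest set here has one entry per descriptor of interest rather than one bit per possible descriptor, which is why the slide calls it better for sparse sets and worse for dense ones.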
POSIX.4 Real-Time Signals
- RT signals are queued
  - Multiple signals of the same type can be delivered
- RT signals carry a data payload (siginfo)
  - Provides the context of the signal
- sigwaitinfo() system call:
  - Dequeues signals
  - Avoids the overhead of calling a signal handler
  - Signal can be blocked

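The queuing and dequeuing behaviour can be tried with a small Linux-only sketch built on Python's `signal` module. It is an illustration under assumptions, not the server's code: the signal is sent to the current thread with `pthread_kill()`, and `sigtimedwait()` with a zero timeout stands in for a non-blocking sigwaitinfo(). Python cannot attach a custom payload here, so only the `si_signo` field of the returned siginfo is inspected.

```python
import signal
import threading

RT_SIG = signal.SIGRTMIN  # first real-time signal number (Linux only)

# Block the signal so that it is queued instead of invoking a handler.
signal.pthread_sigmask(signal.SIG_BLOCK, {RT_SIG})

# Send the same RT signal three times to the current thread.
for _ in range(3):
    signal.pthread_kill(threading.get_ident(), RT_SIG)

dequeued = []
while True:
    # Dequeue one pending signal; returns None once the queue is empty.
    info = signal.sigtimedwait({RT_SIG}, 0)
    if info is None:
        break
    dequeued.append(info.si_signo)

signal.pthread_sigmask(signal.SIG_UNBLOCK, {RT_SIG})
print(len(dequeued))
```

All three instances are retrieved, whereas a classic (non-RT) signal sent three times while blocked would be collapsed into a single pending instance.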
Using Real-Time Signals for Network I/O
[Figure: Server and Kernel, with Interest Set, Sockets, and Signal Queue. Steps: 1. Associate RT Signal, 2. Network Event, 3. Enqueue Signal, 4. sigwaitinfo(), 5. Dequeue Signal]
- Interest Set specified incrementally
- No scanning of the Interest Set required

Outline
✓ Motivation
✓ Event-based Servers
✓ Linux Event-Dispatch Mechanisms
- Evaluation: Handling concurrent connections
- RT signals and Signal-per-fd enhancement
- Evaluation: Handling request load
- Concluding Remarks

Evaluation: Handling Concurrent Connections
- Dispatch overhead and latency as a function of the number of concurrent connections
- Experimental setup
  - 400 MHz P3 server running Linux 2.4.0-test7
  - μ-server using select(), /dev/poll, or RT signals
  - 10 clients running httperf
  - Fixed request rate, increasing number of connections

Server CPU Usage (500 req/s)
- RT signal overhead is independent of the number of concurrent connections

Response Time (500 req/s)
- RT signal response time is independent of the number of concurrent connections

Limitations of Real-Time Signals
[Figure: Server and Kernel, with Interest Set, Sockets 1 to 4, and a Signal Queue holding signals for sockets 3, 3, 3, 2; the signal for a new Network Event is dropped]
- Signal queue overflow:
  - New events lost
  - Can lead to a "hung server"
- Unfair allocation of the signal queue

Handling Signal Queue Overflow
- Fallback mechanism: select(), poll(), etc.
  - Reconstruct current state
- Issues
  - Server complexity
  - Overhead of maintaining explicit interest sets
  - Potential performance penalty

RT Signal Enhancement: Signal-per-fd
- Goals:
  - Avoid signal queue overflows
  - Fair allocation of the signal queue
- Solution: enqueue only one signal per socket
[Figure: Server and Kernel, with Interest Set, Sockets 1 to 4, and a Signal Queue holding one signal each for sockets 1, 3, 2; a repeated Network Event on an already-queued socket is discarded]

Signal-per-fd
- Idea:
  - Signal queue length same as fdset size
  - Bitmap used to efficiently determine presence/absence of a signal in the queue
- Advantages:
  - Simpler server implementation
    - No signal queue overflows
    - No need for fallback mechanisms
  - Fair allocation of the signal queue resource
  - Avoids too fine-grained event notification
    - Coalesces multiple events for a socket

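The bitmap idea can be modelled in user space with a short sketch. This is only a simulation of the queuing policy described above, not the kernel patch; the class name `SignalPerFdQueue` and its methods are hypothetical.

```python
from collections import deque


class SignalPerFdQueue:
    """User-space model: at most one pending entry per descriptor."""

    def __init__(self, fdset_size):
        self.fdset_size = fdset_size
        self.pending_bitmap = 0  # bit n set means fd n is already queued
        self.queue = deque()     # can never grow beyond fdset_size entries

    def enqueue(self, fd):
        """Record an event on fd; return False if it was coalesced."""
        if not 0 <= fd < self.fdset_size:
            raise ValueError("fd outside fdset")
        bit = 1 << fd
        if self.pending_bitmap & bit:
            return False  # a signal for this fd is already queued
        self.pending_bitmap |= bit
        self.queue.append(fd)
        return True

    def dequeue(self):
        """Return the next fd with pending events, or None if empty."""
        if not self.queue:
            return None
        fd = self.queue.popleft()
        self.pending_bitmap &= ~(1 << fd)
        return fd


spf_queue = SignalPerFdQueue(fdset_size=4)
# Socket 3 is very active; its repeated events are coalesced.
accepted = [spf_queue.enqueue(fd) for fd in (3, 3, 3, 2, 1)]

drained = []
next_fd = spf_queue.dequeue()
while next_fd is not None:
    drained.append(next_fd)
    next_fd = spf_queue.dequeue()
print(accepted, drained)
```

Because each descriptor occupies at most one slot, a busy socket cannot crowd out the others, and the queue cannot overflow as long as it has one slot per descriptor in the fdset.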
Outline
✓ Motivation
✓ Event-based Servers
✓ Linux Event-Dispatch Mechanisms
✓ Evaluation: Handling concurrent connections
✓ RT signals and Signal-per-fd enhancement
- Evaluation: Handling request load
- Concluding Remarks

Server Throughput (6000 idle connections)
- Linear scaling of RT signals and signal-per-fd

Server CPU Usage (6000 idle connections)
- Linear scaling of RT signals and signal-per-fd

Related Work
- Event-Delivery API [BMD99]
- Performance studies:
  - select() [BM98], /dev/poll [PL00]
  - RT signals [PLT00]
- Web servers:
  - Event-based: Flash [PDZ99], phhttpd [Brown99]
  - In-kernel: TUX, khttpd, AFPA [JKNRT01]
- Future: Linux 2.5 asynchronous I/O?

Summary
- Scalability issues with Linux event-dispatch mechanisms
- Real-time signals are scalable
  - Performance independent of the number of concurrent connections
  - Signal queue overflow problems
- Signal-per-fd enhancement
  - Potentially improves performance
  - Reduces server complexity
  - Provides fairness
- Patch available at http://www.netli.com/links