Assessment of Data Path Implementations for Download and Streaming
Pål Halvorsen 1,2, Tom Anders Dalseng 1 and Carsten Griwodz 1,2
1 Department of Informatics, University of Oslo, Norway
2 Simula Research Laboratory, Norway
International Conference on Distributed Multimedia Systems (DMS'05), Banff, Canada, September 2005
Overview
- Motivation
- Existing mechanisms in Linux
- Possible enhancements
- Summary and conclusions
Delivery Systems
[Figure: delivery system; labels: network, bus(es)]
Delivery Systems
[Figure: application in user space; file system and communication system in kernel space; bus(es) below]
Intel Hub Architecture
[Figure: Pentium 4 processor (registers, cache(s)); memory controller hub; RDRAM banks holding file system, communication system and application data; I/O controller hub; PCI slots with disk and network card]
- several in-memory data movements and context switches
Motivation
- Data copy operations are expensive
  - they consume CPU, memory, hub, bus and interface resources (proportional to data size)
  - profiling shows that ~40% of CPU time is consumed by copying data between user and kernel space
  - the gap between memory and CPU speeds is increasing
  - different memory banks have different access times
- System calls cause many switches between user and kernel space
Basic Idea of Zero-Copy Data Paths
[Figure: application in user space; file system and communication system in kernel space, each labeled with a data_pointer; bus(es) below]
Motivation (continued)
- A lot of research has been performed in this area
- But what is the status today in commodity operating systems?
Existing Linux Data Paths
Content Download
[Figure: application in user space; file system and communication system in kernel space; bus(es) below]
Content Download: read / send
[Figure: DMA transfer into the page cache; read copies into the application buffer; send copies into the socket buffer; DMA transfer to the network card]
- 2n copy operations
- 2n system calls
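The read / send path on this slide can be sketched from user space as follows. This is a minimal illustration, not the code used in the measurements; the function name `read_send` and the chunk size are assumptions. Each iteration issues one read and (normally) one send, and each of them copies the chunk once, which gives the 2n copies and roughly 2n system calls for n chunks (plus the final read that reports end of file).

```python
import os
import socket

CHUNK = 4096  # hypothetical chunk size, not taken from the slides


def read_send(path: str, sock: socket.socket, chunk: int = CHUNK) -> int:
    """Send a file over a connected socket with read()/send().

    Returns the number of chunks sent.
    """
    chunks = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk)  # copy 1: page cache -> application buffer
            if not data:               # empty result means end of file
                break
            sock.sendall(data)         # copy 2: application buffer -> socket buffer
            chunks += 1
    finally:
        os.close(fd)
    return chunks
```

Note that `sendall` may issue more than one send call if the socket buffer is full, so the call count is a lower bound in practice.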
Content Download: mmap / send
[Figure: mmap maps the page cache into the application; DMA transfer into the page cache; send copies into the socket buffer; DMA transfer to the network card]
- n copy operations
- 1 + n system calls
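A comparable sketch for the mmap / send path, again with assumed names (`mmap_send`) and an assumed chunk size. The file is mapped once, and each send hands a slice of the mapping to the kernel, so the only per-chunk copy is from the mapped pages into the socket buffer.

```python
import mmap
import os
import socket

CHUNK = 4096  # hypothetical chunk size, not taken from the slides


def mmap_send(path: str, sock: socket.socket, chunk: int = CHUNK) -> int:
    """Send a file over a connected socket from an mmap'ed region.

    Returns the number of chunks sent.
    """
    chunks = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:  # an empty file cannot be mapped
            return 0
        # One mmap call for the whole file (the "1" in 1 + n).
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as view:
                for off in range(0, size, chunk):
                    # The slice is a view, not a copy; the kernel copies
                    # it from the mapped pages into the socket buffer.
                    sock.sendall(view[off:off + chunk])
                    chunks += 1
    finally:
        os.close(fd)
    return chunks
```

The slices are passed inline on purpose: a slice kept in a variable would still reference the mapping when it is closed.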
Content Download: sendfile
[Figure: DMA transfer into the page cache; sendfile appends a descriptor to the socket buffer; gather DMA transfer to the network card]
- 0 copy operations
- 1 system call
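The sendfile path can be sketched with Python's `os.sendfile`, which wraps the Linux sendfile system call. The function name `sendfile_download` is an assumption. The application never touches the data; whether the kernel really avoids all copies depends on the socket type and on network card support for gather DMA, so the zero-copy property is a property of the kernel path, not something this sketch can verify.

```python
import os
import socket


def sendfile_download(path: str, sock: socket.socket) -> int:
    """Send a whole file over a connected socket with sendfile().

    Returns the number of bytes handed to the socket.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        # Usually a single call is enough; the loop only covers the case
        # where the kernel transfers fewer bytes than requested.
        while offset < size:
            sent = os.sendfile(sock.fileno(), fd, offset, size - offset)
            if sent == 0:
                break
            offset += sent
    finally:
        os.close(fd)
    return offset
```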
Content Download: Results
- Tested transfer of a 1 GB file on Linux 2.6
- Both UDP (with enhancements) and TCP
[Figure: result plots; panels: UDP, TCP]
Streaming
[Figure: application in user space; file system and communication system in kernel space; bus(es) below]
Streaming: mmap / send
[Figure: mmap maps the page cache; per packet: cork, send (header from the application buffer, copy), send (payload from the page cache, copy), uncork; DMA transfers from disk and to the network card]
- 2n copy operations
- 1 + 4n system calls
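The four calls per packet on this slide can be sketched as below for RTP over UDP on Linux. This is an illustration under assumptions: the names `rtp_header` and `stream_mmap_send`, the payload size, the SSRC and payload type values, and the use of the byte offset as a placeholder timestamp are all hypothetical. Python's socket module may not export `UDP_CORK`, so the sketch falls back to the value 1 from `<linux/udp.h>`.

```python
import mmap
import os
import socket
import struct

UDP_CORK = getattr(socket, "UDP_CORK", 1)  # 1 in <linux/udp.h>
PAYLOAD = 1024  # hypothetical payload bytes per packet


def rtp_header(seq: int, timestamp: int, ssrc: int = 0x1234,
               payload_type: int = 96) -> bytes:
    """Build the 12-byte fixed RTP header (version 2, no CSRC, no extension)."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def stream_mmap_send(path: str, sock: socket.socket,
                     payload: int = PAYLOAD) -> int:
    """Stream a file as RTP packets over a connected UDP socket.

    Returns the number of packets sent.
    """
    packets = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return 0
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as view:
                for seq, off in enumerate(range(0, size, payload)):
                    # call 1: cork, so both sends end up in one datagram
                    sock.setsockopt(socket.IPPROTO_UDP, UDP_CORK, 1)
                    # call 2: header (placeholder timestamp = byte offset)
                    sock.send(rtp_header(seq, off))
                    # call 3: payload straight from the mapping
                    sock.send(view[off:off + payload])
                    # call 4: uncork, the datagram is sent now
                    sock.setsockopt(socket.IPPROTO_UDP, UDP_CORK, 0)
                    packets += 1
    finally:
        os.close(fd)
    return packets
```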
Streaming: mmap / writev
[Figure: mmap maps the page cache; per packet: writev gathers the header from the application buffer and the payload from the page cache, copying both into the socket buffer; DMA transfers from disk and to the network card]
- 2n copy operations
- 1 + n system calls (the previous solution requires three more calls per packet)
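With writev, the header and the payload slice are gathered in a single call, which is where the saving of three calls per packet comes from. The sketch below uses `os.writev` on the socket's file descriptor; the names `rtp_header` and `stream_mmap_writev`, the payload size, and the header field values are assumptions for illustration.

```python
import mmap
import os
import socket
import struct

PAYLOAD = 1024  # hypothetical payload bytes per packet


def rtp_header(seq: int, timestamp: int, ssrc: int = 0x1234,
               payload_type: int = 96) -> bytes:
    """Build the 12-byte fixed RTP header (version 2, no CSRC, no extension)."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def stream_mmap_writev(path: str, sock: socket.socket,
                       payload: int = PAYLOAD) -> int:
    """Stream a file as RTP packets over a connected datagram socket.

    Returns the number of packets sent.
    """
    packets = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return 0
        out = sock.fileno()
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as view:
                for seq, off in enumerate(range(0, size, payload)):
                    # One call per packet: header and payload are gathered
                    # by the kernel and copied into the socket buffer.
                    os.writev(out, [rtp_header(seq, off),
                                    view[off:off + payload]])
                    packets += 1
    finally:
        os.close(fd)
    return packets
```

On a datagram socket each writev call produces exactly one datagram, so no corking is needed.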
Streaming: sendfile
[Figure: per packet: cork, send (header from the application buffer, copy), sendfile (append descriptor for page-cache data), uncork; gather DMA transfer to the network card]
- n copy operations
- 4n system calls
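The sendfile variant of the streaming path can be sketched in the same way: only the header is copied from the application, and the payload is added with sendfile while the socket is corked. All names and constants (`rtp_header`, `stream_sendfile`, `PAYLOAD`, the SSRC and payload type) are assumptions, and the fallback value 1 for `UDP_CORK` is taken from `<linux/udp.h>`. Whether the payload really avoids a copy depends on the kernel version and the network device.

```python
import os
import socket
import struct

UDP_CORK = getattr(socket, "UDP_CORK", 1)  # 1 in <linux/udp.h>
PAYLOAD = 1024  # hypothetical payload bytes per packet


def rtp_header(seq: int, timestamp: int, ssrc: int = 0x1234,
               payload_type: int = 96) -> bytes:
    """Build the 12-byte fixed RTP header (version 2, no CSRC, no extension)."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)


def stream_sendfile(path: str, sock: socket.socket,
                    payload: int = PAYLOAD) -> int:
    """Stream a file as RTP packets over a connected UDP socket.

    Returns the number of packets sent.
    """
    packets = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        out = sock.fileno()
        for seq, off in enumerate(range(0, size, payload)):
            count = min(payload, size - off)
            sock.setsockopt(socket.IPPROTO_UDP, UDP_CORK, 1)  # call 1: cork
            sock.send(rtp_header(seq, off))                   # call 2: header
            sent = 0
            while sent < count:                               # call 3: payload
                n = os.sendfile(out, fd, off + sent, count - sent)
                if n == 0:
                    break
                sent += n
            sock.setsockopt(socket.IPPROTO_UDP, UDP_CORK, 0)  # call 4: uncork
            packets += 1
    finally:
        os.close(fd)
    return packets
```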
Streaming: Results
- Tested streaming of a 1 GB file on Linux 2.6
- RTP over UDP
- Compared to not sending an RTP header over UDP, we get an increase of 29% (additional send call)
- More copy operations and system calls are required, so there is potential for improvement
[Figure: result plot; reference: TCP sendfile (content download)]
Enhanced Streaming Data Paths
Enhanced Streaming: mmap / msend
- msend allows sending data from an mmap'ed file without a copy
[Figure: mmap maps the page cache; per packet: cork, send (header from the application buffer, copy), msend (append descriptor for page-cache data), uncork; gather DMA transfer to the network card]
- n copy operations (the previous solution requires one more copy per packet)
- 1 + 4n system calls
Enhanced Streaming: mmap / rtpmsend
- RTP header copy integrated into the msend system call
[Figure: mmap maps the page cache; per packet: rtpmsend replaces cork, send, msend and uncork; header copied from the application buffer, descriptor appended for page-cache data; gather DMA transfer to the network card]
- n copy operations
- 1 + n system calls (the previous solution requires three more calls per packet)
Enhanced Streaming: mmap / krtpmsend
- An RTP engine in the kernel adds the RTP headers
[Figure: krtpmsend replaces rtpmsend; the in-kernel RTP engine supplies the header, descriptor appended for page-cache data; gather DMA transfer to the network card]
- 0 copy operations (the previous solution requires one more copy per packet)
- 1 system call (the previous solution requires one more call per packet)
Enhanced Streaming: rtpsendfile
- RTP header copy integrated into the sendfile system call
[Figure: per packet: rtpsendfile replaces cork, send, sendfile and uncork; header copied from the application buffer, descriptor appended for page-cache data; gather DMA transfer to the network card]
- n copy operations
- n system calls (the existing solution requires three more calls per packet)
Enhanced Streaming: krtpsendfile
- An RTP engine in the kernel adds the RTP headers
[Figure: krtpsendfile replaces rtpsendfile; the in-kernel RTP engine supplies the header, descriptor appended for page-cache data; gather DMA transfer to the network card]
- 0 copy operations (the previous solution requires one more copy per packet)
- 1 system call (the previous solution requires one more call per packet)
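The copy and system call counts stated on the streaming slides can be collected in a small model to compare the mechanisms for a given number of packets n. The counts are taken from the slides; the structure (`DataPath`, `STREAMING_PATHS`, `overhead`) is only an illustrative assumption, and the model says nothing about the measured throughput.

```python
from typing import Callable, Dict, NamedTuple, Tuple


class DataPath(NamedTuple):
    name: str
    copies: Callable[[int], int]    # copy operations for n packets
    syscalls: Callable[[int], int]  # system calls for n packets


# Counts as stated on the slides; n is the number of packets.
STREAMING_PATHS = [
    DataPath("mmap/send", lambda n: 2 * n, lambda n: 1 + 4 * n),
    DataPath("mmap/writev", lambda n: 2 * n, lambda n: 1 + n),
    DataPath("sendfile", lambda n: n, lambda n: 4 * n),
    DataPath("mmap/msend", lambda n: n, lambda n: 1 + 4 * n),
    DataPath("mmap/rtpmsend", lambda n: n, lambda n: 1 + n),
    DataPath("mmap/krtpmsend", lambda n: 0, lambda n: 1),
    DataPath("rtpsendfile", lambda n: n, lambda n: n),
    DataPath("krtpsendfile", lambda n: 0, lambda n: 1),
]


def overhead(n: int) -> Dict[str, Tuple[int, int]]:
    """Return {mechanism: (copy operations, system calls)} for n packets."""
    return {p.name: (p.copies(n), p.syscalls(n)) for p in STREAMING_PATHS}
```

For example, `overhead(1000)["sendfile"]` gives 1000 copies and 4000 system calls, while `overhead(1000)["rtpsendfile"]` gives 1000 copies and 1000 system calls.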
Enhanced Streaming: Results
- Tested streaming of a 1 GB file on Linux 2.6
- RTP over UDP
[Figure: result plot; groups: mmap based mechanisms, sendfile based mechanisms; references: existing mechanism (streaming), TCP sendfile (content download); annotations: ~27% improvement, ~25% improvement]
Conclusions
- Current commodity operating systems still pay a high price for streaming services
- However, small changes in the system call layer might be sufficient to remove most of the overhead
- In conclusion, commodity operating systems still have potential for improvement with respect to streaming support
- What can we hope to be supported?
- Road ahead: optimize the code, make a patch and submit it to kernel.org
Questions?