DiskRouter: A Mechanism for High Performance Large Scale Data Transfers

George Kola
Computer Sciences Department
University of Wisconsin-Madison
kola@cs.wisc.edu
http://www.cs.wisc.edu/condor

Outline
› Problem
› DiskRouter Overview
› Details
› Real Life DiskRouters
› Experiments

Problem
SDSC to NCSA
› Bottleneck bandwidth: 12.5 MBPS
› Latency: 67 ms
Transfer rate achieved by applications for a 1 GB file:
› Scp: 0.66 MBPS
› GridFTP (1 stream): 0.85 MBPS
› GridFTP (10 streams): 3.52 MBPS
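
The gap between the bottleneck bandwidth and the rates above can be related to the bandwidth-delay product of the path. The sketch below is only a back-of-the-envelope calculation, not something taken from the slides: it assumes that MBPS means megabytes per second, that the 67 ms figure is a round-trip time, and it uses a hypothetical 64 KiB TCP window to show how a small window caps a single stream.

```python
# Back-of-the-envelope sketch. Assumptions (not stated on the slide):
#   - MBPS means megabytes per second
#   - the 67 ms latency is a round-trip time
#   - a single stream uses a hypothetical 64 KiB TCP window

BOTTLENECK_MB_PER_S = 12.5   # from the slide
RTT_S = 0.067                # from the slide, assumed to be round-trip


def bandwidth_delay_product(bandwidth_mb_per_s: float, rtt_s: float) -> float:
    """Data (in MB) that must be in flight to keep the path full."""
    return bandwidth_mb_per_s * rtt_s


def window_limited_rate(window_mb: float, rtt_s: float) -> float:
    """Upper bound (in MB/s) on one stream that sends one window per RTT."""
    return window_mb / rtt_s


bdp_mb = bandwidth_delay_product(BOTTLENECK_MB_PER_S, RTT_S)

hypothetical_window_mb = 64 / 1024  # 64 KiB expressed in MB
single_stream_cap = min(
    BOTTLENECK_MB_PER_S,
    window_limited_rate(hypothetical_window_mb, RTT_S),
)

print(f"Bandwidth-delay product: {bdp_mb:.4f} MB")
print(f"Cap for one 64 KiB-window stream: {single_stream_cap:.2f} MB/s")
```

Under these assumptions a single stream with a small window cannot come close to the bottleneck bandwidth, which is consistent with the motivation for using multiple streams and larger buffers, although the slides do not attribute the measured rates to any one cause.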

DiskRouter Overview
› Mechanism to efficiently move large amounts of data (order of terabytes)
› Uses disk as a buffer to aid in large scale data transfers
› Application-level overlay network used for routing
› Ability to use higher level knowledge for data movement

A Simple Case
[Diagram: A -> B]
A is transferring a large amount of data to B.

A Simple Case
[Diagram: A -> C (DiskRouter) -> B]
C is an intermediate node between A and B.

A Simple Case with DiskRouter
[Figure: nodes A, C, B; legend: With DiskRouter, Without DiskRouter]
DiskRouter improves performance when the bandwidth fluctuation between A and C is independent of the bandwidth fluctuation between C and B.
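
The effect of buffering at C can be illustrated with a toy interval model. This is a simplified sketch, not the DiskRouter implementation: it assumes fixed-length time intervals, an unlimited buffer at C, and that data arriving at C during an interval can be forwarded in that same interval. The bandwidth values are made up for illustration.

```python
# Toy model of an A -> C -> B transfer. All bandwidth numbers are
# hypothetical; the buffer at C is assumed to be unlimited.


def transfer_without_buffer(bw_ac, bw_cb):
    """Each interval is limited by the slower of the two hops."""
    return sum(min(a, b) for a, b in zip(bw_ac, bw_cb))


def transfer_with_buffer(bw_ac, bw_cb):
    """C stores whatever A can send and drains it at the C -> B rate."""
    buffered = 0.0
    delivered = 0.0
    for a, b in zip(bw_ac, bw_cb):
        buffered += a             # A fills the buffer at the A -> C rate
        sent = min(b, buffered)   # C forwards as much as C -> B allows
        buffered -= sent
        delivered += sent
    return delivered


# Fluctuations on the two hops are out of phase (independent-looking).
independent_ac = [10, 2, 10, 2, 10, 2]
independent_cb = [2, 10, 2, 10, 2, 10]

# Fluctuations on the two hops move together (fully correlated).
correlated_ac = [10, 2, 10, 2, 10, 2]
correlated_cb = [10, 2, 10, 2, 10, 2]

print(transfer_without_buffer(independent_ac, independent_cb))  # 12
print(transfer_with_buffer(independent_ac, independent_cb))     # 36.0
print(transfer_without_buffer(correlated_ac, correlated_cb))    # 36
print(transfer_with_buffer(correlated_ac, correlated_cb))       # 36.0
```

In this model the buffer triples the delivered data when the two hops fluctuate out of phase and makes no difference when they fluctuate together, which matches the condition stated on the slide.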

Data Mover/Distributed Cache
[Diagram: Source -> DiskRouter Cloud -> Destination]
The source writes to its closest DiskRouter, and the destination picks the data up from its closest DiskRouter.

Routing Between DiskRouters
[Diagram: DiskRouter A, DiskRouter B, DiskRouter C]
C need not be in the path between A and B.
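
The slides do not say how a route through the overlay is chosen. One plausible approach, shown here purely as an illustration, is a widest-path search that picks the overlay route with the largest bottleneck capacity. The function name `widest_path` and the link capacities are hypothetical and are not taken from DiskRouter.

```python
import heapq

# Illustrative only: widest-path (maximum-bottleneck) search over an
# overlay graph. Not DiskRouter's actual routing algorithm.


def widest_path(graph, source, target):
    """Return (path, bottleneck) maximizing the minimum link capacity."""
    best = {source: float("inf")}   # best known bottleneck to each node
    prev = {}                       # predecessor on the best path
    heap = [(-float("inf"), source)]
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if node == target:
            break
        if width < best.get(node, 0.0):
            continue                # stale heap entry
        for neighbor, capacity in graph.get(node, {}).items():
            candidate = min(width, capacity)
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (-candidate, neighbor))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[target]


# Hypothetical capacities in Mbps: the direct A-B link is slower than
# the detour through C, even though C is not on the direct path.
overlay = {
    "A": {"B": 30, "C": 90},
    "B": {"A": 30, "C": 90},
    "C": {"A": 90, "B": 90},
}

route, bottleneck = widest_path(overlay, "A", "B")
print(route, bottleneck)  # ['A', 'C', 'B'] 90
```

With these made-up capacities the search routes A's data through C, which shows why a DiskRouter that is off the direct path can still be the better relay.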

Network Monitoring
› Uses ‘Pathrate’ for estimating network capacity
› Performs actual transfers for measurement
› Logs the data rate seen by different components
› Generates network interface stats on the machines involved in the transfers

Implementation Details
› Uses multiple sockets and explicitly sets TCP buffer sizes
› Overlaps disk I/O and socket I/O
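
The two techniques above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not DiskRouter's code: `make_tuned_socket` and `send_overlapped` are hypothetical helper names, the chunk size and queue depth are arbitrary, and the use of multiple parallel sockets is left out for brevity. Note that the operating system may adjust or cap the requested buffer size.

```python
import io
import queue
import socket
import threading

CHUNK_SIZE = 1 << 20   # hypothetical 1 MiB read size
QUEUE_DEPTH = 8        # hypothetical number of chunks buffered in memory


def make_tuned_socket(buffer_bytes):
    """Create a TCP socket and request explicit send/receive buffer sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


def send_overlapped(source, send, chunk_size=CHUNK_SIZE):
    """Read from `source` in one thread while `send` runs in another.

    `source` is any object with read(n); `send` is any callable that
    accepts bytes, for example a connected socket's sendall method.
    """
    chunks = queue.Queue(maxsize=QUEUE_DEPTH)

    def reader():
        while True:
            data = source.read(chunk_size)
            chunks.put(data)          # an empty chunk marks end of input
            if not data:
                return

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0
    while True:
        data = chunks.get()
        if not data:
            break
        send(data)                    # the reader keeps reading meanwhile
        total += len(data)
    thread.join()
    return total


# Usage with in-memory stand-ins for the disk and the network.
payload = b"x" * (3 * 1024 + 5)
received = bytearray()
sent_bytes = send_overlapped(io.BytesIO(payload), received.extend, chunk_size=1024)
print(sent_bytes)
```

With a real transfer, `source` would be an open file and `send` would be `sock.sendall` on a connected socket returned by `make_tuned_socket`. The bounded queue keeps memory use fixed while letting the disk read of the next chunk proceed during the network send of the current one.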

Client Side
› Client library provided
› Applications can call library functions for network I/O
› Functions provided for common case file transfer (overlaps network I/O and disk I/O)
› Third party transfer support

Real Life DiskRouters
[Network diagram. Sites: UW Milwaukee, UW Madison, SDSC, NCSA, INFN Italy, StarLight, MCS ANL. Link labels in extraction order: 90 Mbps, 411 Mbps, 3.3 ms, 8 ms, 94 Mbps, 518 Mbps, 30 Mbps, 2.7 ms, 67 ms, 126.6 ms, 90 Mbps, 514 Mbps, 5.5 ms, 0.85 ms]

Testing Multiroute
[Network diagram. Sites: UW Milwaukee, UW Madison, StarLight. Link labels in extraction order: 90 Mbps, 411 Mbps, 3.3 ms, 8 ms, 90 Mbps, 5.5 ms]

Multiroute Improves Performance
[Plot, y-axis in megabits/second. Series: Total Data into StarLight, Data From Milwaukee, Data From Madison]

SRB to Unitree Transfer Using Stork
› Data movement from SDSC to NCSA via StarLight (3 TB of data had to be moved)
› Integrated into Stork
› Found significant performance gain

Link between SDSC and NCSA
[Network diagram. Sites: SDSC, StarLight. Link labels in extraction order: 94 Mbps, 2.7 ms, 518 Mbps, 67 ms]

StarLight DiskRouter Stats
[Plots: Data Inflow, Data Outflow, Memory Used, Disk Used]

GridFTP vs DiskRouter
[Plot: End-to-End Data Rate Seen by Stork (MBPS) vs. Time, y-axis in megabytes/second. Series: DiskRouter, GridFTP]

A Glimpse of Performance
Transfer of 1 GB file from SDSC (San Diego) to NCSA (Urbana-Champaign)

Tool                   Transfer Rate
Scp                    0.66 MBPS
GridFTP (1 stream)     0.85 MBPS
GridFTP (10 streams)   3.52 MBPS
DiskRouter             10.77 MBPS

Work In Progress
› Computation on data streams in the DiskRouter
› Ability to perform computation in the nodes attached locally to the DiskRouter
› Working together with Stork to add intelligence to data movement

Questions
› Thanks for listening