ROBIN Project Update
Dr. Wenji Wu (wenji@fnal.gov), Fermilab
LHCOPN-LHCONE meeting #46, Monday, February 14, 2022
Many people’s hard work
• FNAL: Wenji Wu, Liang Zhang, Qiming Lu, Amy Jin, Phil DeMar
• iCAIR/StarLight: Joe Mambretti, Se-young Yu, Fei Yeh, Jim-Hao Chen
• ESnet: Inder Monga, Xi Yang, Tom Lehman, Chin Guok, John Macauley
Outline
• Objectives
• Evaluation Methodology
• Testbed Features and Configurations
• ROBIN Deployment and Configurations
• Rucio/FTS Deployment and Configurations
• Results
• Conclusion and Future Plans
Objectives: Evaluate and compare two data service platforms, Rucio/BigData Express/SENSE (ROBIN) vs. Rucio/FTS
• The next-generation data service platform: Rucio/BigData Express/SENSE (ROBIN)
• The existing data service platform: Rucio/FTS
Evaluation Methodology - 1
• Deploy ROBIN and Rucio/FTS on a trans-Atlantic international testbed to evaluate and compare them
• A trans-Atlantic international testbed
  • Two administratively independent sites
    • The StarLight International/National Communication Exchange Facility in Chicago
    • The CERNLight Open Exchange in Switzerland
  • A dedicated layer-2 WAN circuit connects the two sites
Evaluation Methodology - 2
• Run data transfers between the StarLight and CERN DTNs
• Performance metric: data transfer throughput
• Five scenarios
  • Single large file: 20 GB
  • Group of small files: Linux source tree 4.4.9 (718 MB total, 53351 files, max. 2 MB per file)
  • Dataset-10%: mix of large (10 GB), medium (10 MB), and small (1 MB) files; total size 200 GB, with small files making up 10% of the total size
  • Dataset-20%: mix of large (10 GB), medium (10 MB), and small (1 MB) files; total size 200 GB, with small files making up 20% of the total size
  • Dataset-40%: mix of large (10 GB), medium (10 MB), and small (1 MB) files; total size 200 GB, with small files making up 40% of the total size

File counts per dataset:
               # of small files   # of medium files   # of large files
Dataset-10%    20000              8000                10
Dataset-20%    40000                                  8
Dataset-40%    80000                                  6
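The dataset composition can be checked with a little arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), which the slide does not state, and uses only the file counts and nominal file sizes listed above. The full-size check is done for Dataset-10% only, the one row with all three counts; the function names are illustrative.

```python
# Assumption: decimal units; the slide does not say which convention it uses.
MB = 10**6
GB = 10**9

SMALL, MEDIUM, LARGE = 1 * MB, 10 * MB, 10 * GB  # nominal file sizes from the slide
TOTAL = 200 * GB                                 # stated total size of each dataset


def total_bytes(n_small, n_medium, n_large):
    """Total dataset size in bytes for the given file counts."""
    return n_small * SMALL + n_medium * MEDIUM + n_large * LARGE


def small_share(n_small, total=TOTAL):
    """Fraction of the dataset's bytes held in small files."""
    return n_small * SMALL / total


# Dataset-10%: 20 GB small + 80 GB medium + 100 GB large
print(total_bytes(20000, 8000, 10) / GB, "GB")

# Small-file share by size for the three mixed datasets
for name, n_small in [("Dataset-10%", 20000), ("Dataset-20%", 40000), ("Dataset-40%", 80000)]:
    print(name, f"{small_share(n_small):.0%}")
```

Under these assumptions the percentage in each dataset name is the share of bytes held in small files, not the share of file count.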
Testbed Features and Configurations – DTN

CERN:
• Role: DTN
• Host name: dtn04.cern.ch
• Public network: 192.91.245.29 (enp1s0), 1 Gb/s
• Private network: 10.250.38.200 (enp4s0f0.2038@enp4s0f0), 40 Gb/s
• Storage: /dev/sdc (SSD)

StarLight:
• Role: DTN
• Host name: dtn110.sl.startap.net
• Public network: 165.124.33.142 (management), 1 Gb/s
• Private network: 10.250.38.53 (vlan2038@p4p1), 100 Gb/s
• Storage: /dev/nvme0n1 (SSD)
Testbed Features and Configurations – DTN (cont.)
Run “dd” to benchmark DTN disk performance

Host                    Disk                       Throughput
dtn04.cern.ch           /dev/sdc (SSD)             259.9 MB/s = 2.079 Gb/s
dtn110.sl.startap.net   /dev/nvme0n1 (NVMe SSD)    1091 MB/s = 8.728 Gb/s

[Screenshot on slide: testing scripts]
The SSD testing tool FIO shows similar metrics.
Testbed Features and Configurations – Network
Run “iperf” to benchmark network throughput between StarLight <--> CERN
• Tool: iperf3
• Source: dtn110.sl.startap.net
• Destination: dtn04.cern.ch
• Public – public network: 0.94 Gb/s = 120.3 MB/s
• Private – private network: 7.80 Gb/s = 998.4 MB/s

[Screenshots on slide: platform parameters, testing scripts]
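The disk and network benchmarks are quoted in both MB/s and Gb/s. The sketch below shows the conversion; the helper names are illustrative. Checking the arithmetic, the “dd” figures are consistent with 1 Gb = 1000 Mb, while the “iperf” figures are consistent with 1 Gb = 1024 Mb, so the factor is left as a parameter.

```python
def mbytes_to_gbits(mb_per_s, mbits_per_gbit=1000):
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 8 / mbits_per_gbit


def gbits_to_mbytes(gb_per_s, mbits_per_gbit=1000):
    """Convert gigabits per second to megabytes per second."""
    return gb_per_s * mbits_per_gbit / 8


# Disk benchmark figures: match the slide with a factor of 1000
print(round(mbytes_to_gbits(259.9), 3))   # dtn04.cern.ch
print(round(mbytes_to_gbits(1091), 3))    # dtn110.sl.startap.net

# Network benchmark figures: match the slide with a factor of 1024
print(round(gbits_to_mbytes(0.94, 1024), 1))  # public network
print(round(gbits_to_mbytes(7.80, 1024), 1))  # private network

# The same network rates with a factor of 1000, for comparison with the disk figures
print(round(gbits_to_mbytes(0.94), 1), round(gbits_to_mbytes(7.80), 1))
```

With a single convention (factor 1000), the private network path works out to about 975 MB/s, which is the number to compare against the 259.9 MB/s and 1091 MB/s disk results.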
ROBIN Deployment and Configuration
[Architecture diagram] Components shown:
• Terminals and Web UI
• Rucio Client @ mac-131933
• Rucio Server @ wwportal:8443 (Web server, Rucio Cores, Rucio Daemons, Postgres/SQL, BigData Express Plugin)
• BDE Head @ wwportal:5000 and BDE Head @ cixp-surfnet-dtn.cern.ch:5000 (BDE Portal, BDE Server, BDE Launcher, BDE Rucio Extension)
• BDE DTN @ dtn110.sl.startap.net and BDE DTN @ dtn04.cern.ch (BDE Agent, mdtmFTP client, mdtmFTP server, files)
• Labeled interactions: Rucio Client, BDE-Rucio, BDE/mdtmFTP
• Control plane; SENSE Service (data plane: VLAN 2038)

Location    Hostname
StarLight   wwportal
StarLight   dtn110.sl.startap.net
CERN        cixp-surfnet-dtn.cern.ch
CERN        dtn04.cern.ch
Home        mac-131933.local
Rucio/FTS Deployment and Configuration
[Architecture diagram] Components shown:
• Rucio Client @ mac-131933
• Rucio Server @ wwportal:8443 (Web server, Rucio Cores, Rucio Daemons, Postgres/SQL, FTS Client)
• FTS Service @ wwportal:8446 (FTS Server, gfal2, XRootD client)
• XRootD @ DTN dtn110.sl.startap.net and XRootD @ DTN dtn04.cern.ch (XRootD Server, files)
• Control plane; SENSE Service (data plane: VLAN 2038)

Location    Hostname
StarLight   wwportal
StarLight   dtn110.sl.startap.net
CERN        cixp-surfnet-dtn.cern.ch
CERN        dtn04.cern.ch
Home        mac-131933.local
ROBIN: Data Transfer Job Submission • ROBIN submits transfer jobs through Rucio CLI commands • Transfer jobs and related files are tracked and managed by Rucio services.
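The slides do not show the actual commands. As a rough illustration only, a transfer request in Rucio is typically expressed as a replication rule, and the sketch below builds such a command line without executing it. It assumes the classic Rucio client syntax `rucio add-rule <DID> <copies> <RSE expression>` (newer client releases may use a different command layout); the scope, dataset name, and RSE name are hypothetical placeholders, not values from the testbed.

```python
import shlex


def build_rule_command(scope, name, copies, rse_expression):
    """Build (but do not run) a Rucio replication-rule command line.

    Assumes the classic `rucio add-rule DID COPIES RSE_EXPRESSION` syntax.
    """
    did = f"{scope}:{name}"  # a Rucio data identifier has the form scope:name
    return ["rucio", "add-rule", did, str(copies), rse_expression]


# Hypothetical scope, dataset, and destination RSE, for illustration only
cmd = build_rule_command("user.demo", "dataset_10pct", 1, "CERN_DTN")
print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the example inspectable; on a host with a configured Rucio client, the list could be passed to `subprocess.run`.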
ROBIN: Data Transfer Status
• ROBIN provides a visual view of data transfer status through its Web GUI.
Result 1: Transfer Speed, Rucio/FTS vs. ROBIN
[Bar charts: transfer speed (MB/s) per scenario]

Scenario       Rucio/FTS (MB/s)   ROBIN (MB/s)
20 GB          163.90             249.00
Dataset-10%    12.71              268.55
Dataset-20%    11.45              243.98
Dataset-40%    6.72               215.35
Linux 4.4.9    0.04               26.99
Result 2: Comparative Analysis, Rucio/FTS vs. ROBIN Transfer Speed
[Bar charts: transfer speed ratio (%), with Rucio/FTS as the 100% baseline]

Scenario       Rucio/FTS   ROBIN
20 GB          100%        152%
Dataset-10%    100%        2113%
Dataset-20%    100%        2131%
Dataset-40%    100%        3205%
Linux 4.4.9    100%        67475%

ROBIN outperforms Rucio/FTS significantly: by about 1.5x for the single 20 GB file, 21x to 32x for the mixed datasets, and about 675x for the Linux source tree.
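The ratios are the ROBIN transfer speed divided by the Rucio/FTS speed for the same scenario, expressed as a percentage. The sketch below recomputes them from the Result 1 speeds; the pairing of each speed with its platform and scenario is read off the charts, and the variable names are illustrative.

```python
# Transfer speeds in MB/s as (Rucio/FTS, ROBIN), read from the Result 1 charts
speeds = {
    "20 GB": (163.90, 249.00),
    "Dataset-10%": (12.71, 268.55),
    "Dataset-20%": (11.45, 243.98),
    "Dataset-40%": (6.72, 215.35),
    "Linux 4.4.9": (0.04, 26.99),
}


def ratio_percent(fts, robin):
    """ROBIN speed as a percentage of the Rucio/FTS baseline, rounded to an integer."""
    return round(robin / fts * 100)


ratios = {name: ratio_percent(fts, robin) for name, (fts, robin) in speeds.items()}
for name, pct in ratios.items():
    print(f"{name:12s} {pct}%")
```

The gap widens as the share of small files grows, and the Linux 4.4.9 ratio rests on a Rucio/FTS figure of 0.04 MB/s that is quoted to only one significant digit, so that ratio is the least precise of the five.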
Conclusion and Future Plans
• ROBIN outperforms Rucio/FTS significantly!
• Future plans
  • Continue to test/evaluate ROBIN
    • 100 Gbps international WAN paths
    • High-end DTNs
    • Multiple-site deployment
    • Increased automation
    • Enhanced parameter analytics
Questions?

Additional Information
[1] Rucio: https://rucio.cern.ch/
[2] BigData Express: http://bigdataexpress.fnal.gov
[3] SENSE: http://sense.es.net