XRootD Network Oriented Features
Software & Computing Round Table, April 6, 2021
Andrew Hanushevsky, SLAC
http://xrootd.org

What Is XRootD?
  • XRootD: a system for scalable cluster data access
    • xrootd: data access
    • cmsd: data clustering
  • Not a file system and not just for file systems
    • If you can write a plug-in, you can cluster it
    • E.g. used by Qserv for clustered MySQL

WYSIWYG Scalable Access
  • Task: route a client request from the top of the tree to a resource provider
  • Nodes arranged in a B-64 tree; resource providers are leaf nodes
  • Client open() is redirected: the request is routed to an alternate node exporting the same logical name
  • Exponentially parallel query for the logical endpoint name
  • Routing paths cached at each router node
  [Diagram: client open() redirected through a tree of xrootd/cmsd pairs. Labels: Manager Redirectors, Supervisor Redirectors, Resource Providers; 64^1 = 64, 64^2 = 4096]
  • Request routing is very different from traditional data management models
    • This implements a structured network of request routers (i.e. redirectors)
    • Capable of automatically recovering from adverse conditions
    • Much like internet routing, it essentially implements an NDN
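The fan-out arithmetic on this slide can be checked with a short sketch. It assumes only what the slide states (a tree with a fan-out of 64); the function names are hypothetical and are not part of XRootD.

```python
# Sketch: capacity of a 64-ary redirector tree (fan-out taken from the slide).
# Function names are illustrative only; they are not XRootD APIs.
FANOUT = 64


def nodes_at_depth(depth: int, fanout: int = FANOUT) -> int:
    """Number of nodes that can sit `depth` levels below the root."""
    return fanout ** depth


def depth_needed(providers: int, fanout: int = FANOUT) -> int:
    """Smallest depth whose level can hold `providers` leaf nodes."""
    depth = 0
    capacity = 1
    while capacity < providers:
        capacity *= fanout
        depth += 1
    return depth


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(f"depth {depth}: up to {nodes_at_depth(depth)} nodes")
```

Each extra level multiplies the number of reachable resource providers by 64, so the number of redirect hops grows only logarithmically with cluster size.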

Brief history of the last 20 years
  • 2001: BaBar decides to use the ROOT framework vs Objectivity
  • 2002: Collaboration with INFN, Padova & SLAC created
    • Design & develop a network-based high-performance data access system
    • In the days of limited network b/w and high expense
  • 2003: First deployment of XRootD system at SLAC
  • 2013: Wide deployment across most of HEP
    • Protocol also re-implemented (Java) in dCache
  • 2021: XRootD is now a popular internal framework
    • Supports http, https, and xroots as well as xroot protocol
    • Third-party software projects use it, leading to the moniker “XRootD Inside!”

The network oriented features
  • XRootD was developed for networks
  • The design goals were
    • Minimize bandwidth usage
      • Don’t send unnecessary data
    • Maximize bandwidth utilization
      • Optimally use what you have to the fullest extent
    • Work around network & server failures
      • Automatic recovery whenever possible (usually can)
    • Be flexible
      • Adapt to the ever-changing network configurations
  • Let’s see what we did

Network bandwidth usage I
  • Protocol has exceedingly low framing overhead
    • 24 bytes for a request and 8 bytes for a response
    • Application data is typically 99.99% of the packet
  • Does it really matter?
    • Depends on who you are and what you are doing
    • If you sell bandwidth it’s a lousy protocol
      • XRootD tries to minimize bandwidth waste
    • If you buy bandwidth it definitely matters
      • When doing random small-sized reads it likely matters
        • This is typical for many HEP/Astro analysis jobs
      • But when transferring multi-gigabyte files, not really
  • Protocol can easily fill a 100 Gb pipe in aggregate
    • xrootd server architecture favors aggregate performance
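To make the framing numbers concrete, here is a small worked calculation. It assumes that one read costs exactly one 24-byte request header plus one 8-byte response header, and it ignores TCP/IP headers and any other traffic; the function name is hypothetical.

```python
# Sketch: share of application data in one read round trip.
# Assumption: framing = one request header + one response header, nothing else.
REQUEST_HEADER_BYTES = 24   # from the slide
RESPONSE_HEADER_BYTES = 8   # from the slide


def payload_fraction(payload_bytes: int) -> float:
    """Fraction of bytes on the wire that are application data."""
    framing = REQUEST_HEADER_BYTES + RESPONSE_HEADER_BYTES
    return payload_bytes / (payload_bytes + framing)


if __name__ == "__main__":
    for size in (512, 4096, 65536, 1024 * 1024):
        print(f"{size:>8} byte read: {payload_fraction(size):.4%} payload")
```

Under these assumptions the payload share passes 99.99% once a read returns roughly 320 kB (9999 × 32 bytes), while small random reads carry proportionally more framing. That is consistent with the slide's point that overhead matters most for small reads.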

Network bandwidth usage II
  • Xcache may be used to further lower B/W usage
    • XRootD software component similar to Squid
    • Provides high performance multi-threaded disk file block caching
      • Something that Squid was not designed to do
    • Suitable for locales where data is reused
      • Typically analysis farms that fetch data over the WAN
    • Some sites have reported a 40% reduction of WAN usage
      • On average there is a 20% reduction in typical HEP use cases
  • Two factors in HEP make Xcache useful
    • Many applications only use 30-50% of a file
      • Xcache only transfers the part of the file that an application actually needs
    • Analysis jobs are rerun several times with different parameters
      • Much of the same data is needed in a subsequent run
    • If that is not your use case then Xcache won’t help
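The two effects described on this slide (partial-file transfer and reuse across reruns) can be illustrated with a toy block cache. This is a minimal sketch and not how Xcache is implemented: the class, the 4096-byte block size, the path, and the in-memory "origin" are all assumptions made for the example.

```python
# Toy block cache: fetch only the blocks a job touches, reuse them on reruns.
# All names and sizes here are hypothetical; this is not Xcache code.
from typing import Callable, Dict, Tuple

BLOCK_SIZE = 4096  # illustrative block size


class BlockCache:
    def __init__(self, fetch_block: Callable[[str, int], bytes]) -> None:
        self._fetch_block = fetch_block
        self._blocks: Dict[Tuple[str, int], bytes] = {}
        self.origin_bytes = 0  # bytes pulled from the (simulated) WAN origin

    def read(self, path: str, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        parts = []
        for index in range(first, last + 1):
            key = (path, index)
            if key not in self._blocks:          # cache miss: go to the origin
                data = self._fetch_block(path, index)
                self.origin_bytes += len(data)
                self._blocks[key] = data
            parts.append(self._blocks[key])
        start = offset - first * BLOCK_SIZE
        return b"".join(parts)[start:start + length]


# Simulated origin: one 10-block file held in memory.
ORIGIN = {"/data/file.root": bytes(range(256)) * 160}


def fetch_from_origin(path: str, index: int) -> bytes:
    start = index * BLOCK_SIZE
    return ORIGIN[path][start:start + BLOCK_SIZE]


cache = BlockCache(fetch_from_origin)
first_run = cache.read("/data/file.root", 0, 3 * BLOCK_SIZE)    # touches 3 of 10 blocks
after_first_run = cache.origin_bytes
second_run = cache.read("/data/file.root", 100, 2 * BLOCK_SIZE)  # rerun, same blocks
```

In this toy run only 3 of the 10 blocks ever cross the simulated WAN, and the second read is served entirely from the cache.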

Network bandwidth usage III
  • Xcache can be configured to better use LAN resources
    • This is specific to HPCs but the usual setup is as follows
  [Diagram: HPC cluster containing Xcache @ DTN, Lustre, and an analysis job. Link labels: slow TCP, fast RDMA]
    • File data read from internet in desired priority order
    • File data cached in Lustre
    • Aggressive prefetching enabled
    • Data delivered to job as soon as it arrives
    • Job redirected to Lustre when complete file is cached

Network bandwidth usage IV
  • XRootD protocol supports data compression
    • Legacy feature from Objectivity/DB era
    • Motivation was to minimize disk space
      • Objectivity/DB files could be compressed 20 to 50%
    • Only compressed data was sent over the network
      • Client would decompress the data
      • However, it also reduced bandwidth usage
  • ROOT files are already compressed so no gain in HEP
    • For certain other file formats it may still be useful
    • Protocol allows enabling the feature on a per-file basis
      • Could be restricted to certain file types
    • Additional client/server development would be needed
  • Feature has fallen by the wayside
    • Current servers and clients no longer implement this feature

Network bandwidth usage V
  • XRootD 5.x provides data-in-motion integrity
    • Driven by Xcache requirement to avoid caching dirty data
  • Each 4K block is protected by a CRC32C checksum
    • CRC32C was chosen largely because it is hardware assisted
    • However, it is also excellent for the chosen data size unit
  • Checksum errors are corrected on-the-fly
    • When reading, the client requests retransmission
    • When writing, the server requests retransmission
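The per-block protection described here can be sketched as follows. This is an illustration of the idea only: it uses a pure-Python table-driven CRC32C (Castagnoli polynomial) rather than hardware instructions, it assumes 4K means 4096 bytes, and it does not reproduce the XRootD wire format. The helper names are hypothetical.

```python
# Sketch: one CRC32C per 4096-byte block, then locate blocks needing retransmission.
# Pure-Python CRC32C for illustration; real implementations use CPU instructions.
from typing import List

BLOCK_SIZE = 4096  # assumption: "4K" on the slide means 4096 bytes


def _make_table() -> List[int]:
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # 0x82F63B78 is the bit-reflected Castagnoli polynomial
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for value in data:
        crc = _TABLE[(crc ^ value) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def block_checksums(data: bytes) -> List[int]:
    """One CRC32C per block; the last block may be short."""
    return [crc32c(data[i:i + BLOCK_SIZE]) for i in range(0, len(data), BLOCK_SIZE)]


def damaged_blocks(received: bytes, expected: List[int]) -> List[int]:
    """Indices of blocks whose checksum does not match what the sender computed."""
    actual = block_checksums(received)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Sender side: data plus per-block checksums.
original = bytes(i % 251 for i in range(5 * BLOCK_SIZE))
sent_checksums = block_checksums(original)

# Simulated transfer with a single flipped bit inside block 3.
corrupted = bytearray(original)
corrupted[3 * BLOCK_SIZE + 17] ^= 0x01
to_retransmit = damaged_blocks(bytes(corrupted), sent_checksums)

# "Retransmit" only the damaged blocks and verify again.
for index in to_retransmit:
    start = index * BLOCK_SIZE
    corrupted[start:start + BLOCK_SIZE] = original[start:start + BLOCK_SIZE]
remaining = damaged_blocks(bytes(corrupted), sent_checksums)
```

Because a CRC detects every single-bit error, the flipped bit is always pinned to its block, and only that block has to be sent again.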

Network bandwidth usage VI
  • Data-in-motion integrity
    • Potential to save significant network bandwidth
    • Copying a 100 GB file and getting a one-bit error?
      • Currently, requires retransmission of the complete file
      • With the new feature only the 4K block in error is retransmitted
    • How often do network checksum errors occur?
      • Hard to tell as that statistic is not collected
      • Informal observations indicate it’s variable and can be significant at times
        • Significant means impacting >1% of transfers in the period
      • Beware that checksum errors can be indistinguishable from disk errors
  • Data-at-rest integrity
    • Can configure XRootD to save network checksums
      • Data can be checked upon reading (Xcache)
      • Network checksum can be reused for transfers
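The saving in the 100 GB example is easy to quantify. The sketch below assumes 100 GB means 100 × 10^9 bytes and that a 4K block is 4096 bytes; both are interpretations of the slide, and the function names are hypothetical.

```python
# Worked arithmetic: bytes resent after one bit error, whole file vs. one block.
FILE_BYTES = 100 * 10**9   # assumption: 100 GB = 100e9 bytes
BLOCK_BYTES = 4096         # assumption: 4K = 4096 bytes


def resend_whole_file(file_bytes: int = FILE_BYTES) -> int:
    """Without per-block checksums the complete file is copied again."""
    return file_bytes


def resend_blocks(bad_blocks: int, block_bytes: int = BLOCK_BYTES) -> int:
    """With per-block checksums only the damaged blocks are resent."""
    return bad_blocks * block_bytes


if __name__ == "__main__":
    whole = resend_whole_file()
    partial = resend_blocks(1)
    print(f"whole file: {whole} bytes, one block: {partial} bytes")
    print(f"reduction factor: about {whole // partial:,}x")
```

Under these assumptions a single-bit error costs 4096 bytes of retransmission instead of 100 GB, a reduction of roughly 24 million times.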

Network bandwidth utilization
  • XRootD supports multiple data streams
    • An application may get up to 15 additional data streams
    • Useful for improving the speed of WAN file transfers
      • This has been well documented and is a way to mitigate TCP recovery of dropped packets
  • Multiple data streams are also used to mitigate the performance cost of TLS
    • This is a new feature in Release 5.x
    • The protocol naturally splits into control and data streams
      • Control stream is encrypted
      • Data stream is not encrypted
    • This is automatically handled for the application
      • Site requirements may force all data to be encrypted
        • This is negotiated between the client and server

Container orchestration support
  • XRootD supports container orchestration
    • Typical ones are Kubernetes (k8s) or Swarm
    • Both introduce issues for network clustered services
      • Virtual networking
        • IP address is arbitrary and can unpredictably change
        • Essentially, the IP address is no longer a useful management tool
      • Dynamic DNS
        • IP addresses are dynamically added and removed
        • Registration is essentially ephemeral
  • Supporting orchestration requires some rethinking
    • XRootD provides configurable options to address these issues

Virtual networking support
  • Virtual networks need virtual namespaces
    • XRootD implements such a namespace
    • Site assigns accessible resources relative unique names
      • Normally we think of a resource as a server but it’s no longer relevant
      • For file system based services it’s actually the file system
        • Any server can export any file system via orchestration
      • For non-data services (e.g. via SSI) it’s usually the server
    • Clustering component tracks resources by name, not IP address
      • It also makes sure that the xrootd-cmsd pair is consistent
        • That they are looking at the same file system, which might not be the case anymore
  • We do not recommend virtual networking due to overhead
    • Commercial cloud providers have substantially reduced the overhead
    • Open software solutions have not
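The idea of tracking resources by name rather than by IP address can be shown with a toy registry. This is a conceptual sketch only; the class, the resource name, and the addresses are invented for the example and do not reflect cmsd internals.

```python
# Toy registry keyed by a site-assigned resource name instead of an IP address.
# Hypothetical names throughout; not the cmsd implementation.
from typing import Dict, Optional


class ResourceRegistry:
    def __init__(self) -> None:
        self._address_by_name: Dict[str, str] = {}

    def register(self, name: str, address: str) -> None:
        """Record (or update) where the named resource is currently reachable."""
        self._address_by_name[name] = address

    def locate(self, name: str) -> Optional[str]:
        """Return the current address of the named resource, if registered."""
        return self._address_by_name.get(name)


registry = ResourceRegistry()
registry.register("fs-analysis-01", "10.0.0.17")   # container starts
registry.register("fs-analysis-01", "10.0.3.42")   # orchestrator reschedules it
current = registry.locate("fs-analysis-01")
```

When the orchestrator moves the container, the resource keeps its identity and only the address entry changes, so anything keyed on the name stays valid.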

Dynamic DNS support
  • DNS entries are now a spur-of-the-moment thing
    • Orchestration frameworks register IP address whenever
      • Registration can occur in any order irrespective of any other server
  • If you tell xrootd’s and cmsd’s that DNS is dynamic
    • Mitigation is enabled for delayed registration
      • This prevents failures that would normally be expected to occur in a real network
        • For instance, a non-registered service is configured
  • XRootD is very comfortable with the cloud
    • With containerization features sites have deployed cloud clusters

Other net oriented features
  • Full-fledged clustered proxy server support
    • Scalable load-sensitive mechanism to deal with firewalls
  • Configurable TCP keepalive support
    • Additionally, idle socket timeout with forced close
    • Addresses typical “close_wait” issues with certain VM clients
  • Full support for public/private IPv4/IPv6 networks
    • Site can optionally describe its IP address rules
    • Used by the clustering component to route requests
      • Automatic matching of compatible addresses for routing
    • Largely to accommodate HPC centers with unique networks
      • Can be used to minimize internal network hops
      • Allows use of a preferred interface when possible
      • Currently used at GSI, Darmstadt

Enhanced Write Support (backend)
  • Distributed write recovery
    • For systems that support it (e.g. EOS)
    • Eliminates full file retransmission upon error
      • Writes can proceed using another data server
        • Normally writes are tied to the server of 1st write
  • Part of XRootD file copy framework
    • Automatically extends to gfal and xrdcp

Performance Improvements
  • xrdcp
    • Simplify buffer management
    • Use kernel space buffers
      • Approximately 3-4x reduction in CPU usage
      • Up to a 40% increase in transfer speed
        • Depending on target device

XcacheH plug-in (coming soon)
  • Accessing Xcache origins using http[s]
    • Broadens data access reach
      • Oriented toward multi-discipline sites
    • Can be used as a Squid replacement
      • Better performance and scalability
  • Based on the plug-in by Radu Popescu
    • Formerly at CERN, now at Proton Tech AG
  • Further developed by Wei Yang, SLAC
  • Prototype being tested by ESNET & ESCAPE

Erasure coding client plug-in (coming soon)
  • Client-side plug-in to support EC writes
    • Based on Intel ISAL
      • Hardware accelerated encoding
    • Leverages XRootD pgWrite capability
      • Data-in-motion integrity with recoverability
  • Driven by ALICE requirements
    • Direct writes from the DAQ system to EOS
  • Developed by Michal Simon (CERN IT-ST-PDS)

Other developments (coming soon)
  • Improved Ceph plug-in
    • Addition of more features
      • Vector reads/writes
    • Being developed by RAL
  • Packet marking
    • Labeling purpose of data in network packets
      • IPv6 only
      • Won’t work in clouds or k8s clusters with virtual net
    • XRootD will be used as a demonstrator

Conclusion
  • XRootD is facile, flexible, and sound
    • Applicable to a wide variety of problems
    • Current release is 5.1.1
      • Next release 5.2.0 at the end of April
  • This talk was network focused
    • XRootD has many other features
    • Check out the web site xrootd.org
  • Questions?
Funding from US Department of Energy contract DE-AC02-76SF00515 with Stanford University