A scheme to implement local server computation on EOS system based on Xrootd plug-in

Speaker: Minxing Zhang, IHEP-CC, 2021/12/23

Motivation
• Large-scale high energy physics experiments will generate hundreds of PB of data
  - Need better I/O bandwidth
• The "storage wall" problem is caused by the separation of storage and computation in the classical von Neumann architecture
  - Need to move less data between storage and computing platforms
• A computational storage architecture can effectively reduce data movement
  - Reduces network transmission and cross-node communication

Related Studies
• Computational Storage
  - CSS: local computing services, at the software level
  - CSP: functions transparent to the upper layer, located on the hard drive
  - CSD: FPGA; matrix operations, encryption operations, etc.
  - CSA: the computational storage nodes that make up the array

System Design
• Traditional distributed systems use files for computing:
  - Three network communications
  - Two large file transmissions
• The same calculations performed using computational storage:
  - One network communication
  - No large file transmission

System Design
• Based on EOS and Xrootd implementations
• In general, it still looks like an EOS storage system, reading and writing data normally
• When the function needs to be called, the user appends a special parameter to the Open path
• The corresponding new file is then generated locally at the FST
[Diagram labels: Client, xrootd/http, CSS.MGM, MGM, NS, MQ, CSS.FST, FST, disk, CSD; links marked sync and async]

System Design
• The process of opening a file in EOS is generally as follows:
  - The client receives an access request and sends a query to the MGM.
  - The MGM then checks the metadata, finds the FST where the file resides, and returns a redirect to the client.
  - The client receives the redirect and sends its request to the FST where the file is located.
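The three steps of the open flow can be illustrated with a toy, in-memory model. This is only a sketch: the `Mgm` and `Fst` classes, the `client_open` function, and the example path are hypothetical stand-ins, not the EOS or Xrootd API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Fst:
    """Toy storage node that keeps file contents in memory (hypothetical)."""
    name: str
    files: Dict[str, bytes] = field(default_factory=dict)

    def open(self, path: str) -> bytes:
        # Step 3: the FST serves the file it stores.
        return self.files[path]


@dataclass
class Mgm:
    """Toy metadata server: maps each path to the name of the FST holding it."""
    namespace: Dict[str, str] = field(default_factory=dict)

    def locate(self, path: str) -> str:
        # Step 2: check the metadata and answer with the redirect target.
        return self.namespace[path]


def client_open(path: str, mgm: Mgm, fsts: Dict[str, Fst]) -> bytes:
    """Step 1: query the MGM, then follow the redirect to the FST."""
    target = mgm.locate(path)
    return fsts[target].open(path)


# Example path and node name are made up for illustration.
fst1 = Fst("fst1", {"/eos/demo/run1.raw": b"raw-bytes"})
mgm = Mgm({"/eos/demo/run1.raw": "fst1"})
data = client_open("/eos/demo/run1.raw", mgm, {"fst1": fst1})
```

The point of the model is that the MGM only answers with a location; the file contents are served by the FST that the client is redirected to.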

Implementation
• Based on the EOS file opening process, we modified the MGM and the FST respectively.
• CSS.MGM: When the “&CSS” flag appears in the file path passed by the client, the file name is restored to the name normally accessed, and the complete modified request information is passed on to the MGM.
• CSS.FST: When the “&CSS” flag appears in the file path passed by the client, CSS_Open is called to start the computable storage service. At the same time, the file path is modified, and the whole …
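The flag handling described on this slide can be sketched as follows. The slides name only the “&CSS” flag and CSS_Open; everything else here is an assumption made for illustration: that the flag is appended at the end of the path, that anything after it can be discarded, and the helper names `add_css_flag`, `strip_css_flag`, and `fst_open`. The `css_open` callback stands in for CSS_Open, whose real signature is not given in the source.

```python
from typing import Callable, Tuple

CSS_FLAG = "&CSS"  # flag string from the slides; its exact placement is assumed


def add_css_flag(path: str) -> str:
    """Client side (assumed form): append the flag to the path being opened."""
    return path + CSS_FLAG


def strip_css_flag(path: str) -> Tuple[str, bool]:
    """Return the normally accessed path and whether the flag was present."""
    index = path.find(CSS_FLAG)
    if index == -1:
        return path, False
    return path[:index], True


def fst_open(path: str, css_open: Callable[[str], None]) -> str:
    """FST side (sketch): start the computation when flagged, then use the clean path."""
    clean_path, wants_css = strip_css_flag(path)
    if wants_css:
        css_open(clean_path)  # hypothetical stand-in for CSS_Open
    return clean_path


started = []
opened = fst_open(add_css_flag("/eos/demo/run1.raw"), started.append)
```

A path without the flag passes through `fst_open` unchanged and never triggers the callback, which matches the statement that the system otherwise behaves like a normal EOS storage system.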

Implementation
• CSS.MGM
• CSS.FST

Implementation
• CSS.FST:

Implementation
• The main functionality of the CSS described above is encapsulated in the XrdCssOfsFile class.
• To allow flexible deployment and independent updates, we built a link library, libEosFstCss.so
  - Create EOS/CSS/CMakeLists.txt
    - Create an EosFstCss module
  - Then modify EOS/FST/CMakeLists.txt
    - add_subdirectory(CSS)
    - Add EosFstCss to target_link_libraries of XrdEosFst and XrdEosFst-static

Results
• We tested it on a server with SSD and HDD

  CPU                     32 Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
  Network interface card  1 Gbps
  Storage device          SSD (SATA 3.2, 6.0 Gb/s), HDD (SATA 3.2, 2.1 Gb/s)
  EOS version             4.7.7
  Raw data file size      953.7 MB

• Decode: the computational function used for testing, an I/O-intensive task
• We compared decoding the raw data in traditional mode, via remote communication, with decoding the raw data using CSS.
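To give a feel for what one large file transmission costs in this setup, the following sketch computes an idealized lower bound for moving the raw data file over the 1 Gbps network interface. It assumes 1 MB = 10^6 bytes and ignores protocol overhead, disk speed, and contention, so it is not a measurement from the tests; the function name `transfer_seconds` is made up for this example.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to push size_bytes through a link, with no overhead."""
    return size_bytes * 8 / link_bits_per_second


FILE_BYTES = 953.7e6  # raw data file size from the slide, assuming 1 MB = 10**6 bytes
LINK_BPS = 1e9        # 1 Gbps network interface card from the slide

one_transfer = transfer_seconds(FILE_BYTES, LINK_BPS)
print(f"{one_transfer:.2f} s per transfer")  # prints "7.63 s per transfer"
```

In traditional mode each job pays at least this cost for every large file it moves over the network, and concurrent jobs share the same link. With CSS the file stays on the FST, so this term drops out.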

Results
In the figures below, the abscissa is the number of parallel programs (1 to 30), and the ordinate is the final completion time of all programs.
[Figure: Decode Test, CSS on SSD vs. the traditional model. x-axis: number of concurrent programs; y-axis: time (s)]
[Figure: Decode Test, CSS on HDD vs. the traditional model. x-axis: number of concurrent programs; y-axis: time (s)]
• The lower the final completion time, the higher the computational efficiency of the mode.
• The smaller the change in the final completion time, the stronger the parallel capability of the mode.

Conclusion and Outlook
• Computational storage architectures accelerate I/O-intensive computations well and can increase the number of parallel tasks.
• However, we still need to modify the EOS source code. The next step is to fully encapsulate CSS functionality as plug-ins, enabling plug-and-play.
• Extending the types of CSS provided is also a goal for future work.

Thank You