Parallel Detectors: The Excalibur and Percival detector system use-cases
Ulrik Kofoed Pedersen, Senior Software Engineer, Controls Development Team
EPICS areaDetector Workshop, DLS, October 2014
Overview
• Excalibur
  – Parallel Detector
  – Implementation
• HDF5
  – Single Writer Multiple Readers (SWMR)
  – Virtual Dataset
• Percival
  – The Collaboration
  – Software Delivery
  – Detector System Overview
  – Data Readout
  – Live processing
  – Software Architecture
December 2021
Excalibur
[Diagram labels: Detector head; Sensor/Medipix3 Hybrids; FEM; Optical Links; Readout Node; Master Node; 10 GigE Network]
• Readout nodes all write in parallel
• Requires writing the 6 data streams into a single file
• We decided to use parallel HDF5 (pHDF5)
• Parallel file system write performance is highly dependent on 1 MB aligned write operations
• The requirement to write blank pixels, representing the gaps between modules, completely wrecks performance
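The alignment point above can be illustrated with a little arithmetic. This is only a sketch: it assumes "1 MB" means 1 MiB (2**20 bytes), and the helper names are hypothetical and not taken from the Excalibur software. One plausible reading of the blank-pixel problem is that inserting gap pixels shifts the offset and length of each write away from these boundaries.

```python
# Sketch only: ALIGNMENT assumes "1 MB" means 1 MiB; helper names are hypothetical.
ALIGNMENT = 1 << 20


def is_aligned(offset, nbytes, alignment=ALIGNMENT):
    """True if a write starts on an alignment boundary and covers whole blocks."""
    return offset % alignment == 0 and nbytes % alignment == 0


def padded_size(nbytes, alignment=ALIGNMENT):
    """Smallest multiple of `alignment` that is >= nbytes."""
    return -(-nbytes // alignment) * alignment
```

For example, a write of 2 MiB at offset 0 is aligned, while the same write shifted by 512 bytes of gap pixels is not.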
Excalibur Implementation
• Each compute node runs an areaDetector IOC
• The master node runs a master areaDetector IOC
• The master IOC synchronises the configuration and state of the node IOCs using database logic and SNL
• Synchronisation logic: caput_callback or not? Timeout? PV disconnect?
• The HDF5 writer is a parallel service (not built into the IOC)
• The adtransfer plugin moves data off the IOCs, pushing it to the parallel HDF5 writer
• pHDF5 does not work well when the writers are out of sync
• Lustre writing is always quite a bit out of sync
• The entire system is brittle and fault-finding is only possible by experts
• Thousands of PVs and way more EDM screens than you can fit on even the largest tiled monitors
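The open questions about put-with-callback and timeouts can be made concrete with a small simulation. This sketch does not use EPICS at all: `put_with_callback` is a hypothetical stand-in for a put that only completes once the node has processed it, and the node names and delays are invented for illustration.

```python
# Simulation only: no EPICS calls; all names and delays are hypothetical.
import time
from concurrent.futures import ThreadPoolExecutor, wait


def put_with_callback(delay_s, value):
    """Stand-in for a put that returns only after the node has processed it."""
    time.sleep(delay_s)
    return value


def synchronise(node_delays, value, timeout_s):
    """Send `value` to every node; report which nodes confirmed within the timeout."""
    pool = ThreadPoolExecutor(max_workers=len(node_delays))
    futures = {name: pool.submit(put_with_callback, delay, value)
               for name, delay in node_delays.items()}
    done, _ = wait(list(futures.values()), timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block the master on slow nodes
    confirmed = sorted(name for name, fut in futures.items() if fut in done)
    timed_out = sorted(name for name, fut in futures.items() if fut not in done)
    return confirmed, timed_out
```

The point of the sketch is the design question itself: with a timeout, the master has to decide what to do about nodes that never confirm, which is where much of the brittleness described above comes from.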
Single Writer Multiple Reader (SWMR)
• High speed detectors write large files to avoid the overhead of lots of small files.
• However, this means that data processing can't start until a large amount of data has been generated.
• SWMR addresses this with extensions to the HDF5 file format to allow readers to have a coherent view of the file even as it is updated.
• A prototype HDF5 library with SWMR is currently under test at Diamond
• Requires a fundamental change to the HDF5 file format as well as the library
• Due to be released as part of HDF5 version 1.10
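The writer/reader pattern can be sketched with the h5py Python bindings, assuming an HDF5 1.10 or later build with SWMR support. The slides do not name a language binding, so h5py, the dataset name `data` and the function names are all assumptions made for illustration.

```python
# Sketch assuming h5py built against HDF5 >= 1.10; names are hypothetical.
import h5py
import numpy as np


def write_frames(path, frames):
    """Append frames one at a time, flushing so SWMR readers can see each one."""
    shape = frames[0].shape
    with h5py.File(path, "w", libver="latest") as f:
        dset = f.create_dataset("data", shape=(0,) + shape,
                                maxshape=(None,) + shape,
                                chunks=(1,) + shape, dtype=frames[0].dtype)
        f.swmr_mode = True  # from here on, readers may open the file
        for i, frame in enumerate(frames):
            dset.resize(i + 1, axis=0)
            dset[i] = frame
            dset.flush()


def read_available(path):
    """Open in SWMR read mode and return whatever frames are visible so far."""
    with h5py.File(path, "r", libver="latest", swmr=True) as f:
        dset = f["data"]
        dset.refresh()  # pick up the writer's latest flushed extent
        return dset[...]
```

In real use the writer and the readers would run in separate processes, with each reader calling `refresh()` periodically while the writer is still appending.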
HDF5 Virtual Dataset
• Parent dataset in VDS.h5 composed of data mapped from datasets in 6 subordinate files.
• Subordinate datasets can be
  – Written independently and in parallel.
  – Compressed and chunked independently
• Parent dataset can be read as normal.
• Eliminates our need for pHDF5
• Being funded by Diamond, the Percival project and DESY
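The mapping from subordinate files to a parent dataset can be sketched with the virtual dataset API in h5py (HDF5 1.10 or later). The file names, the dataset name `data` and the assumption that each subordinate file holds one horizontal stripe of every frame are illustrative only and are not taken from the slides.

```python
# Sketch assuming h5py with VDS support; paths, names and layout are hypothetical.
import h5py
import numpy as np


def write_stripe(path, stripe):
    """Write one subordinate file holding a stack of stripes (frames, rows, cols)."""
    with h5py.File(path, "w", libver="latest") as f:
        f.create_dataset("data", data=stripe)


def build_vds(vds_path, source_paths, n_frames, stripe_rows, cols, dtype):
    """Create a parent dataset that stacks the stripes vertically in each frame."""
    layout = h5py.VirtualLayout(
        shape=(n_frames, stripe_rows * len(source_paths), cols), dtype=dtype)
    for i, src in enumerate(source_paths):
        vsource = h5py.VirtualSource(src, "data",
                                     shape=(n_frames, stripe_rows, cols))
        layout[:, i * stripe_rows:(i + 1) * stripe_rows, :] = vsource
    with h5py.File(vds_path, "w", libver="latest") as f:
        f.create_virtual_dataset("data", layout, fillvalue=0)
```

Each subordinate file can be written by a different node with no coordination between writers; a reader then opens only the parent file and slices `data` as if it were an ordinary dataset.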
Percival Collaboration
• Collaboration by DESY, Elettra, Diamond, STFC (contractor)
• DESY: synchrotron and FEL, with different control systems
• Work packages:
• DESY: data readout & transfer electronics and FW
• Elettra: control electronics and FW
• STFC: chip design
• Diamond: software
Percival Software Delivery
• Control system independent - provide integration API
• Hide complexity of multiple nodes from user
• Provide detector engineer control requirements first
• Must support development tasks
• Engineers to monitor, control and scan low-level detector ADC/DACs
• End users: start/stop, scan integration with beamline equipment
• Cut-down system to run on workstation in lab (i.e. scalable from 1 to N(=8) nodes)
Percival System
• High resolution detector
• 3520 x 3700 pixels (P13M) @ 16 bpp
• Framerate: 120 Hz
• Aggregate datarate (incl. reset frames): 6 GB/s
• Intermediate prototype version to be produced first: P2M
• Frames cycle round the processing nodes in order
• Each compute node only "sees" full frames (temporal mode)
• Nodes can operate independently of each other
[Diagram labels: 8 reference columns; 7 reference rows; sensor blocks labelled 704 x 742 and 8 x 742; Mezzanine Boards (FPGA); 10 Gbps Ethernet; deep-buffer Ethernet switch; Processing Nodes]
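The quoted aggregate data rate can be checked from the figures above. The sketch assumes that "incl. reset frames" means one reset frame accompanies every image frame; that factor of two is an assumption, not something the slide states. The node count of 8 is the N(=8) quoted on the Software Delivery slide.

```python
# Worked arithmetic; FRAMES_PER_EXPOSURE = 2 is an assumption (image + reset).
WIDTH, HEIGHT = 3520, 3700
BYTES_PER_PIXEL = 2          # 16 bpp
FRAME_RATE_HZ = 120
FRAMES_PER_EXPOSURE = 2      # assumed: one reset frame per image frame


def frame_bytes():
    """Size of one full frame in bytes."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL


def aggregate_rate():
    """Total bytes per second leaving the detector, reset frames included."""
    return frame_bytes() * FRAME_RATE_HZ * FRAMES_PER_EXPOSURE


def per_node_rate(n_nodes=8):
    """Average bytes per second each processing node must absorb."""
    return aggregate_rate() / n_nodes
```

This gives 26,048,000 bytes per frame and 6,251,520,000 bytes/s in total, consistent with the quoted 6 GB/s. Spread over 8 nodes that is about 0.78 GB/s (roughly 6.25 Gbit/s) per node on average.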
Percival Data Readout
• Each processing node handles the packets for a whole frame.
• Frames cycle round the processing nodes in order
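The round-robin distribution described here can be sketched as a modulo mapping from frame number to node. The function names and zero-based numbering are assumptions made for illustration; the slides do not say how the real system numbers frames or nodes.

```python
# Sketch of round-robin frame distribution; names and numbering are hypothetical.
def node_for_frame(frame_number, n_nodes):
    """Index of the processing node that receives a given frame."""
    return frame_number % n_nodes


def frames_for_node(node_index, n_nodes, n_frames):
    """All frame numbers below n_frames that land on one node."""
    return list(range(node_index, n_frames, n_nodes))
```

With 8 nodes, node 1 receives frames 1, 9, 17 and so on, so each node sees every eighth frame in full and never has to exchange partial frames with its neighbours.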
Percival Live Processing
• Live processing requirements include
• ROI and binning (optional)
• ADC coarse, ADC fine and gain decoding
• ADC correction
• Common Mode Averaging (optional)
• Gain multiplication (and ADU to electron conversion)
• CDS subtraction
• Dark and flat field corrections
• Compression (not defined yet)
• HDF5 file writing
• Live streaming viewer (reduced rate and/or ROI)
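A few of the steps in this list (gain multiplication, CDS subtraction, dark and flat field corrections) can be sketched with NumPy. The formulas are the textbook forms of each correction and the function name is hypothetical; the slides do not define the actual Percival algorithms, and ADC decoding is left out because its bit layout is not given.

```python
# Sketch using textbook corrections; not the actual Percival processing chain.
import numpy as np


def correct_frame(image, reset, gain, dark, flat):
    """Apply gain, CDS subtraction, dark subtraction and flat-field division.

    `gain` may be a scalar or a per-pixel array; `dark` and `flat` are
    per-pixel arrays already expressed in the same units as the output.
    """
    image_e = image.astype(np.float32) * gain   # ADU -> electrons
    reset_e = reset.astype(np.float32) * gain
    cds = image_e - reset_e                     # correlated double sampling
    return (cds - dark) / flat                  # dark, then flat field
```

Because every node receives whole frames, a function like this can run on each node independently, one frame at a time.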
Percival Software Architecture
Percival Initial SW Delivery
[Diagram labels: Control Board; Mezzanine Board; Deep Buffer Switch; Control Lib (python); Data lib (python); User Interface (python objects); CLI; dawn; Data Receiver; Data Processing; HDF5 Writer (VDS); Central Storage]
Percival Planned SW Delivery
[Diagram labels: Control Board; Mezzanine Board (x2); Deep Buffer Switch; Control Lib (C++); Data lib (C++); Data Receiver; Data Processing; Control SDK (C++); Control System areaDetector/Lima; User Interface (python objects); CLI; HDF5 Writer (VDS); dawn; Central Storage]