The LHCb Experiment Control System Clara Gaspar, May 2016
Control System Scope
❚ In charge of the Control and Monitoring of all areas of the experiment
[Diagram: the Experiment Control System spans Detector & General Infrastructure (Power, Gas, Cooling, etc.), Detector Channels, Trigger, TFC, Front End Electronics, Readout Boards, High Level Trigger (Farm), Storage, Monitoring, DAQ, and External Systems (LHC, Technical Services, Safety, etc.)]
Control System Architecture
[Diagram: control tree. Commands flow down from the ECS; Status & Alarms flow up.]
ECS
  INFR.
  DCS
    SubDet1 DCS
      SubDet1 LV: LV Dev1, LV Dev2, …, LV DevN
      SubDet1 TEMP
      SubDet1 GAS
    SubDet2 DCS
    …
    SubDetN DCS
  TFC
  DAQ
    SubDet1 DAQ
      SubDet1 FEE: FEE Dev1, FEE Dev2, …, FEE DevN
      SubDet1 RO
    SubDet2 DAQ
    …
    SubDetN DAQ
  HLT
  LHC
Legend: Control Unit (internal nodes), Device Unit (leaves such as LV Dev1 or FEE Dev1)
LHC Experiments: JCOP
❚ The JCOP Framework is based on:
  ❙ SCADA System – WinCC-OA for:
    ❘ Device Description (Run-time Database)
    ❘ Device Access (OPC, Profibus, drivers)
    ❘ Alarm Handling (Generation, Filtering, Masking, etc.)
    ❘ Archiving, Logging, Scripting, Trending
    ❘ User Interface Builder
    ❘ Alarm Display, Access Control, etc.
  ❙ SMI++ providing:
    ❘ Abstract behavior modeling (Finite State Machines)
    ❘ Automation & Error Recovery (Rule based system)
[Figure labels: Device Units, Control Units]
Device Units
❚ Provide access to “real” devices:
  ❙ The Framework provides (among others):
    ❘ “Plug and play” modules for commonly used equipment. For example:
      • CAEN or Wiener power supplies (via OPC)
      • LHCb CCPC and SPECS based electronics (via DIM)
    ❘ A protocol (DIM) for interfacing “home made” devices. For example:
      • Hardware devices like a calibration source
      • Software devices like the Trigger processes (based on LHCb’s offline framework – GAUDI)
    ❘ Each device is modeled as a Finite State Machine
Hierarchical control
❚ Each Control Unit:
  ❙ Is defined as one or more Finite State Machines
  ❙ Can implement rules based on its children’s states
  ❙ In general it is able to:
    ❘ Summarize information (for the levels above)
    ❘ “Expand” actions (to the lower levels)
    ❘ Implement specific behaviour & take local decisions
      • Sequence & automate operations
      • Recover errors
    ❘ Include/Exclude children (i.e. partitioning)
      • Excluded nodes can run in stand-alone mode
    ❘ User Interfacing
      • Present information and receive commands
[Diagram: example DCS sub-tree with Tracker and Muon nodes; the Muon children include LV, …, GAS]
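The Control Unit behaviour listed above (summarizing children's states, expanding commands downwards, including and excluding children) can be sketched in a few lines of Python. This is only an illustration, not SMI++ or JCOP code: the class names, the state names OFF/ON/ERROR, the summary rule, and the extra MIXED state are all assumptions made for the example.

```python
class DeviceUnit:
    """Hypothetical leaf node: a device with a trivial OFF/ON/ERROR behaviour."""

    def __init__(self, name, state="OFF"):
        self.name = name
        self.state = state
        self.included = True

    def command(self, action):
        # Assumption: a device in ERROR only reacts to Recover.
        if self.state == "ERROR":
            if action == "Recover":
                self.state = "OFF"
        elif action == "Switch_ON":
            self.state = "ON"
        elif action == "Switch_OFF":
            self.state = "OFF"


class ControlUnit:
    """Hypothetical inner node: derives its state from its included children."""

    def __init__(self, name, children):
        self.name = name
        self.children = list(children)
        self.included = True

    def active_children(self):
        return [c for c in self.children if c.included]

    @property
    def state(self):
        # "Summarize information": an assumed rule where ERROR dominates.
        states = {c.state for c in self.active_children()}
        if "ERROR" in states:
            return "ERROR"
        if states == {"ON"}:
            return "ON"
        if states <= {"OFF"}:  # also true when no child is included
            return "OFF"
        return "MIXED"  # assumed name for a partially switched-on subtree

    def command(self, action):
        # "Expand actions": forward the command to every included child.
        for child in self.active_children():
            child.command(action)

    def exclude(self, child_name):
        # Partitioning: the child stays usable on its own, but is ignored here.
        self._set_included(child_name, False)

    def include(self, child_name):
        self._set_included(child_name, True)

    def _set_included(self, child_name, flag):
        for child in self.children:
            if child.name == child_name:
                child.included = flag
                return
        raise KeyError(child_name)
```

With a tree such as `ControlUnit("DCS", [tracker, muon])`, excluding `"Muon DCS"` makes the parent ignore the Muon branch when computing its state, while commands sent directly to the `muon` object still work, which mirrors the stand-alone operation of excluded nodes.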
Hierarchical Control Tools
❚ Build FSM hierarchy across different machines
❚ Dynamically generated Operation UIs
  ❙ Embedded Partitioning: Include, Exclude, etc.
[Diagram: example FSM with states OFF, ON, ERROR and actions Switch_ON, Switch_OFF, Recover]
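The example FSM on this slide (states OFF, ON, ERROR; actions Switch_ON, Switch_OFF, Recover) can be written as a small transition table. The sketch below is plain Python rather than the framework's own FSM tooling; the class name, the `fault` helper, and the choice that Recover leads back to OFF are assumptions, since the arrows of the diagram are not recoverable from the text.

```python
class DeviceFSM:
    """Toy finite state machine for a single device (illustrative only)."""

    # (current state, action) -> next state
    TRANSITIONS = {
        ("OFF", "Switch_ON"): "ON",
        ("ON", "Switch_OFF"): "OFF",
        ("ERROR", "Recover"): "OFF",  # assumption: Recover returns to OFF
    }

    def __init__(self, name, state="OFF"):
        self.name = name
        self.state = state

    def allowed_actions(self):
        """Actions that are valid in the current state, in sorted order."""
        return sorted(a for (s, a) in self.TRANSITIONS if s == self.state)

    def do(self, action):
        """Apply an action, or raise ValueError if it is not allowed."""
        key = (self.state, action)
        if key not in self.TRANSITIONS:
            raise ValueError(
                f"{self.name}: action {action!r} not allowed in state {self.state!r}"
            )
        self.state = self.TRANSITIONS[key]
        return self.state

    def fault(self):
        """Hypothetical helper: simulate the hardware reporting a fault."""
        self.state = "ERROR"


if __name__ == "__main__":
    dev = DeviceFSM("lv_dev_1")
    print(dev.state, dev.allowed_actions())
    dev.do("Switch_ON")
    dev.fault()
    print(dev.state, dev.allowed_actions())
```

Keeping the transitions in a table means an operation panel could list only the actions returned by `allowed_actions()`, which is one plausible way to generate a UI from an FSM description.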
LHCb Operations
❚ Main Tools:
  ❙ RunControl
    ❘ Handles the DAQ & Dataflow
    ❘ Allows the operator to:
      • Configure the system
      • Start & Stop runs
  ❙ AutoPilot
    ❘ Knows how to start and keep a run going from any state
  ❙ BigBrother
    ❘ Based on the LHC state, controls:
      • SD Voltages
      • VELO Closure
      • RunControl
  ❙ AlarmScreen
Run Control
❚ Matrix: Domain x Sub-detector
❚ Activity: used to configure all sub-systems
Alarm Screen
Other Monitoring Tools
ECS: Some numbers
❚ Size of the Control Tree:
  ❙ Distributed over ~200 PCs
    ❘ Mostly Linux (VMs)
    ❘ Some Windows
  ❙ >8000 Control Units
  ❙ >50000 Device Units
❚ Run Control Timing
  ❙ Cold Start to Running: 4 minutes
    ❘ Configure all Sub-detectors, Start & Configure ~50000 HLT processes (always done well before PHYSICS)
  ❙ Stop/Start Run: 6 seconds
[Diagram: control tree with ECS at the top; nodes HV, DCS, TFC, DAQ, HLT, LHC; SubDet1 DCS … SubDetN DCS; SubDet1 DAQ … SubDetN DAQ]
Conclusions
❚ LHCb has designed and implemented a coherent and homogeneous control system
❚ The complete experiment:
  ❙ Is operated by only 1 person
  ❙ Is almost completely automated (basically only confirmations from the Operator)
❚ Thanks to the use of the JCOP Framework (and its many features, tools and components):
  ❙ The manpower needs are very low (and can be shared between sub-systems)
  ❙ The development time is quite short
  ❙ Training and support are available, and there is a large user community
Backup
FSM Operation Domains
❚ DCS Domain
  ❙ Equipment’s operation related to a running period (Ex: GAS, Cooling)
❚ HV Domain
  ❙ Equipment’s operation related to the LHC State (Ex: High Voltages)
❚ DAQ Domain
  ❙ Equipment’s operation related to a “RUN” (Ex: RO board, HLT process)
❚ LHCb FSM Templates
  ❙ Provided to all Sub-systems
[State diagram labels, DCS and HV: states OFF, NOT_READY, READY, ERROR, plus (HV) RAMPING_STANDBY1, STANDBY1, RAMPING_STANDBY2, STANDBY2, RAMPING_READY; actions Switch_ON, Switch_OFF, Recover, plus (HV) Go_STANDBY1, Go_STANDBY2, Go_READY]
[State diagram labels, DAQ: states UNKNOWN, NOT_READY, CONFIGURING, READY, RUNNING, ERROR; actions Configure, Start, Stop, Reset, Recover]
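As a rough illustration of how a domain template can drive automation, the DAQ domain can be encoded as a transition table and searched for the shortest command sequence between two states. The state and action names come from the slide; the arrows are an assumption (they are not recoverable from the text), CONFIGURING is treated as a transient step inside Configure, UNKNOWN is given no outgoing command, and `commands_to_reach` is a hypothetical helper, not part of any LHCb tool.

```python
from collections import deque

# Assumed DAQ-domain transitions: (state, command) -> resulting stable state.
# Configure is assumed to pass through the transient CONFIGURING state.
DAQ_TRANSITIONS = {
    ("NOT_READY", "Configure"): "READY",
    ("READY", "Start"): "RUNNING",
    ("RUNNING", "Stop"): "READY",
    ("READY", "Reset"): "NOT_READY",
    ("ERROR", "Recover"): "NOT_READY",
}


def commands_to_reach(start, goal, transitions=DAQ_TRANSITIONS):
    """Breadth-first search for the shortest command list from start to goal.

    Returns a list of commands (empty if start == goal), or None when the
    goal cannot be reached with the given transition table.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for (src, command), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [command]))
    return None


if __name__ == "__main__":
    for start in ("ERROR", "NOT_READY", "READY", "RUNNING", "UNKNOWN"):
        print(start, "->", commands_to_reach(start, "RUNNING"))
```

Because every sub-system uses the same template, one table of this kind would be enough to plan commands for any node of the DAQ domain; how the real tools decide on their command sequences is not described in the slides.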