STAMP Analysis of the HL-LHC Inner Triplet Protection

STAMP Analysis of the HL-LHC Inner Triplet Protection
TE-MPE-TM #100, Dennis Hügle, October 2017
(Slide footer date: 30/09/2020)

Agenda
• Introduction and Task
• Methodology
  • Method Structure/Terminologies
  • Guidance through the method
• Example for Results
• Conclusions
  • Looking Forward
  • Final thoughts on STPA

Introduction and Task
[Figure: layout of the inner triplets (Q3, Q2, Q1 and Q1, Q2, Q3) around the ATLAS and CMS experiments, with Beam 1 and Beam 2. Image source: home.cern.ch]

Introduction and Task
• Protection based on:
  • Coupling-Loss Induced Quench (CLIQ)
  • Quench Heaters
• CLIQ (novel protection technology):
  • On command, a charged capacitor discharges an oscillating current into the magnet circuit
  • Creates losses in the coil
  • Losses heat up the magnet homogeneously
• Quench Heaters:
  • Normal-conducting strips
  • Implemented in the coils
  • In case of a quench: a capacitor discharges into the resistive heater circuit
  • Creates ohmic heat
  • Force-quenches the magnet
[Circuit diagram labels: PC, CLIQ, C, UC, Th. FW, Th. BW, RL1, RL2, L1, L2, L3, L4]
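The oscillating capacitor discharge mentioned for CLIQ can be illustrated with an idealized series RLC circuit. This is only a sketch: the real CLIQ circuit and magnet are considerably more complex, and the function name and all component values below are hypothetical illustration values, not HL-LHC parameters.

```python
import math

def rlc_discharge_current(t, u0, c, l, r):
    """Current of an underdamped series RLC circuit, capacitor charged to u0, switch closed at t = 0."""
    alpha = r / (2.0 * l)          # damping rate [1/s]
    omega0_sq = 1.0 / (l * c)      # undamped angular frequency squared [rad^2/s^2]
    if alpha ** 2 >= omega0_sq:
        raise ValueError("circuit is not underdamped, so the current does not oscillate")
    omega_d = math.sqrt(omega0_sq - alpha ** 2)  # damped angular frequency [rad/s]
    return u0 / (omega_d * l) * math.exp(-alpha * t) * math.sin(omega_d * t)

# Hypothetical values for illustration only (V, F, H, ohm).
U0, C, L, R = 500.0, 0.04, 0.01, 0.05

for t_ms in (0, 10, 30, 60, 90):
    current = rlc_discharge_current(t_ms / 1000.0, U0, C, L, R)
    print(f"t = {t_ms:3d} ms  i = {current:8.1f} A")
```

The sign of the current alternates while its amplitude decays, which is the behaviour the slide refers to as an oscillating current.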

Methodology
• Method used: System-Theoretic Process Analysis (STPA)
• Built on the STAMP*1) accident model:
  • Introduced by Nancy Leveson at MIT, 2012
  • Idea: accidents happen when the system is not in a safe state
  • Therefore, the system must be designed to drift back to a safe state in case of a failure
• Top-down approach to system safety
*1) System-Theoretic Accident Model and Processes

Method Structure/Terminologies
• Hazards + Environmental Conditions = Accident
• Hazard: derive hazards from accidents
• System Control Structure: derive the system with its control actions
• Unsafe Control Action (UCA): find out how control actions can turn unsafe; UCAs lead to hazardous system states
• Scenarios and causal factors: find out how unsafe control actions can be provided; UCAs are given because of scenarios and causal factors
• The level of detail increases with each step
[Example control structure, labels: Wished Tire Angle, Wished Momentum, Driver, Gas Pedal, Steering Wheel, Sight, Speedometer, Car, Provide Changes to Momentum and Angle, Velocity, Vibration, Environment, Street]
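As a minimal sketch of the step "find out how control actions can turn unsafe", the driver example can be crossed with the four generic ways in which STPA classifies a control action as unsafe. The two control inputs are taken from the slide's diagram labels; the function and variable names are hypothetical, and each generated candidate still has to be judged by an analyst against the hazards.

```python
from itertools import product

# Control inputs the driver applies to the car in the slide's example.
CONTROL_ACTIONS = ["Gas Pedal", "Steering Wheel"]

# The four generic ways a control action can be unsafe in STPA.
UCA_TYPES = [
    "not provided when needed",
    "provided when it causes a hazard",
    "provided too early, too late, or out of order",
    "stopped too soon or applied too long",
]

def candidate_ucas(actions, uca_types):
    """Return every (action, type) combination as a candidate for review."""
    return [f"{action}: {kind}" for action, kind in product(actions, uca_types)]

for line in candidate_ucas(CONTROL_ACTIONS, UCA_TYPES):
    print(line)
```

Enumerating the combinations first and filtering afterwards keeps the search systematic, which is the point of the top-down approach.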

Methodology
[Figure: control structure of the inner triplet protection, with numbered markers 1 to 6 and lettered markers A to E]
• Components: CERN Personnel, CCC Operators, Experts, Beam Interlock System, Powering Interlock System, Power Converters, Quench Protection System, Cryogenic System, UPS Powering (F3, F4), UPS Powering for Controls (F3), Inner Triplet Magnets (Q1, Q2, Q3), LHC Beam
• Control actions and feedback shown: Provide Beam_Info, Provide System Status, Request Beam Dump, Provide BIS Status, Provide PIC Beam Permit, Powering Permit, Req. Fast Power Abort, Provide QPS Status, Provide PC Status, Provide Power, Abort Power, Provide Cryogenic Helium, Send Powering Failure (to PIC), Discharge Request, Cryo_Start, Cryo_Maintain, Feedback, Measure Coil Voltages, Current Change..., Fire CLIQ/QH Units, Provide Magnetic Focussing Field, Change Parameters

Methodology
• Hazards + Environmental Conditions = Accident
• Hazard: derive hazards from accidents
• System Control Structure: derive the system with its control actions
• Unsafe Control Action (UCA): find out how control actions can turn unsafe; UCAs lead to hazardous system states
• Scenarios and causal factors: find out how unsafe control actions can be provided; UCAs are given because of scenarios and causal factors
• The level of detail increases with each step

Methodology
Control Action: CLIQ / QH Units fired when...

Powering Level | Magnet Status | Beam Presence | Hazardous? | Notes
Flat-Top | Quench | Not present | No | Nominal protection
Squeeze | No Quench | Present | Yes, beam is present! | e.g. spurious firing of single units during squeeze-level powering
Injection | No Quench | Present | Yes, beam is present! | e.g. spurious firing of single units during injection-level powering
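A context table like this one can be generated by enumerating every combination of the context variables and applying a hazard rule to each row. The sketch below is illustrative only: the rule "firing is hazardous whenever beam is present" is an assumption read off the notes column of the three rows shown, the remaining nine rows it produces are not taken from the slide, and all names are hypothetical.

```python
from itertools import product

POWERING_LEVELS = ["Injection", "Squeeze", "Flat-Top"]
MAGNET_STATUS = ["Quench", "No Quench"]
BEAM_PRESENCE = ["Present", "Not present"]

def firing_is_hazardous(powering_level, magnet_status, beam):
    """Assumed rule: firing CLIQ / QH units is hazardous whenever beam is present.

    powering_level and magnet_status are accepted so that a refined rule
    can use them later; this simplified rule ignores both.
    """
    return beam == "Present"

def context_table():
    """Enumerate all contexts for the control action 'CLIQ / QH Units fired'."""
    return [
        (level, status, beam, firing_is_hazardous(level, status, beam))
        for level, status, beam in product(POWERING_LEVELS, MAGNET_STATUS, BEAM_PRESENCE)
    ]

for level, status, beam, hazardous in context_table():
    print(f"{level:10} {status:10} {beam:12} {'Yes' if hazardous else 'No'}")
```

Working through the full enumeration is what makes the method systematic: contexts that nobody thought of, such as a particular timing or powering level, still appear as rows that must be judged.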

Methodology
[Figure: control structure annotated with unsafe control actions; numbered markers 1 to 6 and lettered markers A to E]
• Spurious Beam Dump Request
• Incorrect Parameters provided
• The PIC does not (in time) request a beam dump
• CLIQ/QH Units spuriously firing (during injection/squeeze)
• CLIQ/QH Units not firing when needed
• Power Abort delayed when requested
[Components shown: CERN Personnel, Beam Interlock System, Quench Protection System, Powering Interlock System, Cryogenic System, Power Converters, LHC Beam, Inner Triplet Magnets (Q1, Q2, Q3)]

Methodology
• Hazards + Environmental Conditions = Accident
• Hazard: derive hazards from accidents
• System Control Structure: derive the system with its control actions
• Unsafe Control Action (UCA): find out how control actions can turn unsafe; UCAs lead to hazardous system states
• Scenarios and causal factors: find out how unsafe control actions can be provided; UCAs are given because of scenarios and causal factors
• The level of detail increases with each step

Examples of some Results
Unsafe control action: CLIQ / QH Units firing spuriously while beam is present

• Scenario: The PIC processes the opened quench loop with a delay
  • Causal factors: Short circuit in the essential circuits OK loop; delay of the Quench Loop Interface opening
  • Requirement/Safety System: The Hardware Commissioning Group must verify the operation of the interlock chain and find short circuits. Regular maintenance, triggered by actions or by schedule, must be defined to verify the interlock operation again. The auxiliary circuit processes the signal as well, requesting the beam dump with a delay of roughly one ms.
• Scenario: The QDS does not interrupt the quench loop, but triggers CLIQ and QH
  • Causal factor: Incorrect parameters are loaded to the QDS
  • Requirement/Safety System: A safe parameter space must be defined for the detection parameters to prevent unsafe parameters. RBAC must be used to limit accessibility to experts. A dedicated analysis must assess the timings for an automatic parameter check (e.g. every 5 minutes).
• Scenario: CLIQ Units discharge without trigger
  • Causal factor: CLIQ Unit component failure leading to a spurious discharge during operation
  • Requirement/Safety System: An interlock must be defined that detects the spurious discharge and dumps the beam. A dedicated analysis must assess the effect of spurious CLIQ/DQHDS Unit firing on the beam with medium/low current.
• Scenario: QDS-CLIQ connection leading to a firing of the CLIQ Unit
  • Causal factor: CERN Personnel damages the connection (… accidentally)
  • Requirement/Safety System: CLIQ Units and the connection to the QDS must be protected from accidental interaction with CERN Personnel.
[Diagram: the four scenarios located on Powering Interlock System/PCs, Quench Detection System (dI/dt Sensor, Voltage Sensor), CLIQ/QH and Triplet Magnets]
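The chain from unsafe control action to scenario, causal factor and requirement lends itself to a small traceability structure. The following is a sketch under assumptions, not the tooling used in the analysis (the slides mention a spreadsheet): the class and function names are hypothetical, the texts are shortened from the slide, and the second scenario is an invented placeholder that only demonstrates the coverage check.

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    description: str
    causal_factors: list = field(default_factory=list)
    requirements: list = field(default_factory=list)

@dataclass
class UnsafeControlAction:
    description: str
    scenarios: list = field(default_factory=list)

def scenarios_without_requirements(uca):
    """List scenarios that have no requirement preventing or mitigating them yet."""
    return [s.description for s in uca.scenarios if not s.requirements]

uca = UnsafeControlAction(
    "CLIQ / QH Units firing spuriously while beam is present",
    scenarios=[
        Scenario(
            "CLIQ Units discharge without trigger",
            causal_factors=["CLIQ Unit component failure"],
            requirements=["Interlock that detects the spurious discharge and dumps the beam"],
        ),
        # Placeholder entry, not from the slides, to show an open item.
        Scenario("Hypothetical scenario not yet analysed"),
    ],
)

print(scenarios_without_requirements(uca))
```

Keeping the links explicit makes it possible to check mechanically that every scenario ends in at least one requirement before the list goes to the system experts.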

Where to go from now?
• Overall, more than 150 causal factors have been identified.
• The result is an Excel spreadsheet with unsafe control actions, causal factors and preliminary requirements
• The requirements will have to be commented on by the corresponding system experts
• Documented in the EDMS Dependability Folder*

Requirement | Expert Comment (Examples)
The Hardware Commissioning Group must test the interlock loop operation and timing during HWC. A regular maintenance strategy must be defined, triggered by schedule or by actions. | Will be considered by the HWC. The responsible person for the HWC procedures must be defined.
A safe parameter space must be defined for the detection parameters to prevent unsafe parameters. RBAC must be used to limit accessibility to experts. A dedicated analysis must assess the timings for an automatic parameter check (e.g. every 5 minutes). | These requirements are already fulfilled.
A dedicated analysis must assess the effect of spurious CLIQ/DQHDS Unit firing on the beam with medium/low current. | We are already considering this, it is under investigation by...
CLIQ Units and the connection to the QDS must be protected from accidental interaction with CERN Personnel. | We will fence off the CLIQ to prevent interaction on the CLIQ-QDS connection.

*CERN-0000182538

Conclusions
• Early system analyses are useful!
• For some UCAs, we studied possible scenarios and causal factors.
• The requirements are the result of the analysis and prevent or mitigate the causal factors
• The final presentation has proven the usefulness of such an analysis
• Studies like this may not always reveal blatant system failures, but they also encourage people to think publicly about operational safety/availability
[Figure: "Rule of ten", showing cost influence and costs per error across the phases R&D, Purchase, Production and Field]
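The "rule of ten" named in the figure is commonly stated as a heuristic: the cost of correcting an error grows by roughly a factor of ten with each later phase. A minimal sketch under that assumption follows; the phase names come from the figure, while the exact factor of ten, the unit base cost and the function name are illustrative and not taken from the slide.

```python
PHASES = ["R&D", "Purchase", "Production", "Field"]

def relative_error_cost(phase, base_cost=1.0):
    """Relative cost of fixing an error first discovered in the given phase."""
    return base_cost * 10 ** PHASES.index(phase)

for phase in PHASES:
    print(f"{phase:10} {relative_error_cost(phase):7.0f}")
```

Read this way, the heuristic supports the first conclusion: a requirement found during early analysis is far cheaper to implement than the same fix after installation.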

Final Thoughts on STPA
• Great for early design stages
• Great for analysing complex systems and human impact
• Can be stopped at a low level of detail
• Great system documentation
• The classical FMEA can learn from it:
  • Systematically find failure modes based on incorrect timings (Context Tables)
  • Include human errors as failure modes
  • See the system not only as components

Backup Slides
