Reliability analysis of the LHC quadrupole quench protection
Reliability analysis of the LHC quadrupole quench protection system
TE-MPE-TM #117, 13/09/2018
Miriam Blumenschein
Andrea Apollonio, Reiner Denz, Jelena Spasic, Jens Steckert, Jan Uythoven, Daniel Wollmann

Motivation and Overview
Motivation:
• Upgrade of the 392 quench detection units for the LHC main quadrupole magnets (DQLPU-B) in LS 2
• Upgrade is part of the QPS maintenance plan
• No new quench detection functionalities
• Enhanced diagnostic functionalities
Overview:
1. RIRE step by step, using the quadrupole quench protection system as the example
2. Example of a detailed study: the trigger coupling
3. Outlook

Objectives and principle of RIRE
RIRE: Reliability Requirements and Initial Risk Evaluation
1. Risk matrix: accelerator reliability targets
2. Adapted FMEA: system failure behaviour
3. System reliability requirements
4. Risk evaluation: necessary reliability actions

Step 1: Accelerator reliability requirements
[Figure: LHC risk matrix. Recovery axis: minutes, hours, day, week, month, year, ∞ (severity classes S 1 to S 7). Frequency axis: 1 / hour, 1 / day, 1 / week, 1 / month, 1 / year, 1 / 10 years, 1 / 1000 years.]

Step 2
1. Risk matrix: accelerator reliability targets
2. Adapted FMEA: system failure behaviour
   2.1) System context
   2.2) System structure
   2.3) System functions
   2.4) Context dependent functions
   2.5) Failure modes and effects
3. System reliability requirements
4. Risk evaluation: necessary reliability actions

Step 2.1: Context: Powering magnets and interlock for 1 sector (out of 8)
[Figure: block diagram of the RQF and RQD circuits between the even point and the odd point. Labels:
• SC equipment to be protected: quadrupoles MQ 1, MQ 2, ..., MQ 47/51 (MQF 1, MQD 1, MQF 2, MQD 2, ..., MQF 47/51, MQD 47/51) with diodes and quench heaters; MBA, MBB, MBC; current leads
• Quench protection: DQLPU_B (open), DQHDS, 1/2 DQLPU_S (open), DQGPU-D (open, read), Quench Loop Controller DQQLC (open, read)
• Loops: Quench Interlock Loop QIL, circuit quench loop, discharge loop
• Energy extraction systems EE RQF and EE RQD (switch, resistor), power converters, FPA open/close
• Signals: CIRCUIT_QUENCH, PC_FAST_ABORT, DISCHARGE_REQUEST, PC_DISCHARGE_REQUEST, PIC, beam dump request to the BIS (Beam 1, Beam 2)
• Upgrade in LS 2: DQLPU_B]

Step 2.2: System structure
1. Quench detection QD, n = 51
   1. DYPQ Yellow protection rack quadrupole, n = 47/51
Immediate effect on:
1. Energy extraction, n = 1
2. Quench heater, n = 2 * 51
3. Quench interlock loop, n = 1
End effect on:
1. Quadrupole
2. Beam operation (beam dump, injection)

Step 2.3: System functions
[Figure: interfaces of the DYPQ rack for MQF + MQD. Labels: UPS 1, UPS 2; voltage taps ext_A, int_A, ext_B, int_B; Interlock IN, Interlock OUT; DQHDS trigger; DQHDS interlock; QPS_OK; DYPB-S; expert tool (reset, change configuration); WinCC supervision (logging; reset, simple commands); logging, PM.]

Step 2.4: Context dependent functions
[Figure: state diagram of the operating contexts. States: Commissioning; Switching off/on; Normal operation (~4800 h/a, capacitor bank charged at 810 V); Quench; Post quench I (trigger latched, ~5-10 min, sending post mortem data); Quench event analysis (~ [h]); Post quench II (trigger unlatched); Revalidation of detection board; Maintenance, repair, tests. Transition labels: OK: reset; OK; not OK. A further duration label reads ~10 min.]

Step 2.4: Context dependent functions
Normal operation (~4800 h/a):
• Keep quench interlock loop closed
• Keep quench heater power supply charged
• …
Quench:
• Open quench interlock loop
• Discharge quench heater power supply
• …
Post quench I (trigger latched, ~5-10 min):
• Keep quench interlock loop opened
• Keep quench heater power supply latched
• …

Step 2.5: Failure modes and effects
FMEA black box level: quench detection system. FMEA report on EDMS (2010822).
ID failure mode OP.1:
• Context: Normal operation (~4800 h/a)
• Function: Keep quench interlock loop closed
• Failure mode: Quench interlock loop opened, 1oo2 or 2oo2
• Immediate effect: False energy extraction, no firing of the quench heaters, false circuit quench interlock
• End effect: False beam dump
• Severity of EE: 2
• Detection method: Quench interlock loop monitoring indicates loop status
ID failure mode AQ.1:
• Context: Asymmetric quench
• Function: Open quench loop
• Failure mode: Quench interlock loop not opened, 1oo2
• Immediate effect: …
• End effect: …
• Severity of EE: …
• Detection method: …

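
As an illustration only, one row of such an FMEA table can be held as a small record so that end effects and severities can be grouped automatically. The sketch below assumes Python; the class name `FmeaRow`, its field names and the helper `worst_severity_by_end_effect` are hypothetical and are not taken from the EDMS report. Only the OP.1 row is filled in, because the AQ.1 entries are elided on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FmeaRow:
    """One black-box FMEA row (field names are illustrative assumptions)."""
    failure_mode_id: str
    context: str
    function: str
    failure_mode: str
    immediate_effect: str
    end_effect: str
    severity: Optional[int]  # None where the slide shows "…"
    detection: str


# OP.1 as shown on the slide.
ROWS: List[FmeaRow] = [
    FmeaRow(
        failure_mode_id="OP.1",
        context="Normal operation (~4800 h/a)",
        function="Keep quench interlock loop closed",
        failure_mode="Quench interlock loop opened, 1oo2 or 2oo2",
        immediate_effect=("False energy extraction, no firing of the quench "
                          "heaters, false circuit quench interlock"),
        end_effect="False beam dump",
        severity=2,
        detection="Quench interlock loop monitoring indicates loop status",
    ),
]


def worst_severity_by_end_effect(rows: List[FmeaRow]) -> Dict[str, int]:
    """Return the highest severity recorded for each end effect."""
    worst: Dict[str, int] = {}
    for row in rows:
        if row.severity is None:
            continue  # skip rows whose severity is not filled in
        worst[row.end_effect] = max(worst.get(row.end_effect, 0), row.severity)
    return worst
```

Grouping by end effect mirrors the step in which the accelerator targets are allocated to end effects and not to individual failure modes.
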
Step 2.5: Summary failure effects
Quadrupole:
• DYPQ_EE 1: False quenching, S 2 (hours)
• DYPQ_EE 2: Quadrupole damaged, S 5 (month)
Beam operation:
• DYPQ_EE 3: Injection delayed, S 2 (hours)
• DYPQ_EE 4: False beam dump, S 2 (hours)
• DYPQ_EE 5: Missing abort trigger by DYPQ, beam dump by another protection system:
  • nQPS works: S 3 (days), analysis time
  • nQPS does not work, BLM works: S 5 (month), quadrupole damaged

Step 3
1. Risk matrix: accelerator reliability targets
2. Adapted FMEA: system failure behaviour
3. System reliability requirements
4. Risk evaluation: necessary reliability actions
The accelerator targets are allocated to the end effects.

Step 3: Reliability targets for failure effects
[Figure: LHC risk matrix (recovery time from minutes to ∞ against frequency from 1 / hour to 1 / 1000 years) with the end effects EE 1, EE 3, EE 4, EE 5 and EE 2 placed in it.]
• Recovery time includes the time needed for maintenance or intervention and the time to bring the LHC back to the state at which the failure occurred

Step 4
1. Risk matrix: accelerator reliability targets
2. Adapted FMEA: system failure behaviour
3. System reliability requirements
4. Risk evaluation: necessary reliability actions
Purpose:
• Estimate the necessary extent of reliability actions

Step 4: Risk evaluation
The necessary extent of reliability actions is estimated:
• Severity: S 3 (day) to S 7 (infinite)
  • 1 end effect in severity category 5
  • 1 end effect in severity category 3
• Undetectable:
  • 6 failure modes: recommended actions to improve detectability

Step 4: Visualization of the FMEA table
[Figure: FMEA table arranged by severity categories, end effects and failure modes.]
• One contributor: trigger circuit, missing trigger DQHDS
• 36 inputs
• Reliability modelling: fault tree
Report on EDMS (2010822)

2. Detailed study - trigger coupling
[Figure: trigger coupling between MQF and MQD.]
The trigger circuit is a contributor to:
• FM: heater series is not fired, 2oo2
• EE: Quadrupole damage (S 5, month)

2. Trigger coupling - Analysis techniques
• Failure rate prediction according to the 217Plus handbook, to estimate the occurrence probabilities of electronic component failures
• Inductive FMECA according to IEC 60812 for single failure analysis
• Quantitative fault tree analysis according to IEC 61025 for multiple failure analysis
• Supported by the Isograph software

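
To make the quantitative fault tree step concrete, the following is a minimal sketch in Python, not the Isograph model used in the study. It assumes constant failure rates, independent basic events and no common-cause failures. The failure rates, the two-component chain and all function names are hypothetical placeholders chosen only to show how OR and AND gates combine into a 2oo2 top event.

```python
import math

HOURS_PER_YEAR = 8760.0


def p_fail(rate_per_hour: float, hours: float) -> float:
    """Probability of at least one failure in `hours` at a constant rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def or_gate(probs):
    """Output occurs if any of the independent input events occurs."""
    p_none = 1.0
    for p in probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


def and_gate(probs):
    """Output occurs only if all independent input events occur."""
    p_all = 1.0
    for p in probs:
        p_all *= p
    return p_all


# HYPOTHETICAL numbers: two components per trigger chain, 100-year horizon.
MISSION_HOURS = 100 * HOURS_PER_YEAR
chain_not_fired = or_gate([
    p_fail(1e-9, MISSION_HOURS),   # assumed rate, component A
    p_fail(5e-10, MISSION_HOURS),  # assumed rate, component B
])

# Top event "heater series not fired, 2oo2": both chains must fail.
top_event = and_gate([chain_not_fired, chain_not_fired])
```

A real analysis would also have to model dormant failures, test intervals and shared elements of the two chains, which this sketch deliberately leaves out.
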
2. Trigger coupling - Results
Objective 1: Compare design alternatives. Chosen design:
• trigger coupling with diode in single configuration,
• 4 DQQDL diodes,
• 1 DQCSU resistor,
• no cross triggering
Objective 2: Weaknesses in the DYPQ trigger circuit?
• Single DQHDS entry
Objective 3: Estimate DYPQ trigger circuit reliability
• The probability that, for one of the 392 quadrupoles, two out of two heaters are not fired within 100 years is estimated to be 0.1 %.
Documentation on EDMS (2010831)

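
For orientation, the 0.1 % figure can be translated into a per-quadrupole probability and an equivalent yearly rate. The sketch below is a back-of-the-envelope conversion in Python. It assumes 392 identical, independent units and a constant rate over the 100 years; neither assumption is stated on the slides, and the function names are hypothetical.

```python
import math


def per_unit_probability(p_any: float, n_units: int) -> float:
    """Per-unit probability such that 1 - (1 - p)**n_units equals p_any."""
    return 1.0 - (1.0 - p_any) ** (1.0 / n_units)


def equivalent_rate_per_year(p_any: float, years: float) -> float:
    """Constant yearly rate that gives probability p_any over `years`."""
    return -math.log(1.0 - p_any) / years


P_ANY = 0.001   # 0.1 % for any of the quadrupoles, from the slide
N_UNITS = 392   # number of quadrupole units, from the slide
YEARS = 100.0   # time horizon, from the slide

p_unit = per_unit_probability(P_ANY, N_UNITS)    # roughly 2.6e-6 per unit
rate = equivalent_rate_per_year(P_ANY, YEARS)    # roughly 1e-5 per year
```

Under these assumptions the system-level estimate corresponds to about one event per 100 000 years, which is the kind of number that can be compared against a frequency row of the risk matrix.
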
2. Trigger coupling - Results
[Figure: LHC risk matrix (recovery time from minutes to ∞ against frequency from 1 / hour to 1 / 1000 years) with the end effects EE 1, EE 3, EE 4, EE 5 and EE 2 placed in it.]
DYPQ_EE 2: Quadrupole damaged, S 5 (month)
• Due to trigger circuit: 0.1 %
• Recovery time includes the time needed for maintenance or intervention and the time to bring the LHC back to the state at which the failure occurred

Summary and conclusion
• RIRE is a CERN-tailored methodology for the experience-based derivation of quantitative reliability targets
• RIRE was applied to the upgraded quadrupole quench protection system DYPQ, a complex system with context dependent functions
• Several critical failure modes were identified; the qualitative study affected design and procedures
• The usefulness of a quantitative reliability study on board level was identified (contributors to S 5 and S 3 failure modes)
• The trigger link, a contributor to S 5, was studied on component level; result: the chosen design is acceptable
• Some S 3 to S 7 failure modes remain to be studied to fully qualify the system

Step 2.5: LHC severity table
Severity level      Recovery time
7 Catastrophic      Infinite
6 Very high         Year
5 High              Month
4 Critical          Week
3 Major             Days
2 Moderate          Hours
1 Low               Minutes

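
As a small illustration, the severity table and the end-effect severities from the summary of failure effects can be combined in a lookup to list which end effects reach a given severity. This is a sketch in Python; the dictionary layout, the key strings and the helper `effects_at_or_above` are hypothetical, while the levels and labels are those given on the slides.

```python
# Severity level -> (label, recovery time), as in the LHC severity table.
SEVERITY = {
    7: ("Catastrophic", "Infinite"),
    6: ("Very high", "Year"),
    5: ("High", "Month"),
    4: ("Critical", "Week"),
    3: ("Major", "Days"),
    2: ("Moderate", "Hours"),
    1: ("Low", "Minutes"),
}

# End effect -> severity, from the summary of failure effects.
# DYPQ_EE5 appears twice because its severity depends on the scenario.
END_EFFECTS = {
    "DYPQ_EE1": 2,
    "DYPQ_EE2": 5,
    "DYPQ_EE3": 2,
    "DYPQ_EE4": 2,
    "DYPQ_EE5 (nQPS works)": 3,
    "DYPQ_EE5 (nQPS does not work, BLM works)": 5,
}


def effects_at_or_above(level: int):
    """Return the end effects whose severity is at least `level`, sorted by name."""
    return sorted(name for name, sev in END_EFFECTS.items() if sev >= level)
```

Calling `effects_at_or_above(3)` then selects the end effects that fall in the S 3 to S 7 range considered in the risk evaluation.
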
1.2 DQLPU-B Upgrade in LS 2
Current:
1. DYPQ Yellow Protection rack Quadrupole, n = 47/51
   1. DQLPU-B Local Protection Unit type B (iQPS), n = 1
      1. DQQDL Quench Detection Local, n = 4
      2. DQAMC Acquisition and Monitoring Controller, n = 1
      3. SYKO Power Supply, n = 2
   2. DQHDS Heaters Discharge power Supply, n = 2
   3. Crawford box, n = 1
   4. Dispatching box, n = 1
After upgrade:
1. DYPQ Yellow Protection rack Quadrupole, n = 47/51
   1. DQLPU-B II Local Protection Unit type B (iQPS), n = 1
      1. DQQDL: new board, new features
      2. DQAMC: minor upgrade (different firmware)
      3. DQLPR (dipole), n = 2
   2. DQHDS: minor upgrade (fuse to earth)
   3. Crawford box DQLIM (dipole)
   4. Dispatching box
   5. System board controller: PS monitoring, quench heater supervision, triggering and timing controller