Intro Reliability Growth
Presented by: Mark E. Sims, Reliability S&T Engineer, Aviation and Missile Research, Development and Engineering Center
UNCLASSIFIED
"Approved for public release; distribution unlimited. Review completed by the AMRDEC Public Affairs Office 11 Oct 2013; PR 0073."

MIL-HDBK-189 Definition
Reliability Growth: The positive improvement in a reliability parameter over a period of time due to changes in product design or the manufacturing process.
MIL-HDBK-189 is a Department of the Army handbook for Reliability Growth Management.

Beginnings
J. T. Duane was an engineer at the Aerospace Electronics Department of the General Electric Company. In 1964 he published a paper that applied a “learning curve approach” to reliability monitoring. He observed that cumulative MTBF plotted against cumulative operating time followed a straight line on log-log paper. The learning (i.e., growing) is accomplished through a “test, analyze, and fix” (TAAF) process.
[Diagram: TAAF cycle of Design, Test, Identified Deficiencies, Failure Analysis]

Graphs
Duane Postulate: The cumulative MTBF versus cumulative operating time is a straight line on log-log paper.
[Figure: Reliability Growth Chart on log-log paper, MTBF vs. Test Hours, showing the Cumulative Duane and Instantaneous Duane curves]
[Figure: Reliability Growth Chart with normal (linear) graphing, MTBF (0 to 120) vs. Test Hours (0 to 2000), showing the same two curves]

Continuous Growth
Continuous means time. You can plot failure rate or MTBF against the total test hours.
[Figure: Reliability Growth Chart, Failure Rate (0.000 to 0.009) vs. Test Hours (0 to 5000)]
[Figure: Reliability Growth Chart, MTBF (0 to 700) vs. Test Hours (0 to 5000)]

Discrete Growth
Discrete means trials. Reliability Growth follows a Learning Curve approach.
Note: Growth is more rapid early in the process, then flattens out!
[Figure: Reliability Chart, Reliability (60.0% to 95.0%) vs. Trials (0 to 200)]

Why Reliability Growth?

Example
A system has 18 failures in 177 trials.
[Figure: grid of the 177 trials, numbered 1 to 177]

Example
A system has 18 failures in 177 trials. The failures are listed in the table below. There appears to be reliability growth.

Failure   Trial   Trials Between Failures
1         6       6
2         7       1
3         14      7
4         16      2
5         26      10
6         30      4
7         38      8
8         39      1
9         51      12
10        55      4
11        64      9
12        71      7
13        79      8
14        98      19
15        108     10
16        129     21
17        145     16
18        148     3

Failures 1 to 9: fewer trials between failures. Failures 10 to 18: more trials between failures.
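The "trials between failures" column can be recomputed from the failure trial numbers. The sketch below does that and compares the average gap for the first nine failures with the last nine; the function name and the half-and-half split are illustrative choices, not part of the original slides.

```python
# Trial numbers at which the 18 failures occurred (from the table above).
failure_trials = [6, 7, 14, 16, 26, 30, 38, 39, 51,
                  55, 64, 71, 79, 98, 108, 129, 145, 148]


def trials_between_failures(trials):
    """Gap from each failure to the previous one (first gap counts from trial 0)."""
    previous = [0] + trials[:-1]
    return [t - p for t, p in zip(trials, previous)]


gaps = trials_between_failures(failure_trials)

# Illustrative split: first half of the failures vs. second half.
half = len(gaps) // 2
early_mean = sum(gaps[:half]) / half
late_mean = sum(gaps[half:]) / (len(gaps) - half)

print(gaps)
print(f"mean gap, failures 1-9:   {early_mean:.2f}")
print(f"mean gap, failures 10-18: {late_mean:.2f}")
```

A larger average gap in the second half is only an informal indication of growth; it is not a statistical trend test.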

Example
Applying Reliability Growth Methodology, we get the following curve:
[Figure: reliability growth curve, Reliability (0 to 1) vs. Trials (0 to 200), with the value 0.9254 marked on the reliability axis]
Note: Reliability without applying growth is 1 – (18 / 177) = 0.8983.

Why Reliability Growth?
• Saves Assets
• Reduces Test Time
• Saves $$$$$$$

Duane Model
Power Law Formulation for reliability growth

Duane Postulate
During reliability growth, plotting the log of cumulative MTBF against the log of time (or tests) gives a straight line with slope α:
$MTBF_{Cum} = K t^{\alpha}$
MTBFCum = Cumulative Mean-Time-Between-Failure
t = Time
K = Constant for Power Law Equation
α = Growth parameter

[Figure: a table of times t1, t2, t3 with cumulative MTBFs M1, M2, M3, and the points (Ln(t1), Ln(M1)), (Ln(t2), Ln(M2)), (Ln(t3), Ln(M3)) lying on a line of slope α; axes MTBF vs. Time (or Trial), t]

Taking logs of both sides gives a linear relationship of the form y = αx + b:
$\ln(MTBF_{Cum}) = \alpha \ln(t) + \ln(K)$
The Duane model has a linear log-log relationship!

Calculating α (the growth rate)
We will determine α from these two readings.

Reading   Time (hrs)   Total Failures   MTBF   Ln(Time)          Ln(MTBF)
First     500          5                100    Ln(500) = 6.215   Ln(100) = 4.605
Last      4000         20               200    Ln(4000) = 8.294  Ln(200) = 5.298

1. First calculate the cumulative MTBF for each reading (time divided by total failures).
2. Take logs of the readings.
3. Plot the logs of the readings, with Ln(Time) on the x-axis and Ln(MTBF) on the y-axis: (6.215, 4.605) and (8.294, 5.298).
4. The slope of the line through the two points is α = (5.298 - 4.605) / (8.294 - 6.215) = 0.33.

Growth is indicated when 0 < α < 1.
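The two-reading calculation above can be sketched in a few lines. The function name `duane_alpha` is hypothetical; the arithmetic is just the log-log slope between the first and last readings.

```python
import math


def duane_alpha(t1, failures1, t2, failures2):
    """Slope of ln(cumulative MTBF) vs. ln(time) between two readings."""
    m1 = t1 / failures1  # cumulative MTBF at the first reading
    m2 = t2 / failures2  # cumulative MTBF at the last reading
    return (math.log(m2) - math.log(m1)) / (math.log(t2) - math.log(t1))


alpha = duane_alpha(500, 5, 4000, 20)
print(round(alpha, 2))
```

With more than two readings, a least-squares line through all the log-log points would be the natural extension; the slide uses only the end points.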

Duane Parameters
α = Growth parameter
TI = Initial test time
MI = Initial MTBF
MF = Final MTBF
Ttotal = Total time
• These parameters go into the Duane equation.
• If you know 4 of the parameters, you can calculate the other.

Sensitivity of α
What is the total test time if we are given the other 4 parameters? How does changing the growth parameter α affect the total test time?

α      TI    MI   MF    Ttotal
0.40   100   50   150   435
0.27   100   50   150   1823
0.46   100   50   150   285
0.64   100   50   150   113

The total test time is very sensitive to α!
α = Growth parameter
TI = Initial test time
MI = Initial MTBF
MF = Final MTBF
Ttotal = Total time
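The tabulated total test times are consistent with solving the Duane instantaneous-MTBF relation $M_F = \frac{M_I}{1-\alpha}(T_{total}/T_I)^{\alpha}$ for $T_{total}$. That closed form is an assumption inferred from the numbers in the table, and `total_test_time` is a hypothetical helper name.

```python
def total_test_time(alpha, t_i, m_i, m_f):
    """Solve M_F = M_I / (1 - alpha) * (T / T_I)**alpha for T.

    Assumes M_I is the initial (cumulative) MTBF at T_I and M_F is the
    final instantaneous MTBF, which reproduces the values in the table.
    """
    return t_i * (m_f * (1.0 - alpha) / m_i) ** (1.0 / alpha)


for alpha in (0.40, 0.27, 0.46, 0.64):
    hours = total_test_time(alpha, t_i=100, m_i=50, m_f=150)
    print(f"alpha = {alpha:.2f}  ->  Ttotal = {hours:.0f} hours")
```

Because α appears in the exponent as 1/α, small changes in the assumed growth rate move the required test time by large factors.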

Duane MTBF Equation: Instantaneous vs Cumulative
Finding the true estimate of a system’s MTBF using reliability growth.

Inst vs. Cum MTBF
What is the true estimate of the MTBF at 250 hours?

Failure Number   Failure Time   MTBFCum   Time Between Failures   MTBFInst
1                10             10        10                      31
2                40             20        30                      43
3                90             30        50                      52
4                160            40        70                      59
5                250            50        90                      66

Is the MTBF 50 at time 250 (the cumulative MTBF)? Or would you say the MTBF is 90 at 250 hours (the last time between failures)?
Applying a Reliability Growth Tracking Model from AMSAA, or ReliaSoft’s RGA software tool, gives the MTBFInst numbers. So, 66 is the true MTBF at 250 operating hours, if reliability growth is occurring.
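One way to obtain instantaneous MTBF values of this kind is the Crow-AMSAA power-law model fitted by maximum likelihood to failure-terminated data. The sketch below assumes that estimator; it is not necessarily the exact procedure behind the slide's numbers. It gives roughly 65 hours at t = 250, close to the 66 shown in the table, and the remaining difference presumably comes from estimator or rounding choices in the tool used.

```python
import math

failure_times = [10, 40, 90, 160, 250]  # cumulative hours at each failure


def crow_amsaa_mle(times):
    """Power-law (Crow-AMSAA) MLE for failure-terminated data.

    Expected failures by time t are modeled as lam * t**beta.
    """
    n = len(times)
    t_end = times[-1]
    beta = n / sum(math.log(t_end / t) for t in times[:-1])
    lam = n / t_end ** beta
    return lam, beta


def instantaneous_mtbf(t, lam, beta):
    """Reciprocal of the failure intensity lam * beta * t**(beta - 1)."""
    return 1.0 / (lam * beta * t ** (beta - 1.0))


lam, beta = crow_amsaa_mle(failure_times)
for number, t in enumerate(failure_times, start=1):
    cumulative = t / number
    instantaneous = instantaneous_mtbf(t, lam, beta)
    print(f"{number}  t={t:>3}  cum={cumulative:5.1f}  inst={instantaneous:5.1f}")
```

In every row the instantaneous estimate sits above the cumulative MTBF, because the cumulative figure is dragged down by the early failures.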

Inst vs. Cum MTBF
[Figure: MTBFInst and MTBFCum vs. Time (or Test), t, on log-log graph paper]
[Figure: the same curves in standard Cartesian coordinates, MTBF (100 to 300) vs. Time (or Test), t (500 to 2000)]

Exercise
10 system failures occurred after 500 hours of reliability growth testing, with a calculated growth parameter of 0.40. What is the system’s instantaneous MTBF?
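A worked solution, assuming the usual Duane relationship between cumulative and instantaneous MTBF, $MTBF_{Inst} = MTBF_{Cum} / (1 - \alpha)$. The helper name is hypothetical.

```python
def duane_instantaneous_mtbf(total_time, failures, alpha):
    """Instantaneous MTBF from cumulative test results under the Duane model."""
    cumulative_mtbf = total_time / failures  # 500 / 10 = 50 hours
    return cumulative_mtbf / (1.0 - alpha)   # 50 / 0.6


answer = duane_instantaneous_mtbf(total_time=500, failures=10, alpha=0.40)
print(f"{answer:.1f} hours")
```

The cumulative MTBF is 50 hours, and dividing by (1 - 0.40) gives an instantaneous MTBF of about 83 hours.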

Reliability Growth Formulas
• Failure Rate
• MTBF
• Reliability

MTBF is the reciprocal of the failure rate:
$M(t) = 1 / r(t)$

Failure Rate Formula
$r(t) = r_I (1 - \alpha) (t / t_I)^{-\alpha}$
Initial Conditions:
rI = Initial failure rate (rI = 1 / MI)
tI = Initial time corresponding to rI
α = Growth rate parameter

MTBF Formula
$M(t) = \frac{M_I}{1 - \alpha} (t / t_I)^{\alpha}$
Initial Conditions:
MI = Initial MTBF
tI = Initial time corresponding to MI
α = Growth rate parameter

Reliability (Discrete)
$R(N) = 1 - (1 - R_I)(1 - \alpha)(N / N_I)^{-\alpha}$
Initial Conditions:
RI = Initial Reliability
NI = Initial number of trials corresponding to RI
α = Growth rate parameter

Deriving r(t) Formula
r(t) is sometimes called the Hazard Rate.
1. First, start with the Duane Postulate: $M_{Cum}(t) = K t^{\alpha}$, where K = Constant for Power Law Equation.
2. Insert initial conditions MI at tI, and solve for K: $K = M_I / t_I^{\alpha}$. tI is the Initial Test Time. MI is the Initial MTBF at time tI.
3. Now substitute for K: $M_{Cum}(t) = M_I (t / t_I)^{\alpha}$.
4. The failure rate, r, is the inverse of the MTBF, so r(t) = 1 / M(t). The cumulative failure rate is $r_{Cum}(t) = \frac{1}{M_I} (t / t_I)^{-\alpha}$.
5. Now simplify and take the derivative. The expected number of failures by time t is $t \cdot r_{Cum}(t)$; differentiating with respect to t gives $r(t) = \frac{1 - \alpha}{M_I} (t / t_I)^{-\alpha}$.

Deriving M(t) Formula
Recall MTBF = 1/r, so take the inverse of r(t):
$M(t) = \frac{M_I}{1 - \alpha} (t / t_I)^{\alpha}$
MI = Initial MTBF
tI = Initial time corresponding to MI
α = Growth rate parameter

The Sensitivity of Duane’s Initial Conditions TI and MI on the Total Test Time

Sensitivity of Initial Time
What if we increase the initial time for a planning curve?

TI     α      MI   MF    Ttotal
100    0.40   50   150   435
150    0.40   50   150   652
200    0.40   50   150   869
250    0.40   50   150   1087

A higher initial time significantly increases Ttotal! Why? Growth is more rapid the smaller TI is! In this table Ttotal is about 4.35 times TI in every row.
[Figure: MTBF (50 to 150) vs. Time (0 to 1000), comparing the growth curves for TI = 100 (Ttotal = 435) and TI = 250 (Ttotal = 1087)]

Sensitivity of Initial MTBF
What if we change the initial MTBF for a planning curve?

MI     α      TI    MF    Ttotal
50     0.40   100   150   435
25     0.40   100   150   2459
70     0.40   100   150   187
85     0.40   100   150   115

A higher initial MTBF significantly decreases Ttotal!
α = Growth parameter
TI = Initial test time
MI = Initial MTBF
MF = Final MTBF
Ttotal = Total time
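Both sensitivities can be read off a closed form for the total test time. The expression below is an assumption inferred from the tabulated values (it reproduces every row of both tables), obtained by solving the Duane instantaneous-MTBF relation for the total time:

```latex
T_{total} = T_I \left[ \frac{M_F (1 - \alpha)}{M_I} \right]^{1/\alpha}
% With alpha = 0.40, M_I = 50, M_F = 150:
%   bracket = 150 * 0.6 / 50 = 1.8,  1.8^{2.5} \approx 4.347
%   so T_total \approx 4.347 * T_I   (linear in T_I)
% M_I enters as M_I^{-1/alpha} = M_I^{-2.5}:
%   halving M_I from 50 to 25 multiplies T_total by 2^{2.5} \approx 5.66
%   (435 -> 2459)
```

Under this form the total time scales linearly with the initial time but as a steep inverse power of the initial MTBF, which is why the second table swings so much more than the first.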

Deriving Reliability Formula
RI = Initial Reliability
NI = Initial number of trials corresponding to RI
α = Growth rate parameter
1. r is the failure rate: $r = F / N$, so $R_{Cum} = 1 - F / N$. Rcum = Cumulative Reliability, F = Number of Failures, N = Number of Trials.
2. Recall the failure rate formula: $r(t) = r_I (1 - \alpha) (t / t_I)^{-\alpha}$.
3. Subtract from 1: $R = 1 - r$.
4. Make substitutions ($r_I = 1 - R_I$, $t = N$, $t_I = N_I$): $R(N) = 1 - (1 - R_I)(1 - \alpha)(N / N_I)^{-\alpha}$.

Exercise
Initially, System A has 3 failures after 100 firings. If you expect a growth rate of 0.25, what would be the expected reliability after 1000 flight tests?
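A hedged worked solution. It assumes the discrete Duane form in which unreliability decays as $(N / N_I)^{-\alpha}$, with an extra factor of $(1 - \alpha)$ for the instantaneous value. The function name and the `instantaneous` flag are illustrative, and both variants are printed because the intended one depends on which formula the exercise has in mind.

```python
def duane_reliability(n, n_i, r_i, alpha, instantaneous=True):
    """Reliability after n trials under an assumed discrete Duane model.

    r_i is the initial (cumulative) reliability observed over n_i trials.
    """
    unreliability = (1.0 - r_i) * (n / n_i) ** (-alpha)
    if instantaneous:
        unreliability *= (1.0 - alpha)
    return 1.0 - unreliability


r_initial = 1.0 - 3 / 100  # 3 failures in 100 firings

r_inst = duane_reliability(1000, 100, r_initial, 0.25)
r_cum = duane_reliability(1000, 100, r_initial, 0.25, instantaneous=False)
print(f"instantaneous: {r_inst:.4f}")
print(f"cumulative:    {r_cum:.4f}")
```

Under these assumptions the expected reliability after 1000 trials is about 0.987 (instantaneous) or 0.983 (cumulative), up from the initial 0.97.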

Inst / Cum Conversions

AMSAA-Crow Model
Projection Method for reliability growth planning

Discrete PM2 Growth Plan Example
[Figure: PM2-Discrete Reliability Growth Planning Curve, Reliability (0.84 to 1.00) vs. Trials (0 to 300). Shows the idealized curve, test phases DT1, DT2, DT3, LUT, and IOT, and corrective action periods CAP1 to CAP4. Marked values: RDT1 = 0.8987, RDT2 = 0.9260, RDT3 = 0.9455, RLUT = 0.9568, RG = 0.9639, RG Potential RGP = 0.9747, Requirement RR = 0.9200.]
041712-Sims-Reliability Growth (TE Class)

Continuous PM2 Growth Plan Example
[Figure: PM2 Continuous Reliability Growth Planning Curve, MTBF (0 to 700) vs. Test Time (0 to 4,500 hours). Shows the idealized curve, a hypothetical last step, test phases DT1, DT2, DT3, LUT, and IOT, and corrective action periods CAP1 to CAP4. Marked values: MI = 190, MR = 200 (requirement), 322, 415, 500, MG,OT = 523, MG,DT = 581, MGP = 782.]

Continuous Curve Equation
Continuous curve is plotted using this equation.
MTBF(T) = System Mean-Time-Between-Failures at time T
MTBFI = Initial MTBF
MS = Management Strategy
µ = Average Fix Effectiveness Factor (FEF)
β = Shape parameter
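A form of the curve consistent with the symbols listed above, assumed here to be the PM2-Continuous planning equation, is given below. The expression for β is likewise an assumption: it follows from requiring the curve to reach the goal MTBF $M_G$ at the total test time $T_T$.

```latex
MTBF(T) = \frac{MTBF_I}{1 - MS \cdot \mu \cdot \dfrac{\beta T}{1 + \beta T}}

\beta = \frac{1}{T_T} \cdot
        \frac{1 - MTBF_I / M_G}{MS \cdot \mu - \left( 1 - MTBF_I / M_G \right)}
```

At T = 0 the curve starts at the initial MTBF, and as T grows without bound it approaches $MTBF_I / (1 - MS \cdot \mu)$, the growth potential.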

Discrete Curve Equation
Discrete curve is plotted using this equation.
R(N) = System Reliability at trial N
RA = The portion of the system reliability not impacted by the corrective action effort
RB = The portion of the system reliability addressed by the corrective action effort
MS = Management Strategy
µ = Average Fix Effectiveness Factor (FEF)
n = Shape parameter of the beta distribution representing pseudo trials

Management Strategy Factor
Management Strategy (MS) is the fraction of the overall system failure rate to be addressed by the corrective action plan. For various reasons (prohibitive cost, improbability of reoccurrence), some failure modes will not have a corrective action.
λ = Failure rate.

Failure rates are split into A-modes and B-modes:
A-Mode: Failures that are not fixed.
B-Mode: Failures that will have a fix.
A “fix” means a reliability improvement corrective action, not just a remove and replace of the same component.
λA = Failure rate of A-modes
λB = Failure rate of B-modes
λA + λB = Overall system failure rate
$MS = \lambda_B / (\lambda_A + \lambda_B)$

Example: What is the MS here?

Failure mode   Rate    Mode Type
1              0.027   B
2              0.015   B
3              0.033   B
4              0.001   A
5              0.013   B

Total B-modes = 0.088
Total System = 0.089
MS = 0.088 / 0.089 ≈ 0.989
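The example can be checked with a short calculation. The tuple layout and variable names below are illustrative only.

```python
# (failure mode, failure rate, mode type) from the example table.
failure_modes = [
    (1, 0.027, "B"),
    (2, 0.015, "B"),
    (3, 0.033, "B"),
    (4, 0.001, "A"),
    (5, 0.013, "B"),
]

# B-modes will receive a corrective action; A-modes will not.
lambda_b = sum(rate for _, rate, kind in failure_modes if kind == "B")
lambda_a = sum(rate for _, rate, kind in failure_modes if kind == "A")

# Management Strategy: fraction of the system failure rate addressed by fixes.
ms = lambda_b / (lambda_a + lambda_b)
print(f"lambda_A = {lambda_a:.3f}, lambda_B = {lambda_b:.3f}, MS = {ms:.3f}")
```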

μ, Fix Effectiveness Factor
MIL-HDBK-189 Definition: Fix Effectiveness Factor, μ = A fraction representing the reduction in an individual initial mode failure rate due to implementation of a corrective action.
Essentially, Fix Effectiveness Factors discount failures. A couple of examples follow.

Example:
Number of tests = 20
Successful tests = 18
Failures: one Software Failure, one Hardware Failure
What is the reliability?

Corrective actions:
Software Failure: μ1 = 100% (100% Fix)
Hardware Failure: μ2 = 75% (75% Fix)
What is the updated reliability?
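One common way to apply the discounting, assumed here, is to count each fixed failure as (1 - μ) of a failure when recomputing reliability. The helper name is hypothetical.

```python
def discounted_reliability(num_tests, fef_per_failure):
    """Reliability after discounting each observed failure by its FEF.

    A failure with FEF mu still counts as (1 - mu) of a failure.
    """
    residual_failures = sum(1.0 - mu for mu in fef_per_failure)
    return 1.0 - residual_failures / num_tests


# Before any fixes: both failures count in full (FEF of 0).
before = discounted_reliability(20, [0.0, 0.0])

# After fixes: software fix 100% effective, hardware fix 75% effective.
after = discounted_reliability(20, [1.00, 0.75])

print(f"before fixes: {before:.4f}")
print(f"after fixes:  {after:.4f}")
```

Under this assumption the reliability moves from 18/20 = 0.90 to (20 - 0.25)/20 = 0.9875, since only a quarter of the hardware failure remains.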

μ, Fix Effectiveness Factor
Another Example: Say the average μ is 0.75 (or 75%). What is the updated System Failure Rate?

Failure mode   Rate    Mode Type
1              0.027   B
2              0.015   B
3              0.033   B
4              0.001   A
5              0.013   B

Original:
λA = 0.001
λB = 0.088
λSystem = 0.089

Updated:
λA = 0.001
λB = 0.088 * (1 - 0.75) = 0.022
λSystem = 0.023

Shape Parameter, β (continuous)
β = Shape parameter
TT = Total Test Time
MG = MTBF Goal
MGP = MTBF Growth Potential
MI = Initial MTBF

Shape Parameter, n (discrete)
n = Shape parameter of the beta distribution representing pseudo trials
NT = Total Number of Trials
RG = Reliability Goal
RGP = Reliability Growth Potential
RI = Initial Reliability

Growth Potential
MGP = MTBF Growth Potential
The theoretical upper limit on MTBF
For example:
MS = 0.95
μ = 0.80
MI = 190
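A sketch of the growth potential calculation for this example, assuming the PM2 relation $M_{GP} = M_I / (1 - MS \cdot \mu)$ and the matching idealized curve. The goal MTBF of 500 and the 4,000-hour test length are hypothetical values chosen only to exercise the curve; they are not taken from the slides.

```python
def growth_potential(m_i, ms, mu):
    """Upper limit on MTBF if every B-mode is found and fixed."""
    return m_i / (1.0 - ms * mu)


def pm2_beta(m_i, m_g, ms, mu, total_time):
    """Shape parameter that makes the curve reach m_g at total_time."""
    ratio = 1.0 - m_i / m_g
    return ratio / (total_time * (ms * mu - ratio))


def pm2_mtbf(t, m_i, ms, mu, beta):
    """Idealized PM2-Continuous planning curve (assumed form)."""
    return m_i / (1.0 - ms * mu * beta * t / (1.0 + beta * t))


m_gp = growth_potential(m_i=190, ms=0.95, mu=0.80)
print(f"MGP = {m_gp:.1f}")

# Hypothetical planning inputs, for illustration only.
goal, hours = 500.0, 4000.0
beta = pm2_beta(190, goal, 0.95, 0.80, hours)
print(f"MTBF at {hours:.0f} h = {pm2_mtbf(hours, 190, 0.95, 0.80, beta):.1f}")
```

With these inputs the growth potential is about 792. The planning chart in the continuous example marks MGP = 782, so its underlying inputs presumably differ slightly from the rounded values on this slide.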

PM2 Curve Equation
RA = The portion of the system reliability not impacted by the corrective action effort
RB = The portion of the system reliability addressed by the corrective action effort
MS = Management Strategy. Fraction of failures to be addressed by corrective action. Medium Risk Range 0.90 – 0.96.
RI = Initial Reliability

Management Strategy Factor
Fraction of failures to be addressed by the corrective action plan.
A-Mode: Failures that are not fixed.
B-Mode: Failures that will have a fix.
λ = Failure rate.

PM2 Growth Plan
n = Shape parameter of the beta distribution representing pseudo trials
RGP = Reliability Growth Potential
RG = Reliability Goal (to meet requirement)
NT = Total trials before going into IOT phase

Reliability Growth Potential
RGP = Reliability Growth Potential
The theoretical upper limit on system reliability
For example:
MS = 0.95
μ = 0.80
MI = 190

Summary
• Reliability Growth applies a “Learning Curve” approach.
• A system must undergo Test-Analyze-And-Fix for reliability to grow.
• A growth plan is sensitive to its initial conditions.

AMSAA-Crow/Duane Equations
1. Single Shot Systems. Expected Failures:
2. Continuously Operating Systems. Expected Failures: