NASA OSMA SAS 02 Software Reliability Modeling Traditional
NASA OSMA SAS '02
Software Reliability Modeling: Traditional and Non-Parametric
Dolores R. Wallace, Victor Laing
SRS Information Services
Software Assurance Technology Center
http://satc.gsfc.nasa.gov/
dwallac, vlaing@pop300.gsfc.nasa.gov
The Problem
• Critical NASA systems must execute successfully for a specified time under specified conditions (reliability)
• Most systems rely on software
• Hence, a means to measure software reliability is essential to determining readiness for operation
• Software reliability modeling provides one data point for reliability measurement
Software Reliability Modeling (SRM) – Traditional
• Captures hardware reliability engineering concepts
• Mathematically models the behavior of a software system from failure data to predict reliability growth
• Invokes curve-fitting techniques to determine the values of the parameters used in the models
• Validates models against the data using statistical analysis
• Using the parameter values, predicts future measurements, e.g.,
  – Mean time to failure
  – Total number of faults remaining
  – Number of faults at time t
Synopsis
• FY01
  – Identify mathematics of hardware reliability not used in software
  – Identify differences between hardware and software that affect reliability measurement
  – Identify possible improvements
• FY02
  – Demonstrate practicality of SRM at GSFC
  – Fault correction improvement (Schneidewind)
  – Non-parametric model (Laing)
SRM: Data Collection
• Resistance to data collection
• Data content
  – Accuracy of content
  – Dates of failure and correction
  – Calendar time, not execution time
  – Activity/phase when failures occur
• Data manipulation
  – Frequency counts
  – Interval size and length
  – Time between failures
IntervalCounter
Sample had 35 weeks – simplified fault count
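The data manipulation steps listed for data collection (frequency counts per interval, time between failures) can be sketched in a few lines. This is a minimal illustration, not the IntervalCounter tool itself: the function names, the 7-day interval size, and the failure dates are all hypothetical, and calendar time is used because the slides note that execution time is usually unavailable.

```python
from datetime import date

def weekly_counts(failure_dates, start, n_weeks):
    """Count failures in consecutive 7-day intervals beginning at `start`."""
    counts = [0] * n_weeks
    for d in failure_dates:
        week = (d - start).days // 7  # index of the interval containing d
        if 0 <= week < n_weeks:
            counts[week] += 1
    return counts

def days_between_failures(failure_dates):
    """Calendar days between successive failures (dates are sorted first)."""
    ordered = sorted(failure_dates)
    return [(b - a).days for a, b in zip(ordered, ordered[1:])]

# Hypothetical failure dates, for illustration only (not project data).
failures = [date(2002, 1, 3), date(2002, 1, 4), date(2002, 1, 15), date(2002, 2, 1)]
counts = weekly_counts(failures, start=date(2002, 1, 1), n_weeks=5)
gaps = days_between_failures(failures)
```

Either output (interval counts or inter-failure times) is the kind of input a reliability modeling tool expects; which one is needed depends on the model chosen.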
SMERFS^3 3-D OUTPUT
Practical Method
• SATC Services
  – SATC executes models and prepares analysis
  – SATC provides training and a public-domain tool
• Improvements
  – Recommendations to projects for data collection
  – IntervalCounter to simplify data manipulation
Fault Correction Adjustments
• Reliability growth occurs from fault correction
• Fault correction is proportional to the rate of failure detection
• The adjusted model adds a delay dT (based on queuing service) but keeps the same general form as the model for faults detected at time T
• Process: use the SMERFS Schneidewind model to get parameters; apply them to the revised model via spreadsheet
• Results
  – Show reliability growth due to fault correction
  – Predict stopping rules for testing
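One way to read "same general form with delay dT" is that cumulative corrections follow the cumulative detection curve shifted by dT. The sketch below assumes exactly that, using the D(T) equation from the equations slide of this deck. The shift `corrected(T) = detected(T - dT)`, the function names, and the parameter values are assumptions for illustration; the revised model itself is not spelled out in the slides.

```python
import math

def detected(T, alpha, beta, s=1, x_prev=0.0):
    """Cumulative failures detected by interval T (Schneidewind form)."""
    return (alpha / beta) * (1.0 - math.exp(-beta * (T - s + 1))) + x_prev

def corrected(T, dT, alpha, beta, s=1, x_prev=0.0):
    """ASSUMPTION: corrections track detections delayed by dT intervals."""
    if T - dT < s - 1:
        return x_prev  # before the delayed curve starts, nothing new is corrected
    return detected(T - dT, alpha, beta, s, x_prev)

# Hypothetical parameters, as if copied from a reliability tool into a worksheet.
alpha, beta, dT = 5.0, 0.25, 2

# Detected-but-uncorrected faults per interval; a stopping rule could watch this.
backlog = [detected(T, alpha, beta) - corrected(T, dT, alpha, beta)
           for T in range(1, 11)]
```

Under these assumptions the backlog shrinks as detection slows, which is one way to see reliability growth that comes from correction rather than detection alone.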
SMERFS^3 – Excel Approach*
• Best approach: combine SMERFS^3 with Excel.
• The software reliability tool (SRT) provides model parameter estimation.
• Copy and paste parameters from the SRT into a spreadsheet.
• Excel extends the capabilities of the SRT by allowing the user to provide equations, statistical analysis, and plots.
* CASRE or another software reliability modeling tool may be used with Excel.
Recommended approach until the SRM tools incorporate this new model.
Non-parametric Reliability Modeling
• Hardware
  - Wears out over time
  - Increasing failure rate
• Software
  - Does not wear out over time
  - Decreasing failure rate
Non-parametric Reliability Modeling (Continued)
• Hardware Reliability Modeling
  - "Large" independent random sampling
  - Model reliability
  - Make predictions
• Software Reliability Modeling
  - "Small" observed dependent sample (of size one?)
  - Not based on independent random sampling
  - Model reliability
  - Make predictions?
Do we search for the silver bullet of SWR models?
Reliability Trending
• Hardware Reliability
  [Plot: reliability between 0% (minimum) and 100% (maximum) versus time]
• Software Reliability
  [Plot: reliability between 0% (minimum) and 100% (maximum) versus time]
Software Reliability Bounds
[Plot: estimated model and estimated bound, reliability between 0% (minimum) and 100% (maximum) versus time]
Calculation of Estimated Models and Bounds
• Dynamic Metrics
  - Failure rate data
  - Problem reports
• Static Code Metrics
  - Traditional
    - Source Lines of Code (SLOC)
    - Cyclomatic Complexity (CC)
    - Comment Percentage (CP)
  - Object-Oriented
    - Coupling Between Objects (CBO)
    - Depth of Inheritance Tree (DIT)
    - Weighted Methods per Class (WMC)
Combining Dynamic and Static Metrics
• The Proportional Hazards Model (PHM)
  $R(t \mid z) = \{R_0(t)\}^{g(z)}$
  - Non-Parametric Component (Static)
  - Parametric Component (Dynamic)
  - Where $z\beta = z_1\beta_1 + z_2\beta_2 + \dots + z_p\beta_p$, the $\beta_i$'s are unknown regression coefficients, and the $z_i$'s are static code metrics data
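The PHM formula above can be evaluated directly once a baseline reliability value and the coefficients are available. The slide defines the linear term zβ but does not state how g uses it; the sketch below assumes the usual Cox-style link g(z) = exp(zβ). That link, the function names, and all numeric values are assumptions for illustration.

```python
import math

def g(z, beta):
    """ASSUMED link function: g(z) = exp(z1*b1 + ... + zp*bp)."""
    return math.exp(sum(zi * bi for zi, bi in zip(z, beta)))

def reliability(r0_t, z, beta):
    """R(t|z) = R0(t) ** g(z), for a baseline reliability value R0(t) in [0, 1]."""
    return r0_t ** g(z, beta)

# Hypothetical coefficients and (scaled) static code metrics, e.g. complexity, coupling.
beta = [0.5, 0.2]
baseline = reliability(0.9, [0.0, 0.0], beta)  # z = 0 leaves R0(t) unchanged
riskier = reliability(0.9, [1.0, 1.0], beta)   # positive z*beta lowers reliability
```

With this link, metrics that push zβ above zero raise the exponent and pull the predicted reliability below the baseline; negative zβ does the opposite.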
Tool Schema
• Input Data: $z = (z_1, z_2, \dots, z_p)$
• Database
• Data Processing: $R(t \mid z) = \{R_0(t)\}^{g(z)}$, Weighted Average
• Output Data: Observed Data, Raw Data, Estimated Model, Estimated Bound
• Action
  - Process Below Bounds: Corrective Action
  - Process Above Bounds: No Corrective Action
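The action step of the schema reduces to a comparison of observed reliability against the estimated bound. A minimal sketch follows, assuming one observed value and one bound per reporting period; the function name, the treatment of an exact tie as "not below", and the numbers are all hypothetical.

```python
def bounds_action(observed, estimated_bound):
    """Map one observed reliability value to the schema's action."""
    if observed < estimated_bound:
        return "corrective action"
    return "no corrective action"

# Hypothetical observed reliabilities and estimated bounds, one pair per period.
observed = [0.62, 0.70, 0.81]
bounds = [0.60, 0.75, 0.80]
actions = [bounds_action(o, b) for o, b in zip(observed, bounds)]
```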
SUMMARY
• Software reliability modeling
  – Provides useful measurements for decisions
  – Does not require expert knowledge of the math!
  – Is relatively easy with the use of software tools
• Fault correction improvement
  – Adapts the model to be more like software
  – Demonstrates combined use of traditional SRM tools with spreadsheet technology
• Non-parametric modeling
  – New approach shows promise
  – Prototype to be expanded
AIAA Recommended Steps (specific to SRM)
• Characterizing the environment
• Determining test approach
• Selecting models
• Collecting data
• Estimating parameters
• Validating the models
• Performing analysis
Fault Correction Modeling
• Software reliability models focus on modeling and predicting failure occurrence
  – There has not been equal priority on modeling the fault correction process.
• Fault correction modeling and prediction support efforts to
  – predict whether reliability goals have been achieved
  – develop stopping rules for testing
  – formulate test strategies
  – rationally allocate test resources.
Equations: Prediction and Comparison Worksheets
• Time to Next Failure(s) Predicted at Time t
• Remaining Failures Predicted at Time t: $r(t) = (\alpha/\beta) - X_{s,t}$
• Cumulative Number of Failures Detected at Time T: $D(T) = (\alpha/\beta)\,[1 - \exp(-\beta(T - s + 1))] + X_{s-1}$
• Cumulative Number of Failures Detected Over Life of Software $T_L$: $D(T_L) = \alpha/\beta + X_{s-1}$
Equations developed by Dr. Norman Schneidewind, Naval Postgraduate School, Monterey, CA
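The three worksheet equations that carry a formula can be written as small functions, in the spirit of pasting estimated parameters into a spreadsheet. This sketch reads the ratio in the remaining-failures and lifetime equations as α/β, consistent with the D(T) equation. The function names and every numeric value are hypothetical; real α, β, and s would come from a reliability tool's parameter estimation.

```python
import math

def cumulative_detected(T, alpha, beta, s, x_prev):
    """D(T) = (alpha/beta) * [1 - exp(-beta * (T - s + 1))] + X_{s-1}."""
    return (alpha / beta) * (1.0 - math.exp(-beta * (T - s + 1))) + x_prev

def remaining_failures(alpha, beta, x_s_t):
    """r(t) = alpha/beta - X_{s,t}, with X_{s,t} the failures observed in [s, t]."""
    return alpha / beta - x_s_t

def lifetime_failures(alpha, beta, x_prev):
    """D(T_L) = alpha/beta + X_{s-1}: the limit of D(T) as T grows."""
    return alpha / beta + x_prev

# Hypothetical values: parameters from a tool, counts from the failure log.
alpha, beta, s = 5.0, 0.25, 3
x_prev = 5    # failures observed before interval s
x_s_t = 12    # failures observed from interval s through t

remaining = remaining_failures(alpha, beta, x_s_t)
lifetime = lifetime_failures(alpha, beta, x_prev)
detected_at_20 = cumulative_detected(20, alpha, beta, s, x_prev)
```

A quick consistency check on any parameter set is that D(T) starts at X_{s-1} when T = s - 1 and approaches D(T_L) from below as T increases.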