
  • Slides: 16
http://www.syslog.com/~jwilson/pics-i-like/kurios119.jpg
TEKNILLINEN KORKEAKOULU HELSINKI UNIVERSITY OF TECHNOLOGY olli.seppala@hut.fi

ball@mitre.ARPA <Dan Ball> Thu, 08 Jan 87 11:29:37 -0500

[…] I question whether assigning a monetary value to human life would provide additional insight into the management of risks. I am not convinced that we know how to predict risks, particularly unlikely ones, with any degree of confidence. I would hate to see a $500K engineering change traded off against a loss of 400 lives @ $1M with a 10E-9 expected probability. I'm afraid reducing the problem to dollars could tend to obscure the real issues. Moreover, even if the analyses were performed correctly, the results could be socially unacceptable. I suspect that in the case of a spacecraft, or even a military aircraft, the monetary value of the crew's lives would be insignificant in comparison with other program costs, even with a relatively high hazard probability. In the case of automobile recalls, where the sample size is much larger, the manufacturers may already be trading off the cost of a recall against the expected cost of resulting lawsuits, although I hope not.

http://catless.ncl.ac.uk/Risks/4.38.html#subj7
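The trade-off the message warns about can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of the quoted figures; the function name `expected_loss` is hypothetical, and because "10E-9" can be read either literally (10 × 10^-9) or as 10^-9, both readings are computed.

```python
def expected_loss(lives_at_risk: int, value_per_life: float, probability: float) -> float:
    """Expected monetary loss = lives x assumed value per life x probability."""
    return lives_at_risk * value_per_life * probability


# Figures quoted in the message: 400 lives at $1M each, a $500K engineering change.
change_cost = 500_000.0

# "10E-9" read literally (10 x 10^-9) and, alternatively, as 10^-9.
literal = expected_loss(400, 1_000_000.0, 10e-9)
alternative = expected_loss(400, 1_000_000.0, 1e-9)

# Under either reading the expected loss is a few dollars at most, so a purely
# monetary comparison would always reject the $500K change.
change_rejected = literal < change_cost and alternative < change_cost
```

Under either reading the computed expected loss is tiny compared with the cost of the change, which is exactly why the author fears that a dollar figure would obscure the real issue.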

Making it work

Failures
  • Catastrophic
    – Serious consequences
  • Major
    – Incorrect operation
    – Possibly recoverable
  • Minor
    – Inconvenience
  • Not noticed

Fault Tolerance Steps 1/3
  • Fault Detection
    – The process of determining that a fault has occurred
  • Diagnosis
    – The process of determining what caused the fault, or exactly which subsystem or component is faulty
  • Containment
    – The process that prevents the propagation of faults from their origin at one point in a system to a point where it can have an effect on the service to the user
Source: http://hissa.ncsl.nist.gov/chissa/SEI_Framework/framework_16.html
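The three steps above can be sketched in a few lines. This is a minimal illustration, not a method taken from the slides: `FaultReport`, `run_contained` and `read_pressure` are hypothetical names, and an exception stands in for whatever fault signal a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class FaultReport:
    subsystem: str  # diagnosis: which component is faulty
    cause: str      # diagnosis: what caused the fault


def run_contained(
    name: str, operation: Callable[[], float]
) -> Tuple[Optional[float], Optional[FaultReport]]:
    """Run one subsystem; detect, diagnose and contain any fault it raises."""
    try:
        return operation(), None
    except Exception as exc:  # detection: a fault has occurred
        # Containment: the exception is converted to a report here and does
        # not propagate further toward the user-facing service.
        return None, FaultReport(subsystem=name, cause=repr(exc))


def read_pressure() -> float:
    """Hypothetical subsystem that always fails, for demonstration."""
    raise RuntimeError("sensor bus timeout")


value, fault = run_contained("pressure_sensor", read_pressure)
```

The caller receives either a value or a report naming the faulty subsystem, so it can decide how to respond instead of crashing.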

Fault Tolerance Steps 2/3
  • Masking
    – The process of insuring that only correct values get passed to the system boundary in spite of a failed component.
  • Compensation
    – If a fault occurs and is confined to a subsystem, it may be necessary for the system to provide a response to compensate for output of the faulty subsystem.
Source: http://hissa.ncsl.nist.gov/chissa/SEI_Framework/framework_16.html
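One common way to mask a failed component is majority voting over redundant outputs. The slides do not name a technique, so the voting scheme and the safe-default compensation below are assumptions, and `majority_vote` and `vote_or_default` are hypothetical names.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Masking: return the value produced by a strict majority of units."""
    if not outputs:
        raise ValueError("no outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority among redundant outputs")
    return value


def vote_or_default(outputs: Sequence[Hashable], safe_default: Hashable) -> Hashable:
    """Compensation: fall back to a safe default when voting cannot decide."""
    try:
        return majority_vote(outputs)
    except ValueError:
        return safe_default


masked = majority_vote([42, 42, 17])                    # one faulty unit is outvoted
compensated = vote_or_default([1, 2, 3], safe_default=0)  # no majority, use default
```

With three units, a single wrong output never reaches the system boundary; when the units all disagree, the caller gets the agreed safe response instead.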

Fault Tolerance Steps 3/3
  • Repair
    – The process in which faults are removed from a system. In well-designed fault tolerant systems, faults are contained before they propagate to the extent that the delivery of system service is affected. This leaves a portion of the system unusable because of residual faults. If subsequent faults occur, the system may be unable to cope because of this loss of resources, unless these resources are reclaimed through a recovery process which insures that no faults remain in system resources or in the system state.
Source: http://hissa.ncsl.nist.gov/chissa/SEI_Framework/framework_16.html

Buzzwords
  • Fault Tolerance
  • Robust Computing
  • Fail-Safe
  • Intrinsically safe

Mechanisms
  • Defensive Design
    – Prevent faults in the first place
  • Fault tolerance/Robustness
    – Can operate in an imperfect situation
  • Fail-Safe
    – Limit the consequences of a failure

Redundancy
  • Design the system with multiple instances of critical units in such a manner that the failure of some of these units does not directly fail the entire system.
    – No single point of failure
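A simple form of redundancy is failover: try each redundant unit in turn and fail only when all of them have failed. The sketch below assumes each unit is exposed as a callable; `read_with_failover`, `broken_primary` and `working_backup` are hypothetical names.

```python
from typing import Callable, List, Sequence


def read_with_failover(units: Sequence[Callable[[], float]]) -> float:
    """Return the first successful reading from a list of redundant units."""
    errors: List[Exception] = []
    for unit in units:
        try:
            return unit()
        except Exception as exc:
            errors.append(exc)  # remember the failure and try the next unit
    # Only reached when every redundant unit has failed.
    raise RuntimeError(f"all {len(units)} redundant units failed: {errors}")


def broken_primary() -> float:
    raise RuntimeError("primary sensor offline")


def working_backup() -> float:
    return 21.5


reading = read_with_failover([broken_primary, working_backup])
```

The failure of the primary unit does not fail the read; the system as a whole fails only if no unit is left.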

Limits
  • When a range of values is physically possible, use a subset for safety
    – Soft
      • Indicator when recommended values are exceeded
    – Hard
      • For use when exceeding the limits would damage the system
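Soft and hard limits can be sketched as a two-threshold check. The threshold values and the choice to clamp at the hard limit (rather than reject the request) are assumptions for illustration; `LimitStatus`, `check_limits` and `apply_setpoint` are hypothetical names.

```python
from enum import Enum
from typing import Tuple


class LimitStatus(Enum):
    OK = "ok"
    SOFT_EXCEEDED = "soft"  # recommended range exceeded: indicate only
    HARD_EXCEEDED = "hard"  # damage possible: the value must not be applied


def check_limits(value: float, soft_max: float, hard_max: float) -> LimitStatus:
    if soft_max > hard_max:
        raise ValueError("soft limit must not exceed hard limit")
    if value > hard_max:
        return LimitStatus.HARD_EXCEEDED
    if value > soft_max:
        return LimitStatus.SOFT_EXCEEDED
    return LimitStatus.OK


def apply_setpoint(
    requested: float, soft_max: float = 80.0, hard_max: float = 100.0
) -> Tuple[float, LimitStatus]:
    """Return the value actually applied and the limit status to report."""
    status = check_limits(requested, soft_max, hard_max)
    if status is LimitStatus.HARD_EXCEEDED:
        return hard_max, status  # assumed policy: clamp to the hard limit
    return requested, status     # soft limit only raises an indicator
```

For example, `apply_setpoint(90.0)` passes the value through with a soft warning, while `apply_setpoint(120.0)` is clamped to the hard limit.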

Interlocks
  • Mechanical
    – One part cannot move until another does
  • Software
    – Semaphores
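A software interlock built on a semaphore might look like the sketch below, using Python's standard `threading.BoundedSemaphore`. The `Interlock` class and its method names are hypothetical; the point is only that a second party cannot enter until the first has left.

```python
import threading


class Interlock:
    """Allow only one holder at a time, like a mechanical interlock."""

    def __init__(self) -> None:
        # A bounded semaphore with one permit; releasing more often than
        # acquiring raises ValueError instead of silently adding permits.
        self._permit = threading.BoundedSemaphore(1)

    def try_enter(self) -> bool:
        """Non-blocking attempt; returns False if someone else holds it."""
        return self._permit.acquire(blocking=False)

    def leave(self) -> None:
        self._permit.release()


door = Interlock()
first = door.try_enter()    # the first caller gets the permit
second = door.try_enter()   # refused while the first caller holds it
door.leave()
third = door.try_enter()    # allowed again after release
```

The non-blocking `try_enter` lets the caller react to a refused request (for example by reporting it) rather than waiting indefinitely.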

Sanity checks
  • A mechanism for the system to ensure correct operation
  • Related to Interlocks and Limits
  • 'Does this make sense here?'
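A sanity check asks whether a value makes sense in context before it is used. The sketch below checks a hypothetical temperature reading against a physical range and against the previous reading; the bounds and the name `is_plausible` are assumptions chosen for illustration.

```python
from typing import Optional


def is_plausible(
    reading: float,
    previous: Optional[float],
    low: float = -40.0,
    high: float = 125.0,
    max_step: float = 5.0,
) -> bool:
    """Return True if the reading makes sense given range and history."""
    # Range check; NaN fails both comparisons and is therefore rejected.
    if not (low <= reading <= high):
        return False
    # Rate-of-change check: a physical quantity rarely jumps between samples.
    if previous is not None and abs(reading - previous) > max_step:
        return False
    return True
```

A reading of 22.0 after 21.0 passes, while 60.0 after 21.0 is rejected even though it lies inside the permitted range.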

Safe start-up and shutdown
  • When electronic devices are activated, they are by nature in a random state until forced into a desired state
  • Be proactive and make sure instead of just assuming things to be as needed
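In software, the same advice translates to forcing a known state on entry and on exit, whatever happens in between. The sketch below uses a context manager for this; `Actuator` and `safe_session` are hypothetical names, and `None` stands in for the unknown power-on state.

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class Actuator:
    def __init__(self) -> None:
        # State is unknown until it is explicitly forced.
        self.enabled: Optional[bool] = None
        self.output: Optional[float] = None

    def force_safe_state(self) -> None:
        self.enabled = False
        self.output = 0.0


@contextmanager
def safe_session(actuator: Actuator) -> Iterator[Actuator]:
    actuator.force_safe_state()      # start-up: do not assume, make sure
    try:
        yield actuator
    finally:
        actuator.force_safe_state()  # shutdown: runs even after an error


device = Actuator()
try:
    with safe_session(device) as dev:
        dev.enabled = True
        dev.output = 7.5
        raise RuntimeError("simulated failure during operation")
except RuntimeError:
    pass
# At this point the device is back in its safe state despite the failure.
```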

Calibration
  • Factory calibration is useful only for a limited time
  • Instruments drift due to:
    – temperature
    – loading
    – pressure
    – age
  • Self calibration
    – useful in a controlled fashion
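One way to keep self calibration "controlled" is to compute a correction from known references and accept it only when it stays within expected bounds. The sketch below assumes a linear instrument and a two-point calibration; the function names and tolerance values are hypothetical.

```python
from typing import Tuple


def two_point_calibration(
    raw_low: float, raw_high: float, ref_low: float, ref_high: float
) -> Tuple[float, float]:
    """Return (gain, offset) so that gain * raw + offset matches the references."""
    if raw_high == raw_low:
        raise ValueError("calibration points must differ")
    gain = (ref_high - ref_low) / (raw_high - raw_low)
    offset = ref_low - gain * raw_low
    return gain, offset


def is_correction_reasonable(
    gain: float, offset: float, max_gain_error: float = 0.05, max_offset: float = 2.0
) -> bool:
    """Controlled self calibration: reject corrections larger than plausible drift."""
    return abs(gain - 1.0) <= max_gain_error and abs(offset) <= max_offset


# Instrument reads 1.0 and 99.0 where the references are 0.0 and 100.0.
gain, offset = two_point_calibration(1.0, 99.0, 0.0, 100.0)
accepted = is_correction_reasonable(gain, offset)
```

A correction far outside the expected drift more likely indicates a faulty instrument or reference than genuine drift, so it is safer to flag it than to apply it silently.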

Testing
  • You cannot test enough
  • You can test too much
  • You can test wrong
  • You can think wrong
  • But you must test