Exploring Research Challenges in Security Cyber Physical Systems
Exploring Research Challenges in Security: Cyber Physical Systems Focus
Krishnaprasad Thirunarayan (T. K. Prasad)
Professor, Department of Computer Science and Engineering
Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, OH-45435
[Some slides from Insup Lee (CPS course), M. Farag’s dissertation, and papers by Neumann, Lee, Northcutt, etc.]
12/17/2021 Trust Networks: T. K. Prasad
Broad Outline
• Cyber Physical Systems: Trends and Characteristics
• Applications and Grand Vision
• Interaction Complexity Illustrated
• Issues and Approaches to improve CPS Security
– Static, Dynamic, Reconfiguration, Monitoring nonfunctional aspects
• Further Examples of CPS Attacks and Detection Techniques in Transportation
Example Embedded Systems
Automobiles, Entertainment, Medical, Airplanes, Handheld, Military
Spring ‘10 CIS 541
The Next Computing Revolution
• Mainframe computing (60’s-70’s)
– Large computers to execute big data processing applications
• Desktop computing & Internet (80’s-90’s)
– One computer at every desk to do business/personal activities
• Ubiquitous computing (00’s)
– Numerous computing devices in every place/person
– “Invisible” part of the environment
– Millions for desktops vs. billions for embedded processors
• Cyber Physical Systems (10’s)
Definition
• A Cyber Physical System integrates computing, communication, and storage capabilities with the monitoring and/or control of entities in the physical world, and does so dependably, safely, securely, efficiently, and in real time.
Characteristics of CPS
• Interact with the physical environment and devices (inputs and feedback).
• Uncertainty in (sensor) readings.
• Varying device trust levels.
• Real-time (actuator) performance requirements.
• Geographically distributed, with components in locations that lack physical security.
• Multi-scale and systems-of-systems control characteristics: local vs global.
Confluence of Trends: The Overarching Challenge
Trend 1: Device/Data Proliferation (by Moore’s Law)
Trend 2: Integration at Scale (Isolation has cost)
Trend 3: Autonomy (Humans are not getting faster)
PROBLEM: The exponential proliferation of embedded devices (afforded by Moore’s Law) is not matched by a corresponding increase in human ability to consume information or control the system!
NEED: Reliable Distributed Cyber-Physical Information Distillation and Control Systems (of Embedded Devices) [TA]
Example: Automotive Telematics
§ In 2005, 30-90 processors per car
o Engine control, brake system, airbag deployment system
o Windshield wiper, door locks, entertainment systems
o Example: BMW 745i: 2,000 LOC; Windows CE OS; over 60 microprocessors (53 8-bit, 11 32-bit, 7 16-bit); multiple networks; buggy?
§ Cars are sensors and actuators in V2V networks
o Active networked safety alerts
o Autonomous navigation
o …
Application Domains of Cyber-Physical Systems
[Figure: CPS at the center of Healthcare, Transportation, Finance, …]
§ Healthcare
o Medical devices
o Health management networks
§ Transportation
o Automotive electronics
o Vehicular networks and smart highways
o Aviation and airspace management
o Avionics
o Railroad systems
§ Process control (Chemical Plants/Pharmaceuticals)
§ Large-scale Infrastructure
o Physical infrastructure (e.g., bridges, dams, locks) monitoring and control
o Electricity generation and distribution
o Building and environmental controls
§ Defense systems
§ Tele-physical operations
o Telemedicine
o Tele-manipulation
In general, any “X by wire(less)” where X is anything that is physical in nature.
Grand Visions and Societal Impact
§ Near-zero automotive traffic fatalities, minimized injuries, and significantly reduced traffic congestion and delays
§ Blackout-free electricity generation and distribution
§ Perpetual life assistants for busy, older or disabled people
§ Extreme-yield agriculture
§ Energy-aware buildings
§ Location-independent access to world-class medicine
§ Physical critical infrastructure that calls for preventive maintenance
§ Self-correcting and self-certifying cyber-physical systems for “one-off” applications
§ Reduce testing and integration time and costs of complex CPS systems (e.g. avionics) by orders of magnitude
Societal Challenge
§ How can we provide people and society with cyber-physical systems that they can trust with their lives?
Trustworthy: reliable, secure, privacy-preserving, usable, etc.
§ Partial list of complex system failures
o Denver baggage handling system ($300 M)
o Power blackout in NY (2003)
o Ariane 5 (1996)
o Mars Pathfinder (1997)
o Mars Climate Orbiter ($125 M, 1999)
o The Patriot Missile (1991)
o USS Yorktown (1998)
o Therac-25 (1985-1988)
o London Ambulance System (£9 M, 1992)
o Pacemakers (500 K recalls during 1990-2000)
o Numerous computer-related incidents with commercial aircraft (http://www.rvs.uni-bielefeld.de/publications/compendium/incidents_and_accidents/index.html)
Interaction Complexity
§ We know how to design and build components.
§ Systems are about the interactions of components.
o Some interactions are unintended and unanticipated: interoperability issues and emerging behaviors
§ Emerging behavior: unanticipated or difficult-to-predict behavior due to interconnection and interaction between a large number of small components
Interaction Complexity
§ “Normal Accidents”, an influential book by Charles Perrow (1984)
o One of the Three Mile Island investigators
o And a member of the recent NRC study “Software for Dependable Systems: Sufficient Evidence?”
o A sociologist, not a computer scientist
§ Posits that sufficiently complex systems can produce accidents without a simple cause, due to
o interaction complexity and tight coupling
Unexpected interactions: MARS Pathfinder
§ Landed on the Martian surface on July 4th, 1997
§ Unconventional landing: bouncing onto the Martian surface
§ A few days later, after Pathfinder started gathering meteorological data, the spacecraft began experiencing total system resets, each resulting in data loss.
Incompatible Cross Domain Protocols: a pathological interaction between RT and synchronization protocols caused Pathfinder to repeatedly reset, nearly dooming the mission [Sha]
The Priority Inversion Problem
Priority order: T1 > T2 > T3
[Timeline figure: T3 executes lock(R) … unlock(R); T1’s attempt to lock R fails; T2 runs while T3 still holds R; T1 executes lock(R) … unlock(R) only afterwards.]
T2 is causing a higher-priority task T1 to wait!
Priority Inversion
1. T1 has the highest priority, T2 the next, and T3 the lowest.
2. T3 arrives first, starts executing, and acquires some resource (say, a lock).
3. T1 arrives next and preempts T3, as T1 has higher priority.
4. But T1 needs the resource locked by T3, so T1 gets blocked.
5. T3 resumes execution (this scenario is still acceptable so far).
6. T2 arrives and preempts T3, as T2 has higher priority than T3, and T2 executes till completion.
7. In effect, even though T1 has higher priority than T2 and arrived earlier than T2, T2 delayed the execution of T1.
8. This is “priority inversion”!! Not acceptable.
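The eight steps above can be replayed with a small discrete-time simulation. This is only a sketch under stated assumptions: the arrival times and work amounts are invented for illustration, each lock-using task is assumed to hold R for its whole job, and a failed lock attempt costs no time. The names `Task` and `simulate` are hypothetical and not from the slides.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    priority: int      # larger number = higher priority
    arrival: int       # tick at which the task becomes ready
    work: int          # ticks of CPU time still needed
    needs_lock: bool   # assumption: the whole job runs between lock(R) and unlock(R)
    done_at: int = -1

def simulate(tasks):
    """Fixed-priority preemptive scheduling with one lock R and NO inheritance."""
    owner = None       # task currently holding R
    timeline = []
    tick = 0
    while any(t.work > 0 for t in tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.work > 0]
        # A task that needs R is blocked while some other task owns R.
        runnable = [t for t in ready
                    if not (t.needs_lock and owner is not None and owner is not t)]
        if not runnable:
            timeline.append("-")
            tick += 1
            continue
        current = max(runnable, key=lambda t: t.priority)
        if current.needs_lock and owner is None:
            owner = current            # lock(R) succeeds
        current.work -= 1
        timeline.append(current.name)
        tick += 1
        if current.work == 0:
            current.done_at = tick
            if owner is current:
                owner = None           # unlock(R)
    return timeline

t1 = Task("T1", priority=3, arrival=1, work=2, needs_lock=True)
t2 = Task("T2", priority=2, arrival=2, work=4, needs_lock=False)
t3 = Task("T3", priority=1, arrival=0, work=3, needs_lock=True)
timeline = simulate([t1, t2, t3])
print(timeline)
print("T2 finished at", t2.done_at, "and T1 finished at", t1.done_at)
```

In this run the medium-priority T2 completes before the high-priority T1, even though T1 arrived first, which is the inversion described in step 7.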
Priority Inversion and the MARS Pathfinder
§ What happened:
o Pathfinder had an “information bus” thread [very critical – used by navigation, etc. – high priority].
o The meteorological data gathering thread ran as an infrequent, low-priority thread and used the information bus to publish its data (holding the mutex on the bus while doing so).
o A communication task ran with medium priority.
o An interrupt could cause the (medium-priority) communication task to be scheduled during the short interval in which the (high-priority) information bus thread was blocked waiting for the (low-priority) meteorological data thread.
o After some time, a watchdog timer went off; noticing that the information bus thread had not executed for some time, it concluded that something had gone seriously wrong and initiated a total system reset.
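The watchdog behaviour in the last bullet can be sketched as follows. This is an illustrative model, not the Pathfinder flight software: the `Watchdog` class, the three-tick timeout, and the tick at which the bus thread stops running are all assumptions chosen to keep the example small.

```python
class Watchdog:
    """Declares a fault if the monitored thread has not checked in recently."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_kick = 0

    def kick(self, now):
        # Called by the monitored (information bus) thread each time it runs.
        self.last_kick = now

    def expired(self, now):
        return now - self.last_kick > self.timeout

wd = Watchdog(timeout=3)
bus_runs_at = {0, 1, 2}     # assumption: after tick 2 the bus thread is blocked
resets = []
for tick in range(8):
    if tick in bus_runs_at:
        wd.kick(tick)
    elif wd.expired(tick):
        resets.append(tick)  # total system reset
        wd.kick(tick)        # the timer is re-armed after the restart
print(resets)
```

The watchdog cannot tell why the bus thread stopped running; it only observes the missed deadline, which is why a scheduling problem showed up as a full reset.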
Priority Inheritance Protocol
[Timeline figure: T3 executes lock(R); T1’s lock of R fails, so T3 directly blocks T1 and T3 runs with the priority of T1; T2 arrives but T3 blocks T2 until unlock(R); T1 then executes lock(R) … unlock(R).]
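The core bookkeeping of priority inheritance can be sketched as a mutex that boosts its owner to the priority of the highest-priority waiter. The sketch below assumes a single lock and an external scheduler that always runs the runnable task with the highest effective priority; `Task` and `InheritanceMutex` are hypothetical names, and restoring the base priority on unlock is only correct in this single-lock setting.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority
        self.priority = priority        # effective priority seen by the scheduler

class InheritanceMutex:
    def __init__(self):
        self.owner = None
        self.waiters = []

    def try_lock(self, task):
        if self.owner is None:
            self.owner = task
            return True
        # Lock is taken: the caller blocks and the owner inherits its priority.
        self.waiters.append(task)
        if task.priority > self.owner.priority:
            self.owner.priority = task.priority
        return False

    def unlock(self, task):
        assert self.owner is task, "only the owner may unlock"
        task.priority = task.base_priority   # drop the inherited priority
        if self.waiters:
            nxt = max(self.waiters, key=lambda t: t.priority)
            self.waiters.remove(nxt)
            self.owner = nxt                 # hand the lock to the top waiter
        else:
            self.owner = None

t1, t2, t3 = Task("T1", 3), Task("T2", 2), Task("T3", 1)
r = InheritanceMutex()
r.try_lock(t3)                       # T3 acquires R
blocked = not r.try_lock(t1)         # T1 fails to lock R; T3 inherits priority 3
boosted = t3.priority
runs_next = max([t2, t3], key=lambda t: t.priority).name   # T1 is blocked
r.unlock(t3)
print(blocked, boosted, runs_next, t3.priority, r.owner.name)
```

With the boost in place, T2 can no longer preempt T3 while T3 holds R, so T1 waits only for the length of T3’s critical section.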
A (Research) Vision
§ To provide CPS application engineers with lightweight “push-button” tools, each checking a specific application-specific property [Wing].
Check Deadlock, Check Race, Check Schedulability, Check Power usage, Check Memory usage, Check Privacy
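As one illustration of what a “Check Schedulability” button might compute, the sketch below applies the classical Liu and Layland utilization bound for rate-monotonic scheduling, U ≤ n(2^(1/n) − 1). The slide does not name a particular test, so the choice of this bound, the function names, and the task sets are assumptions; the test is sufficient but not necessary, so a `False` result means “not proven schedulable by this test”.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound for n periodic tasks under rate-monotonic priorities."""
    return n * (2 ** (1.0 / n) - 1)

def check_schedulability(tasks):
    """tasks: list of (worst_case_execution_time, period) pairs.

    Returns (passes, utilization). Sufficient test only.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_utilization_bound(len(tasks)), utilization

light_ok, light_u = check_schedulability([(1, 4), (1, 5), (2, 10)])
heavy_ok, heavy_u = check_schedulability([(2, 4), (2, 5), (2, 10)])
print(light_ok, round(light_u, 2))
print(heavy_ok, round(heavy_u, 2))
```

The first task set has utilization 0.65, below the three-task bound of roughly 0.78, so it passes; the second has utilization 1.1 and fails.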
Sources of difficulties: Theory and Practice
§ Unsound compositionality (Theory)
o incompatible abstractions, incorrect or implicit assumptions in system interfaces.
o incompatible real-time, fault tolerance, and security protocols.
o combinations of components do not preserve functional and para-functional properties; unexpected feature interactions.
§ Inadequate development infrastructure (Development)
o the lack of domain-specific reference architectures, tools, and design patterns with known and parameterized real-time, robustness, and security properties.
§ System instabilities (Nature of Complexity)
o faults and failures in one component cascade along complex and unexpected dependency graphs, resulting in catastrophic failures in a large part of a system or even the entire system.
Improving Trustworthiness of CPS
Simplified CPS Architecture
Source: Farag dissertation
Design Trend in CPS
To increase development productivity and reduce design costs, embedded controllers are often assembled from commercial off-the-shelf (COTS) components and third-party intellectual property (IP) modules.
Example: Aircraft Systems
• Over 95% of the components used in aircraft systems are COTS based.
• COTS devices include integrated circuits from commercial manufacturers (e.g., Intel, AMD, LSI, TI).
• Aircraft avionics manufacturers are responsible for traceability to ensure that counterfeit parts are detected and rejected.
• Aircraft systems have built-in test software, real-time monitors, and voting to detect and isolate failed ICs caused by degradation.
Vulnerability/Threats to CPS
• Vulnerability: Traditional weaknesses in hardware and software (including bugs and planted trojans/backdoors).
• Threats: Exploitation of the physical aspects: a malicious attacker can cause a coordinated series of physical actions that leads the system to respond in an unexpected manner.
Vulnerability/Threats to Code
• Example: It is possible to automatically conceal trigger-based malicious behavior of existing malware from any static or dynamic input-oblivious analyzer by an automatically applicable obfuscation scheme based on static analysis. (Sharif et al., 2008)
Original and Obfuscated Code
Generalization
Vulnerability/Threats to CPS
§ Uncertainty in the environment.
§ Human errors in operation.
§ Errors in physical devices.
§ Security attacks.
§ MOTIVATES: Ensuring CPS safety against realistic models of environment uncertainties, possible operator mistakes, imperfections in physical and/or cyber components, and malicious security attacks.
Practical Reliability Requirements
• Agility: Failure to react in a timely manner can result in cascading failures and, possibly, permanent damage to equipment.
• Automated fault detection and recovery.
• Performance isolation and prioritization: Overload of the system due to one function should not impact the system's availability or its capacity to meet a time-critical function.
Practical Reliability Requirements
• Resilience: Geographical dispersion of CPS components makes it difficult to physically reset a compromised device or reload its software. Security solutions should focus on resilience, rather than solely on preventing compromise.
• Robust Design: Functional and security aspects should be considered together when designing the system. Security failures can result from:
– Inadequate specification
– Incorrect implementation
– Emergent properties
Examples of Design Approaches to obtain Robustness at Higher-levels
• Reconfiguration:
– In digital circuits, increasing the yield of components that can fail (say, 20% of the time) is easier than creating highly reliable (~100%) components. So designs that can monitor and reconfigure around failed components are valuable.
• Selective Strategies:
– Count on reliable gates, because it is (still) technically feasible to make them predictable and reliable.
– It is harder to make wireless links predictable and reliable. So we compensate one level up, using robust coding and adaptive protocols.
Ref: Edward Lee
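A short worked calculation shows why reconfiguring around failed components pays off. It is a sketch under an assumption the slide does not state: component failures are independent, so the number of working components is binomially distributed. The function name and the choice of four required units are illustrative.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability that at least k of n independent components work,
    where each component works with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Each component fails 20% of the time, so p = 0.8; the design needs 4 working units.
no_spares = prob_at_least(4, 4, 0.8)    # all four must work
two_spares = prob_at_least(4, 6, 0.8)   # any four of six, after reconfiguration
print(round(no_spares, 5), round(two_spares, 5))
```

Under these assumptions, adding two spares and the ability to route around failures raises the chance of a working design from about 41% to about 90% without improving any individual component.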
Example Abstractions
Examples of Design Approaches to obtain Robustness at Higher-levels
These abstractions are inadequate for CPS because they do not capture concurrency and timing behavior appropriately.
Ref: Edward Lee
Downside of Failure of Abstraction
• Safety-critical embedded systems, such as avionics control systems for passenger aircraft, are forced into an “encased box” mentality.
• To assure a 50-year production cycle for a fly-by-wire aircraft, an aircraft manufacturer is forced to purchase, all at once, a 50-year supply of the microprocessors that will run the embedded software.
• The systems will be unable to benefit from the next 50 years of technology improvements without redoing the (expensive) validation and certification of the software.
Ref: Edward Lee
Robustness: Static vs Dynamic
§ Static: Design-time techniques to verify that a computing system is flaw-free pre-deployment.
§ Dynamic: Run-time techniques that employ trusted components to provide assurances about certain aspects of system behavior.
o Detect before the attack and prevent
o Detect after the attack and repair
§ Self-healing: There is also a need for proactive protections to be built in during design.
(cont’d)
§ Static/PROACTIVE: Authorization, Access Control, Accountability (logs and audit trails), Encryption of communication, Redundancy/Diversity, etc.
§ Dynamic/REACTIVE: Intrusion detection, etc.
§ Design and Analysis: Adversary models, Trustworthiness, etc.
Robustness: Reality
• Hardware verification is usually more manageable computationally than software verification.
• Software modification of hardware structure is analogous to self-surgery, and independent hardware should provide oversight rather than rely solely on the correctness and integrity of application software and circuits.
Hardware Trojan Horses
• HTH effects include subtle changes to the chip functionality, performance degradation, information leakage, and denial of service (DoS).
• HTH detection approaches include physical analysis, logic testing, Trojan activation, and side-channel analysis.
• Side-channel parameters include current consumption, power trace, and path delay.
Approaches to improve trustworthiness
• CORRECTNESS: Verifying that the design satisfies its functional specification.
• EXPLOITING CORRELATION: Monitor non-functional side effects of the required activities (side-channel analysis) and determine whether they fall within an allowable range.
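The range check in the second bullet can be sketched as follows. The slide does not say how the allowable range is obtained; this sketch assumes it is derived from trusted “golden” measurements as mean ± k standard deviations, and the current readings (in mA), the value of `k`, and the function names are all hypothetical.

```python
from statistics import mean, stdev

def allowed_range(golden_samples, k=3.0):
    """Allowable band learned from measurements of a trusted device."""
    m = mean(golden_samples)
    s = stdev(golden_samples)
    return m - k * s, m + k * s

def out_of_range(samples, low, high):
    """Indices of observations that fall outside the allowable band."""
    return [i for i, x in enumerate(samples) if not low <= x <= high]

golden_current_ma = [100, 102, 98, 101, 99]     # hypothetical trusted readings
low, high = allowed_range(golden_current_ma)

observed_current_ma = [100, 103, 111, 99]       # hypothetical field readings
suspicious = out_of_range(observed_current_ma, low, high)
print(suspicious)
```

A reading outside the band does not identify the cause; it only signals that the non-functional behavior no longer matches what the required activity should produce.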
Approaches to improve trustworthiness
• RUNTIME MONITORING: Detecting anomalous activity or constraint violations.
– Use prediction logic that runs faster than real time in order to predict the future state of the physical system.
– However, Run-time Enhancement of Trusted Computing (RETC) can only protect systems and components that have clearly stated security policies.
Ref: Farag Dissertation
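The idea of prediction logic that runs ahead of the physical system can be illustrated with a toy model. This is not the RETC design from the dissertation: the tank-level plant model, the numeric values, and the function name are assumptions, chosen only to show a monitor stepping a model forward and flagging a constraint violation before it occurs.

```python
def first_violation(level, inflow, outflow, limit, horizon):
    """Step a simple tank model forward and return the first future step
    at which the level exceeds the limit, or None within the horizon."""
    for step in range(1, horizon + 1):
        level += inflow - outflow      # one simulated time step
        if level > limit:
            return step
    return None

# Commanded valve settings give a net inflow of 5 units per step.
warning = first_violation(level=70, inflow=8, outflow=3, limit=100, horizon=10)
# Balanced flows never violate the constraint within the horizon.
safe = first_violation(level=70, inflow=3, outflow=3, limit=100, horizon=10)
print(warning, safe)
```

Because the loop takes far less time than the physical process it models, the monitor learns about the overflow seven steps early and can reject or override the command.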
Malicious Security Attacks in CPS
Example Attacks in Transportation: Vulnerabilities and Threats
• Malicious signals sent via the OnStar telecommunications network to remotely control a vehicle (Koscher et al., 2010).
– Enable adversarial control of automotive functions while ignoring driver input, such as disabling the brakes, selectively braking wheels, and stopping the engine.
– Due to inadequacies in the security protocols in vehicles.
• A Near-Field Communication (NFC) chip can be used near an NFC-enabled smartphone to open a webpage on the phone and install a remote access program.
Ref: Northcutt 2013
Cyber Security vs CPS Security
Cyber Security:
• Fraudulent emails or file alterations. (potentially reversible with backups)
• Hacking a financial transaction or stealing credit card numbers. (temporary economic instability)
CPS Security:
• Accelerating a car, or malicious release of excessive water/sewage. (irreversible damage possible)
• Overheating a nuclear power plant and release of radioactive material. (severe physical and financial impact)
Ref: Northcutt 2013
Orthogonal Classification and Remedy
• Deception (or lack of integrity): Prevent, detect, or survive false information sent and received by sensors/controllers/actuators.
• Denial of Service (or lack of availability): Timely prevention or survival.
• Privacy violation (or breach of confidentiality): Hide system state and communication data.
Ref: Cardenas et al 2008
Assorted techniques to improve security
• Formalize design constraints on the system and detect compromise as a violation of these constraints.
• Compare the behavior of application replicas executed on different operating systems to identify malicious behavior due to attacks.
– Stuxnet malware targets only Siemens PLCs for Uranium treatment and is propagated only via Windows PCs.
– Use virtualization and (abstract) trace comparisons to detect attacks.
• Hurdle: Behavior equivalence in the presence of concurrency and timing issues.
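The replica-comparison idea can be sketched as mapping OS-specific events onto OS-neutral abstract actions and flagging any replica whose abstract trace differs from the majority. Everything here is illustrative: the event-to-action table, the three replicas, and their traces are made up, and exact sequence equality is a simplification that sidesteps the concurrency and timing hurdle noted in the last bullet.

```python
from collections import Counter

# Hypothetical mapping from OS-specific event names to abstract actions.
ABSTRACT_ACTION = {
    "CreateFileW": "open", "openat": "open",
    "WriteFile": "write", "write": "write",
    "WSAConnect": "net_connect", "connect": "net_connect",
}

def abstract(trace):
    """Translate a raw event trace; unknown events are kept as-is."""
    return tuple(ABSTRACT_ACTION.get(event, event) for event in trace)

def divergent(replicas):
    """Names of replicas whose abstract trace differs from the majority trace."""
    abstracted = {name: abstract(trace) for name, trace in replicas.items()}
    majority, _ = Counter(abstracted.values()).most_common(1)[0]
    return sorted(name for name, t in abstracted.items() if t != majority)

replicas = {
    "windows": ["CreateFileW", "WriteFile", "WSAConnect"],  # extra network call
    "linux": ["openat", "write"],
    "bsd": ["openat", "write"],
}
flagged = divergent(replicas)
print(flagged)
```

An attack that works on only one platform, as in the Stuxnet example, shows up as behavior that the other replicas do not reproduce.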
Assorted techniques to improve security
• Anti-tamper techniques are being developed to detect and prevent modification of the original functionality by deception.
• Self-destructive hardware/software techniques should be developed to prevent an enemy from reverse engineering IP in case an artifact is lost or stolen.
– The US destroyed a Black Hawk helicopter during the Osama bin Laden raid to avoid compromising sensitive technology.
• Hurdle: Detection of control by an adversary, and autonomous and effective destruction.