Hardware & Software Support for Mixed-Criticality Multicore Systems
Glenn Farrall, Infineon Technologies; Claus Stellwag, Elektrobit Automotive; Jonas Diemer, TU Braunschweig
RECOMP is made possible by funding from the ARTEMIS Joint Undertaking.

Agenda
  • TriCore® Introduction
  • AURIX™ Devices
  • Support for Spatial & Temporal Isolation
  • Interrupt Reliability
  • Example Core to Core Communication
2013-03-22 WICERT 2013 Presentation

TriCore Introduction
  • Most widely distributed microcontroller you've probably never heard of
  • In approximately 50% of all automobiles produced this year
  • 32-bit architecture with a focus on hard real time
  • 16/32-bit instruction length, register operations, support for C and DSP native data types and operations
  • Application areas:
      • Automotive powertrain
      • Stability control systems
      • eVehicle: charging, BMS etc.
      • Industrial control

TriCore Based Products: The marketing engineering view

AURIX MultiCore Devices

Spatial Isolation: MPUs
  • Definitions: MPU = Memory Protection Unit; MMU = Memory Management Unit
  • An MPU provides functionality to check that memory and I/O accesses are allowed. At minimum it has:
      • some description of the address region covered (explicit or implicit)
      • some combination of fetch, read and write permissions
      • a trap (exception) mechanism for when an access is not permitted
  • Without additional storage in the system there is little utility for an MMU in embedded systems.
  • Figure: each access range n is defined by an upper bound n, a lower bound n and rights n. In the example, Access Range 0 covers 0x0000 to 0x1FFF (rd, wr, ex) and Access Range 1 covers 0x2000 to 0x3FFF (rd, ex). The CPU's load LD D0, 0x01000 reads the data at 0x01000 in Access Range 0 and is permitted (√); its store ST D0, 0x03000 targets Access Range 1 and is not permitted (Χ), so it traps.
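
The range check drawn on this slide can be sketched in C as follows. This is a minimal software model for illustration, not the TriCore hardware interface: the type and function names (mpu_range_t, mpu_check, PERM_*) are hypothetical, and treating both bounds as inclusive is an assumption taken from how the figure is labelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Permission bits for one access range (illustrative names). */
enum { PERM_READ = 1u << 0, PERM_WRITE = 1u << 1, PERM_EXEC = 1u << 2 };

typedef struct {
    uint32_t lower;  /* first address covered (inclusive, assumption) */
    uint32_t upper;  /* last address covered (inclusive, assumption)  */
    unsigned perms;  /* OR of PERM_* bits                             */
} mpu_range_t;

/* The two access ranges shown in the figure. */
static const mpu_range_t slide_ranges[] = {
    { 0x0000u, 0x1FFFu, PERM_READ | PERM_WRITE | PERM_EXEC },  /* range 0 */
    { 0x2000u, 0x3FFFu, PERM_READ | PERM_EXEC },               /* range 1 */
};

/* Returns true if some range grants the requested access.
 * A false result models the protection trap. */
static bool mpu_check(const mpu_range_t *ranges, size_t n,
                      uint32_t addr, unsigned access)
{
    assert(ranges != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (addr >= ranges[i].lower && addr <= ranges[i].upper &&
            (ranges[i].perms & access) == access) {
            return true;
        }
    }
    return false;  /* no range grants this access */
}
```

With these ranges, mpu_check(slide_ranges, 2, 0x1000, PERM_READ) models the permitted load, and mpu_check(slide_ranges, 2, 0x3000, PERM_WRITE) models the store that traps. An address outside every range is rejected as well.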

Spatial Isolation: Risks
An MPU in each core allows memory regions to be allocated safely, with two caveats:
  1. The MPU (in a core) doesn't protect memory from other bus masters if they are misconfigured. Mixed-criticality software must presume one core is running "unsafe" software.
  2. The MPU doesn't protect memory from soft error events (or hard errors which occur during runtime) after an address has been checked by the MPU.

Risks Remaining with MPU
Figure labels: protected memory for CPU 0; configured MPU; soft error; misconfigured MPU (QM code).

Temporal Isolation & WCET
  • For multicore systems, determining a WCET (worst-case execution time) estimate can be problematic.
  • Co-running applications cause interference that lengthens the run time, and they may not be known when timing budgets are set.
  • To make matters worse, determining the highest-interfering co-application(s) is not easy either.
  • A system which prevents (or avoids) high interference, i.e. provides temporal isolation, means that pessimism in the WCET estimate can be much smaller.

Interference Controllable System
  • The AURIX family has support for controlling timing interference.
  • The main interconnect is a crossbar, with no contention for disjoint resource access.
  • Low interference is achieved with specific usage decisions:
      • allocation of resources to specific tasks and cores
      • enforcement of that allocation by MPU (and access gate) configuration
  • Remaining temporal interference is just due to the peripheral bus.
  • Interference between applications is now comparable to DMA and controllable by arbitration priority decisions.

Interrupt Infrastructure
  • The Interrupt Router (IRU) directs trigger events:
      • software
      • interrupt pin signal
      • peripheral state events, e.g. timer count down or comm data arrival
  • The IRU is highly configurable per trigger event:
      • select service provider (CPUx or DMA)
      • select priority (ISR priority on CPUx, or DMA channel)
  • Soft (and hard) errors:
      • could corrupt stored state in the IRU
      • could change values transferred to the service provider

Protecting Interrupt Integrity
  • There is ECC on the protocol between the IRU and the service providers to check that state is correct when it is used:
      • ensures the correct service provider is signaled
      • ensures the correct priority/DMA channel is received
      • ensures the trigger event was correctly enabled
  • An ISR executing on a CPU can be sure it has been initiated by the correct trigger source; the only remaining check required in SW is on interrupt rate:
      • too many interrupts: babbling idiot?
      • too few interrupts: e.g. wrong time base, or perhaps none due to a broken trace
  • Programming of the IRU is protected by a gate mechanism, so access can be restricted to only trusted CPUs/tasks.
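
The remaining software check on interrupt rate could look roughly like the sketch below. It is only an illustration under stated assumptions: the names (irq_rate_monitor_t, irq_rate_on_interrupt, irq_rate_end_window) and the fixed observation window are hypothetical, and the slides do not prescribe any particular monitoring scheme.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RATE_OK, RATE_TOO_MANY, RATE_TOO_FEW } rate_status_t;

typedef struct {
    uint32_t min_per_window;  /* fewer than this: source stalled or slow */
    uint32_t max_per_window;  /* more than this: possible babbling idiot */
    uint32_t count;           /* interrupts seen in the current window   */
} irq_rate_monitor_t;

/* Call once from the ISR being monitored. Saturates instead of wrapping. */
static void irq_rate_on_interrupt(irq_rate_monitor_t *m)
{
    if (m->count < UINT32_MAX) {
        m->count++;
    }
}

/* Call at the end of each observation window; classifies the window
 * and resets the counter for the next one. */
static rate_status_t irq_rate_end_window(irq_rate_monitor_t *m)
{
    assert(m->min_per_window <= m->max_per_window);
    rate_status_t status = RATE_OK;
    if (m->count > m->max_per_window) {
        status = RATE_TOO_MANY;
    } else if (m->count < m->min_per_window) {
        status = RATE_TOO_FEW;
    }
    m->count = 0;
    return status;
}
```

In a real system the window would need to be closed from a time base that is independent of the monitored source, otherwise a wrong time base would go unnoticed, and the counter shared between the ISR and the checking task would need suitable protection (for example a volatile or atomic access); both are left out here for brevity.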

Core-2-Core Module
Claus Stellwag (Elektrobit)
March 2013, WICERT

Concepts (1)
Basic
  • Should fit into an AUTOSAR system
  • Static approach / no dynamics
  • HW assumption: shared memory accessible on all cores
Safety
  • No propagation of faults over C2C
  • Lock-free behaviour (no deadlocks)
  • Only "local" writes, which allows protection mechanisms (MPU)
  • State handling (safety state)
26 November 2020

Concepts (2)
Initializing
  • Requirement to update cores without the need to update all
  • How to find other core structures after an update?
  • Search for other cores
  • Add / remove cores

Concepts (3)
States of the C2C module

Concepts (4)
Communication based on...
  • Channels ("message box")
  • Messages are sent/received with channels
  • "Last is best" semantics
  • 1 sender (core) and multiple receivers (cores)
Figure labels: Sender, Message, Channel, Receiver, Receiver.
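
One way to get a "last is best" channel with a single sender, multiple receivers and no locks is a sequence-counter scheme, sketched below with C11 atomics. This is an assumption for illustration only and is not claimed to be how the C2C module is implemented; all names (c2c_channel_t, c2c_send, c2c_try_receive, C2C_MSG_WORDS) are hypothetical. Only the sender writes to the channel, so the channel can live in the sender's local RAM and be read-only for every other core.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define C2C_MSG_WORDS 4u  /* hypothetical fixed message size */

typedef struct {
    atomic_uint      seq;                  /* even: stable, odd: write in progress */
    _Atomic uint32_t data[C2C_MSG_WORDS];  /* latest message                       */
} c2c_channel_t;

/* Sender side only: overwrite the channel with the newest message. */
static void c2c_send(c2c_channel_t *ch, const uint32_t msg[C2C_MSG_WORDS])
{
    assert(ch != NULL && msg != NULL);
    unsigned s = atomic_load_explicit(&ch->seq, memory_order_relaxed);
    atomic_store_explicit(&ch->seq, s + 1u, memory_order_relaxed);  /* mark busy */
    atomic_thread_fence(memory_order_release);
    for (size_t i = 0; i < C2C_MSG_WORDS; i++) {
        atomic_store_explicit(&ch->data[i], msg[i], memory_order_relaxed);
    }
    atomic_store_explicit(&ch->seq, s + 2u, memory_order_release);  /* publish */
}

/* Receiver side: only reads shared memory. Returns false if a write
 * overlapped the read; the caller may retry or keep its previous value. */
static bool c2c_try_receive(c2c_channel_t *ch, uint32_t out[C2C_MSG_WORDS])
{
    assert(ch != NULL && out != NULL);
    unsigned s1 = atomic_load_explicit(&ch->seq, memory_order_acquire);
    if ((s1 & 1u) != 0u) {
        return false;  /* sender is in the middle of an update */
    }
    for (size_t i = 0; i < C2C_MSG_WORDS; i++) {
        out[i] = atomic_load_explicit(&ch->data[i], memory_order_relaxed);
    }
    atomic_thread_fence(memory_order_acquire);
    unsigned s2 = atomic_load_explicit(&ch->seq, memory_order_relaxed);
    return s1 == s2;  /* unchanged counter means a consistent snapshot */
}
```

Because a receiver never blocks and never writes, a faulty receiver cannot stall the sender or corrupt the message, which fits the "no propagation of faults" and "only local writes" goals listed under Concepts (1). Receiving does not consume the message: every receiver sees the most recent value each time it reads.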

Concepts (5)
Figure labels: Sender Core; Local RAM; Receiver Cores; TASK; Channel; Message.

Configuration

Questions

Prototyping

EcoSystem Debugging

Application Prototyping

Supporting Material if Required

TriCore Memory Protection Unit (MPU)
  • The memory protection system allows up to 8 code regions to be accessed concurrently.
  • The memory protection system allows up to 16 data regions to be accessed concurrently.
  • These regions can grant access to peripheral addresses as well as memory addresses, e.g. protection can be configured so that TaskA can load/store directly to SPI0 but not SPI1, while TaskB can load/store directly to SPI1 but not SPI0.
2012-03-12 RECOMP DATE Tutorial 2012

Memory Range Definitions
  • Code Protection Range Registers: code ranges 0 to 7, each defined by an upper bound and a lower bound
  • Data Protection Range Registers: data ranges 0 to 15, each defined by an upper bound and a lower bound

Memory Protection Sets (0..3)
  • Code Protection Range Registers (code ranges 0 to 7) are qualified by the Execute Enable Register, Sets 0-3. In the figure: execution permitted (√) for code range 7 and code range 0, execution NOT permitted (Χ) for code range 1.
  • Data Protection Range Registers (data ranges 0 to 15) are qualified by the Read Enable Register, Sets 0-3, and the Write Enable Register, Sets 0-3. In the figure (read, write):
      • data range 0: √ √ (read & write)
      • data range 1: Χ Χ (no access)
      • data range 2: √ Χ (read only)
      • data range 15: Χ √ (write only)
  • Note: by construction any address not enabled in a range definition has no execute or data access.
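
The relationship between shared range definitions and per-set enable bits can be sketched as follows, using the TaskA/TaskB SPI example from the MPU slide. This is a simplified software model and not the register layout of the device: the names are hypothetical, the SPI address windows are invented placeholders, inclusive bounds are an assumption, and only data ranges are modelled (execute enables would work the same way over the 8 code ranges).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_DATA_RANGES 16u  /* per the slides: up to 16 data ranges */
#define NUM_SETS         4u  /* per the slides: protection sets 0..3 */

typedef struct {
    uint32_t lower;  /* inclusive (assumption) */
    uint32_t upper;  /* inclusive (assumption) */
} data_range_t;

typedef struct {
    uint16_t read_enable;   /* bit i set: data range i may be read    */
    uint16_t write_enable;  /* bit i set: data range i may be written */
} protection_set_t;

/* Range definitions are shared by all sets.
 * Hypothetical peripheral windows; not real AURIX addresses. */
static const data_range_t data_ranges[NUM_DATA_RANGES] = {
    [0] = { 0xF0001000u, 0xF00010FFu },  /* "SPI0" (placeholder) */
    [1] = { 0xF0001100u, 0xF00011FFu },  /* "SPI1" (placeholder) */
};

/* Each set enables a different subset of the ranges. */
static const protection_set_t sets[NUM_SETS] = {
    [0] = { .read_enable = 1u << 0, .write_enable = 1u << 0 },  /* TaskA */
    [1] = { .read_enable = 1u << 1, .write_enable = 1u << 1 },  /* TaskB */
};

/* True if the active set enables a range that covers addr. */
static bool data_access_ok(unsigned set, uint32_t addr, bool is_write)
{
    assert(set < NUM_SETS);
    uint16_t enable = is_write ? sets[set].write_enable
                               : sets[set].read_enable;
    for (unsigned i = 0; i < NUM_DATA_RANGES; i++) {
        bool enabled = ((enable >> i) & 1u) != 0u;
        if (enabled && addr >= data_ranges[i].lower &&
            addr <= data_ranges[i].upper) {
            return true;
        }
    }
    return false;  /* address not enabled in any range: no access */
}
```

Switching tasks then only means selecting a different set; the range bounds themselves stay untouched. Sets 2 and 3 are left with no enable bits here, so they permit no data access at all, in line with the note that an address not enabled in a range definition has no access.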

Memory Protection System Traps
  • Any non-permitted access (memory or peripheral) takes a protection trap, with the violating address and other information available:
      • PROTECTION READ
      • PROTECTION WRITE
      • PROTECTION EXECUTION
  • This information allows the supervisor code to decide to either:
      • terminate the task, or
      • emulate the access on behalf of the task, or
      • reconfigure the memory protection system to permit the access and then return to the task to retry the access, or
      • take any other action that is suitable to the system (e.g. suspend the task pending some other operation).

TriCore Interrupt Priority Numbers
  • The PN (interrupt Priority Number) of a core goes from 0 (lowest) to 255 (highest).
  • Event triggers from peripherals or software are managed by service request nodes in the interrupt router; these contain an SRPN (Service Request Priority Number) between 0 and 255.
  • An interrupt is taken by a core when two conditions are true:
      • the core's interrupt enable bit (ICR.IE) is set (1)
      • the SRPN of the incoming interrupt is greater than the core's Current CPU Priority Number (ICR.CCPN)
  • If a CPU has a CCPN of 0 and IE is 1, then any interrupt with an SRPN > 0 will be taken.
  • If a CPU is at a CCPN of 255, no received interrupt will be taken, regardless of the IE value.
  • Programming an SRPN to 0 will cause it to never be taken.
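
The acceptance rule on this slide reduces to a single predicate. The sketch below is a software restatement of that rule only; the function name interrupt_taken and its parameters are hypothetical and do not correspond to any device register access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Priority numbers on the slide span 0..255, which fits a uint8_t. */
static_assert(UINT8_MAX == 255, "priority numbers span 0..255");

/* ie:   the core's interrupt enable bit (ICR.IE)
 * ccpn: the core's Current CPU Priority Number (ICR.CCPN)
 * srpn: Service Request Priority Number of the pending request */
static bool interrupt_taken(bool ie, uint8_t ccpn, uint8_t srpn)
{
    return ie && (srpn > ccpn);
}
```

The special cases follow directly: with ccpn equal to 255 no srpn can be greater, so nothing is taken whatever the value of ie, and an srpn of 0 is never greater than any ccpn, so such a request is never taken.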