Ultra-High Energy Heavy Ions radiation tests on COTS FPGAs at CERN

Ultra-High Energy Heavy Ions radiation tests on COTS FPGAs at CERN: results for Microsemi ProASIC3 and Xilinx Zynq APSoC
Lucana Santos, Antonis Tavoularis, Gianluca Furano (ESA); George Lentaris, Konstantinos Maragos (NTUA); Lars Juul (LuxSpace); Dejan Gačnik (SkyLabs)
28/03/2018
ESA UNCLASSIFIED - For Official Use

Introduction
- ESA-CERN cooperation gave certain ESA projects access to the most intense beam of ultra-high-energy heavy ions available at the Super Proton Synchrotron (SPS).
- Several tests could take place simultaneously thanks to the high penetration of the beam.
- Unique opportunity to test a broad spectrum of devices, including complex COTS, and to estimate the architecture vulnerability factor (AVF).
- Beam: Xenon at 30 and 40 GeV/c.

Outline
- Test set-up description. Presenter: A. Tavoularis (ESA)
- TEST#1: Zynq APSoC. Presenter: A. Tavoularis (ESA)
- TEST#2: Zynq APSoC. Presenter: G. Lentaris (NTUA)
- TEST#3: ProASIC3 FPGA and PicoSkyFT. Presenter: Dejan Gačnik (SkyLabs)

TEST#1 and TEST#2: Set-up
Purpose:
- Measure cache sensitivity (Zynq Processing Subsystem)
- Measure CRAM sensitivity (Zynq Programmable Logic)
Set-up:
- Dual-board testing
- Remote control of the entire set-up
- LabVIEW for toggle and current recording
- Common time-stamp
- “Beam signal” not available

TEST#1: Methodology - results recording
APSoC under test:
- ZedBoard with Zynq Z-7020 SoC
Methodology:
- Vivado -> PS running the CoreMark software on the ARM Cortex-A9.
- A script periodically searches for errors in the Programmable Logic configuration memory and stores the readback file (CRAM SEU).
- One of the Zynq GPIO pins toggles at 2 Hz while CoreMark is running, so a missing toggle reveals a PS SEFI.
- The FPGA is re-programmed and the software re-launched whenever the CoreMark application stops running.
- Results from CoreMark are logged through the UART, and the readback files are stored for later analysis.
- Automated test scripts provided by LuxSpace.
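
The heartbeat check described above can be sketched as a small post-processing step over the recorded GPIO edge times. This is only an illustration, not the LuxSpace scripts: the function name, the timeout factor and the assumption of one recorded edge per 0.5 s are all hypothetical.

```python
from typing import List, Optional

TOGGLE_HZ = 2.0                    # heartbeat rate stated on the slide
EXPECTED_GAP_S = 1.0 / TOGGLE_HZ   # assumption: one recorded edge every 0.5 s
TIMEOUT_FACTOR = 3.0               # assumption: 3 missed edges mean the PS has stopped


def first_sefi_time(edge_times_s: List[float],
                    run_end_s: float,
                    run_start_s: float = 0.0) -> Optional[float]:
    """Return the time of the last edge seen before the heartbeat went silent.

    Returns None when no gap longer than the timeout is found.
    """
    timeout = TIMEOUT_FACTOR * EXPECTED_GAP_S
    last_edge = run_start_s
    # The end of the run acts as a sentinel so that silence at the end is caught.
    for t in list(edge_times_s) + [run_end_s]:
        if t - last_edge > timeout:
            return last_edge
        last_edge = t
    return None


# Hypothetical recordings: a healthy 10 s run and a run that stops after 2 s.
healthy = [0.5 * i for i in range(1, 21)]
crashed = [0.5, 1.0, 1.5, 2.0]
print(first_sefi_time(healthy, run_end_s=10.0))
print(first_sefi_time(crashed, run_end_s=10.0))
```

In a live set-up the same timeout would trigger the re-programming and software re-launch listed above.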

TEST#1: Zynq Processing Subsystem - results
Ion count to SEFI vs. cache configuration
- The CoreMark benchmark measures embedded-system performance.
- Four cache configurations: no cache, D-cache only, I-cache only, both.
- CoreMark score not affected by irradiation.
- No cache is the most robust; I-cache has the worst SEFI sensitivity.
- SEFIs depend on the cache configuration.
[Figure: accumulated beam during CoreMark execution (axis 0 to 5E+06) for the runs COREMARK_FULL_CACHE, COREMARK_NO_CACHE, COREMARK_NO_DCACHE and COREMARK_NO_ICACHE]

TEST#1: Zynq Programmable Logic - results (1/3)
- Readback performed every two minutes
- MSK file filters out BRAM/SLICE bits
- Readback file stored in case of mismatch for post-processing
- FPGA reprogrammed on PS SEFI
[Figure: accumulated beam between consecutive readbacks and number of errors in the readback file, plotted over time]
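
A minimal sketch of the masked comparison behind the readback step, assuming the readback, the golden reference and the mask are available as equal-length raw byte strings and that a mask bit of 1 means "do not compare". Real readback and MSK files carry headers and tool-specific layout that this sketch ignores; the function name is hypothetical.

```python
def count_cram_errors(readback: bytes, golden: bytes, mask: bytes) -> int:
    """Count bits that differ between readback and golden, skipping masked bits.

    Assumption: a 1 in the mask marks a bit to ignore (e.g. BRAM/SLICE state).
    """
    if not (len(readback) == len(golden) == len(mask)):
        raise ValueError("readback, golden and mask must have the same length")
    errors = 0
    for rb, gd, mk in zip(readback, golden, mask):
        diff = (rb ^ gd) & ~mk & 0xFF   # differing bits that are not masked out
        errors += bin(diff).count("1")
    return errors


# Hypothetical two-byte example: one real upset, four masked differences.
golden = bytes([0b10101010, 0xFF])
readback = bytes([0b10101011, 0x0F])
mask = bytes([0x00, 0xF0])
print(count_cram_errors(readback, golden, mask))
```

A non-zero count is what would cause the readback file to be stored for post-processing.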

TEST#1: Zynq Programmable Logic - results (2/3)
Bit cross section: #errors / (IonFluence * 23.8 Mbits)
Where:
- #errors = VerifyErrors
- IonFluence = SUM(BeamValue(t) to BeamValue(t+Delta))
- Delta = time between consecutive readbacks
Results:
- Min: 1.07E-11 (loss of sync)
- Max: 5.75E-08 (loss of sync)
- Med: 1.15E-09
[Figure: cross section per readback (readbacks 1 to 232, log scale from 1E-12 to 1E-05), with the outliers marked "Loss of synchronization with beam"]
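
The bit cross-section formula can be written out as a short calculation. The sketch below assumes that 23.8 Mbits means 23.8e6 bits and that the beam values recorded between two readbacks are already expressed as fluence; the function names and the example numbers are hypothetical, not measured data.

```python
from typing import Sequence

CRAM_BITS = 23.8e6  # assumption: "23.8 Mbits" read as 23.8e6 configuration bits


def ion_fluence(beam_values: Sequence[float]) -> float:
    """Sum the beam values recorded between two consecutive readbacks."""
    return float(sum(beam_values))


def bit_cross_section(verify_errors: int,
                      beam_values: Sequence[float],
                      n_bits: float = CRAM_BITS) -> float:
    """Bit cross section = #errors / (IonFluence * number of bits)."""
    fluence = ion_fluence(beam_values)
    if fluence <= 0:
        raise ValueError("no beam recorded in this readback window")
    return verify_errors / (fluence * n_bits)


# Hypothetical window: 238 verify errors, two beam samples of 5e3 each.
print(f"{bit_cross_section(238, [5e3, 5e3]):.2e}")
```

Because the fluence sits in the denominator, a window whose beam samples are misaligned with the readback times shifts the result directly, which is consistent with the minimum and maximum values being flagged as loss of sync.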

TEST#1: Zynq Programmable Logic - results (3/3)
[Figure: occurrences vs. cross section]

TEST#1: Conclusions
- PS more sensitive when caches are enabled (no cache ECC)
- SEFIs even with D-cache enabled (pointers?)
- CRAM cross section ~1E-09, similar to literature results
- CRAM cross section similar to NTUA's
- High beam penetration, so set-ups could be cascaded
- 500 out of 1000 tests analyzed
- MBUs or MCUs observed
- Beam signal recording, common timestamping and readback on beam idle would increase accuracy and ease post-processing
- Thanks to LuxSpace for their inputs and support (performed under the TritonX framework)

TEST#2: Goals - Methods (NTUA)
1. CRAM cross-section under full/realistic operation of DUT#2
   - entire FPGA processing data during beam time
   - stop and readback at every ion spill (do not accumulate)
2. SEE evaluation at system level (SoC, PS+PL+memory)
   - without rad-hard or mitigation techniques
   - ordinary HW/SW co-processing with benchmarks
   - PL: multiple VHDL FIRs of various sizes and locations (e.g., in total 56% LUT, 31% DFF, 100% DSP, ~0% RAMB)
   - PS: bare-metal SW to feed/control/check the PL; it stores test vectors (+redundancy) and continuously verifies I/O data (1 Msamples)
   - additional pins/LEDs used to monitor macroscopic state (e.g., for SEFI)

TEST#2: Results (NTUA)
1. No SEL, no hard error; only 2 SEEs required power cycling
   - all other errors corrected via reprogramming and new data flow
2. Every spill caused errors, and a SEFI occurred every 1-10 spills
   - cache ON: 80% of SEFIs due to CPU crash (stops already at the 1st spill)
   - cache OFF: 90% of SEFIs due to lost data at the PL (runs up to 10 spills)
3. CRAM bit cross-section = 1.0-1.8E-9, approximately equal to DUT#1 (other approach, other position)
   - 3-4% of ions cause events, 15-20% of which are MCU/MBU; P(1→0) = P(0→1)
   - bit cross-section is roughly twice that of similar works in the literature
   - is equal to soft errors at user space (e.g., flip-flop bit cross-section)
4. DSP-based FIRs withstand a spill (i.e. are not fully damaged) with about twice the probability of LUT-based FIRs (40% vs 21%, probably due to the smaller CRAM footprint)

TEST#3: DUT - PicoSkyFT SoC
DUT:
- FPGA: Microsemi ProASIC3
- Part name: A3PE3000L-PQ208
- FPGA core bias voltage: 1.5 V
- SoC with PicoSkyFT-C3L processor
- PicoSkyFT: partial TMR
- Peripherals: full TMR
- Single clock and reset tree
- Using 16 kB of internal SRAM (EDAC protected)
- Register file: 32 8-bit registers (parity protected, 8D + 2P), not TMR'ed
- SoC frequency: 10 MHz
- DUT PCB temperature: ~25°C
PicoSkyFT SoC configuration:
- PicoSkyFT-L: PicoSkyFT processor (large memory model)
- Program memory: SM LUT ROM, UM external MRAM
- Data memory: ProASIC internal SRAM
Processor configuration:
- PM EDAC enabled: EDAC on program memory
- DM EDAC enabled: EDAC on data memory
- RF parity check enabled: parity check on register file
- TB enabled: trace buffer enabled
- MULTIPLIER enabled: multiplication instructions enabled

TEST#3: Test set-up (1/2)
- Reused the set-up from radiation testing at PIF-PSI (09/2017)
- Remote PC located in Maribor, Slovenia

TEST#3: Test set-up (2/2)

TEST#3: PicoSkyFT testing FW
The running FW emulates a typical program flow on a mission, such as:
- reading from and writing to memories,
- performing arithmetic operations,
- sending data to peripherals (SPI, UART, GPIOs, ADCs).
Single-threaded firmware with the following functions was provided:
- Heartbeat
- 4 GPIOs
- AD conversion of 3 channels
- Data integrity check
- CoreMark benchmark test
- Fault detection statistics report (to PC)
The functions ran constantly in a loop; each function, trap or interrupt signalled a unique Heartbeat sequence. Upon an unrecoverable error, the processor is reset manually.

TEST#3: Observations (ProASIC3)
- Generated 1.78 GB of logs
- Single FPGA device used through 3 days of irradiation
- Accumulated TID is still TBC, but is expected to be negligible w.r.t. the 20 krad limit of ProASIC3
- No SEU on FPGA configuration
- No FPGA re-programming needed
- No SEL detected
- No device SEFI detected
- SEUs observed on SRAM and DFF

TEST#3: SRAM SEU Test Results (1/2)
LET: 3.7 MeV.cm2/mg

Beam spill intensity   Avg spill count [p/spill]   # SEU   Fluence [ions/cm2]   Cross section [CS/bit]
HIGH_INT               3.95E+06                    3844    3.31E+06             8.86E-09
MID_INT                2.72E+05                    2140    1.07E+06             1.52E-08
LOW_INT                6.87E+03                    214     8.60E+04             1.90E-08

Comparison with Hirex report (plot annotations: CS = 1.90E-08, 1.52E-08 and 8.86E-09, i.e. the three rows above): results are still within one order of magnitude of Hirex.
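
As a cross-check, the per-bit cross sections in the SRAM table can be recomputed from the SEU counts and fluences. The sketch assumes the divisor is the 16 kB of internal SRAM counted as 16 x 1024 x 8 bits; the slides do not state the bit count explicitly, so this is an inference that happens to match the reported values to within rounding.

```python
SRAM_BITS = 16 * 1024 * 8  # assumption: 16 kB of SRAM = 131072 bits

# (number of SEUs, fluence in ions/cm2, reported cross section per bit)
ROWS = {
    "HIGH_INT": (3844, 3.31e6, 8.86e-9),
    "MID_INT": (2140, 1.07e6, 1.52e-8),
    "LOW_INT": (214, 8.60e4, 1.90e-8),
}


def cross_section_per_bit(n_seu: int, fluence: float, n_bits: int = SRAM_BITS) -> float:
    """Per-bit cross section = SEU count / (fluence * number of bits)."""
    return n_seu / (fluence * n_bits)


for name, (n_seu, fluence, reported) in ROWS.items():
    computed = cross_section_per_bit(n_seu, fluence)
    print(f"{name}: computed {computed:.2e}, reported {reported:.2e}")
```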

TEST#3: SRAM SEU Test Results (2/2)
[Figure: SEU events per spill]

TEST#3: DFF SEU Test Results
LET: 3.7 MeV.cm2/mg

Beam spill intensity   Avg spill count [p/spill]   # SEU   Fluence [ions/cm2]   Cross section [CS/bit]
HIGH_INT               4.03E+06                    23      8.81E+06             8.16E-09

Comparison with Hirex report (plot annotation: CS = 8.16E-09).
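
The slide does not say how many flip-flop bits the DFF cross section is normalised to, but the figure can be backed out of the table. The short calculation below is an inference, not a statement from the authors: the implied bit count comes out close to 320, which would match the register file described for the DUT (32 registers of 8 data + 2 parity bits).

```python
N_SEU = 23                 # SEUs reported for the DFF test
FLUENCE = 8.81e6           # ions/cm2, from the table
REPORTED_CS = 8.16e-9      # reported cross section per bit

# Invert CS = N_SEU / (fluence * n_bits) to get the implied number of bits.
implied_bits = N_SEU / (FLUENCE * REPORTED_CS)

# Assumption: the register file (32 registers, 8D + 2P each) is the DFF target.
register_file_bits = 32 * (8 + 2)

print(f"implied bits: {implied_bits:.1f}, register file bits: {register_file_bits}")
```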

TEST#3: Conclusions
- No SEFI on the PicoSkyFT processor core.
- 22 k SEUs were detected and corrected in SRAM blocks (SEC events).
- 7 EDAC DED events led the SUPERVISOR code to perform a soft reset to restore operation.
- CoreMark result constant.
- Delta-sigma ADC logic has no drift (0 LSB).
- Found FW bug:
  - The PicoSkyFT SoC performed 2 hard resets, due to a register file parity error, while executing SUPERVISOR code.
  - The register file had not been cleared after the soft reset.
- PicoSkyFT SoC reset CS/dev is 1.32E-07.
- The PicoSkyFT SoC cross section is based on poor statistics, as "only" 2 resets occurred.
- Proven 100% safe operation.

Avg spill count [p/spill]   # internal resets   Fluence [ions/cm2]   PicoSkyFT SoC reset CS [CS/dev]
5.42E+05                    2                   1.52E+07             1.32E-07
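
To put the "poor statistics" remark in numbers, the sketch below recomputes the device cross section from 2 resets and a fluence of 1.52E+07 ions/cm2, then attaches an exact two-sided 95% Poisson interval to the count. The interval is a standard textbook construction added here for illustration; it is not part of the presented analysis, and the helper names are hypothetical.

```python
import math
from typing import Callable, Tuple


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for a Poisson variable with mean mu."""
    return sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k + 1))


def _root_of_decreasing(f: Callable[[float], float], lo: float, hi: float) -> float:
    """Bisection for f(mu) = 0, where f is decreasing and f(lo) > 0 > f(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_interval(n: int, conf: float = 0.95) -> Tuple[float, float]:
    """Exact two-sided confidence interval on the Poisson mean given n counts."""
    alpha = 1.0 - conf
    upper = _root_of_decreasing(lambda mu: poisson_cdf(n, mu) - alpha / 2, float(n), n + 50.0)
    if n == 0:
        return 0.0, upper
    lower = _root_of_decreasing(lambda mu: poisson_cdf(n - 1, mu) - (1 - alpha / 2), 0.0, float(n))
    return lower, upper


N_RESETS = 2        # hard resets reported
FLUENCE = 1.52e7    # ions/cm2

cs = N_RESETS / FLUENCE
low, high = poisson_interval(N_RESETS)
print(f"CS/dev = {cs:.2e} (95% interval {low / FLUENCE:.2e} to {high / FLUENCE:.2e})")
```

With only two events the interval spans more than an order of magnitude, which is why the reset cross section should be read as indicative only.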

TEST#3: Next steps
- PicoSkyFT has recently been successfully mapped into NG-Medium.
- The SoC design consists of the PicoSkyFT-L processor, 1x UART, 1x Timer, 3x GPIO ports and 1x I2C controller.
- The next step is to perform radiation characterization of PicoSkyFT in NanoXplore's NG-Medium FPGA:
  - Neutrons: ChipIR, scheduled for the last week of April 2018
  - Protons: PSI (May 2018)
  - UHE heavy ions (CERN-HZE campaign with ESA)
[Table: PicoSkyFT SoC resources utilization]