Design & Co-design of Embedded Systems: Hardware-Software Co-verification
Maziar Goudarzi, Fall 2005
Today's Program
- Introduction to hardware-software co-verification techniques
- A methodology for HW-SW co-simulation using SystemC
Validation
- Validation vs. verification
- Approaches to validation
  - Emulation
  - Simulation (co-simulation)
  - Formal verification
Validation (cont’d)
- Simulation
  - Cannot ensure correctness, but is still useful
- Heterogeneity
  - Weakly heterogeneous
    - Lumped, general-purpose computing systems; simple control systems
    - Can be simulated by extending HDL simulators
  - Strongly heterogeneous
    - Cellular phones, avionics
    - Require specialized simulation environments
Co-Validation
- Simulator features for weakly heterogeneous systems
  - Adequate timing accuracy
  - Fast execution
  - Visibility of internal registers for debugging
Co-Validation (cont’d)
- Strategy 1: Use an HDL simulator with HDL models for the processor and the ASICs
  - Long HW simulation time for each instruction: accuracy vs. speed trade-off
Co-Validation (cont’d)
- Strategy 2: Avoid the processor HDL model
  - Use a HW/SW communication protocol
  - SW is compiled and communicates with the HDL simulator, which models the ASIC
  - The HDL simulator is the bottleneck
  - Internal registers are not visible
Validation (cont’d)
- Strategy 3: Emulate HW on a reconfigurable platform
  - Automatic partitioning tools that minimize system-simulation time have been developed
  - Visibility of internal states is limited => debugging is likely to be slow
Validation (cont’d)
- Simulation of strongly heterogeneous and distributed systems
  - Specialized simulators: Ptolemy
    - Extensible, object-oriented kernel
    - Supports several computation models
    - Models are not implemented in the simulation kernel, but in domains that can interact without knowing each other's semantics
    - Some developed domains: data-flow, discrete-event. More domains can be added by users.
Hardware-Software Co-verification: Methodology for HW/SW Co-simulation in SystemC
Topics
- Introduction
- Design Flow
- Processor Models
- Implementation: 8051
- Conclusion
Reference: L. Semeria & A. Ghosh, “Methodology for Hardware/Software Co-Verification in C/C++”, in ASP-DAC 2000
Introduction
- Shrinking device sizes => all digital components on a single chip
- Software is traditionally fully tested only after the hardware is fabricated => long time-to-market (TTM)
- Integrating HW and SW earlier in the design cycle => better TTM
- Co-simulation involves
  - Simulating a processor model along with custom HW (usually described in an HDL)
Introduction (cont’d)
- Heterogeneous co-simulation environments (C-VHDL or C-Verilog)
  - RPC or another form of inter-process communication between the HW and SW simulators
  - High overhead due to the large volume of data exchanged between the simulators
Introduction (cont’d)
- HW synthesis from C/C++ has recently received more attention
  - Eliminates C-to-HDL translation for synthesis => higher productivity
    - Reduces translation time
    - Eliminates bugs introduced during this translation
  - Easier verification by
    - re-using testbenches developed during the system validation phase
    - enabling HW-SW co-verification and performance estimation at very early stages of design
Introduction (cont’d)
- In this paper, the authors present
  - How HW-SW co-verification is performed in a C/C++ based environment
  - HW and SW are both described in C++ (SystemC)
    - Other C/C++ based approaches: Ptolemy and CoWare N2C
Methodology for HW/SW Co-verification in SystemC: Design Flow
Design Flow
- Functional specification of the system
- Mapping
- Architectural specification
- Refinement of individual HW and SW blocks
- Synthesis for HW blocks; compilation for SW blocks
Methodology for HW/SW Co-verification in SystemC: Processor Models
Processor Models
- Bus Functional Model (BFM)
- Instruction-Set Simulator (ISS)
Bus Functional Model (BFM)
- Encapsulates the bus functionality of a processor
  - Can execute bus transactions on the processor bus (with cycle accuracy)
  - Cannot execute any instructions
- Hence,
  - A BFM is an abstract model of a processor that can be used to verify how the processor interacts with its peripherals
Bus Functional Model (cont’d)
- At early stages of the design: SW (C/C++) -> BFM -> HW
- In the later stages of the design: SW (assembly) -> ISS -> BFM -> HW
Design of the BFM
- Is a SystemC module
  - Ports of the module correspond to the pins of the processor
  - Methods of the module provide an API (application programming interface) for the software/ISS
    - They depend on the type of communication between HW and SW
  - BFM functionality is modeled as a set of concurrent FSMs
Memory-mapped IO
- Peripherals are located in a portion of the CPU address space
- BFM-provided methods
    void bfm_read_mem(sc_address, sc_data *, int)
    void bfm_write_mem(sc_address, sc_data, int)
- SW (without an ISS) explicitly calls these functions to access HW
- When using an ISS, SW calls device drivers
  - Device drivers run in the ISS and call these functions at the proper time
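The memory-mapped access pattern on this slide can be sketched in plain C++ (not SystemC). Everything beyond the two method names is an assumption made for illustration: the typedefs standing in for sc_address and sc_data, the reading of the int argument as a byte count, the little-endian byte order, the std::map that plays the role of the bus, and the register address kPeripheralReg.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Stand-in types: the slides name sc_address and sc_data without defining them.
using sc_address = std::uint32_t;
using sc_data = std::uint32_t;

// Mock bus: a sparse byte-addressed space standing in for memory-mapped peripherals.
static std::map<sc_address, std::uint8_t> g_bus_space;

// Writes the low `nb` bytes of `data` starting at `addr`, little-endian.
// Treating the int argument as a byte count is an assumption.
void bfm_write_mem(sc_address addr, sc_data data, int nb) {
    assert(nb >= 1 && nb <= 4);
    for (int i = 0; i < nb; ++i) {
        g_bus_space[addr + static_cast<sc_address>(i)] =
            static_cast<std::uint8_t>(data >> (8 * i));
    }
}

// Reads `nb` bytes starting at `addr` into `*out`; unwritten bytes read as 0.
void bfm_read_mem(sc_address addr, sc_data* out, int nb) {
    assert(out != nullptr && nb >= 1 && nb <= 4);
    sc_data value = 0;
    for (int i = 0; i < nb; ++i) {
        auto it = g_bus_space.find(addr + static_cast<sc_address>(i));
        std::uint8_t byte = (it != g_bus_space.end()) ? it->second : 0;
        value |= static_cast<sc_data>(byte) << (8 * i);
    }
    *out = value;
}

// Hypothetical peripheral register address, chosen only for this sketch.
constexpr sc_address kPeripheralReg = 0x8000;

// Software side (no ISS): calls the BFM methods directly to reach the hardware.
sc_data sw_write_then_read(sc_data value) {
    bfm_write_mem(kPeripheralReg, value, 2);   // 2-byte write
    sc_data readback = 0;
    bfm_read_mem(kPeripheralReg, &readback, 2);
    return readback;
}
```

In a real BFM each call would drive the processor pins for the required number of bus cycles; here the map lookup only stands in for that transaction.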
Interrupt-driven IO
- An interrupt controller is implemented in the BFM
  - It is made sensitive to the CPU interrupt lines
- When an interrupt occurs, the corresponding ISR is called
- ISRs are registered with this BFM method
    void bfm_register_handler(sc_interrupt, void (*handler)(sc_interrupt))
- Interrupts may be masked, or their behavior changed, using configuration ports
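A minimal plain C++ sketch of handler registration and dispatch follows. Only bfm_register_handler comes from the slide; the int stand-in for sc_interrupt, the table size of 12, and the helpers mock_set_mask and mock_raise_interrupt are hypothetical and exist only to make the example self-contained.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in type: the slides use sc_interrupt without defining it.
using sc_interrupt = int;
using isr_t = void (*)(sc_interrupt);

// Table size is an assumption for this sketch.
constexpr int kNumInterrupts = 12;

static std::array<isr_t, kNumInterrupts> g_handlers{};  // all nullptr initially
static std::array<bool, kNumInterrupts> g_masked{};     // all unmasked initially

// Registers the ISR to run when interrupt `irq` fires.
void bfm_register_handler(sc_interrupt irq, isr_t handler) {
    assert(irq >= 0 && irq < kNumInterrupts);
    g_handlers[static_cast<std::size_t>(irq)] = handler;
}

// Hypothetical helper: masks or unmasks one interrupt line.
void mock_set_mask(sc_interrupt irq, bool masked) {
    assert(irq >= 0 && irq < kNumInterrupts);
    g_masked[static_cast<std::size_t>(irq)] = masked;
}

// Hypothetical helper: what the interrupt controller would do when a line
// is asserted. Returns true if an ISR actually ran.
bool mock_raise_interrupt(sc_interrupt irq) {
    assert(irq >= 0 && irq < kNumInterrupts);
    const std::size_t i = static_cast<std::size_t>(irq);
    if (g_masked[i] || g_handlers[i] == nullptr) return false;
    g_handlers[i](irq);
    return true;
}

static int g_timer_ticks = 0;
void timer_isr(sc_interrupt) { ++g_timer_ticks; }

// Registers an ISR, then raises the interrupt unmasked, masked, and unmasked again.
int demo_interrupts() {
    g_timer_ticks = 0;
    bfm_register_handler(3, timer_isr);
    mock_raise_interrupt(3);    // ISR runs
    mock_set_mask(3, true);
    mock_raise_interrupt(3);    // masked: ISR does not run
    mock_set_mask(3, false);
    mock_raise_interrupt(3);    // ISR runs again
    return g_timer_ticks;
}
```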
Configuration ports, access to internal registers
- CPUs often have configuration ports for
  - Multiple modes of operation
  - Multiple timer/serial modes
  - Masked interrupts
  - etc.
- BFM methods to access these registers
    void bfm_read_reg(sc_register, sc_data*, int nb)
    void bfm_write_reg(sc_register, sc_data, int nb)
- The BFM usually doesn’t model the general-purpose registers of the CPU (although it can)
Timers and Serial Ports
- Normally, controllers for timers and serial ports are implemented within the BFM
- They are configured using configuration ports and registers
  - The previously mentioned functions are used
- They may issue interrupts to the CPU
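Configuring a timer through the register-access methods typically amounts to a read-modify-write. The plain C++ sketch below assumes single-byte registers, reads nb as a byte count, and uses the conventional 8051 TMOD layout (SFR address 0x89, timer 1 mode in bits 5:4) as an illustrative target; none of these details are stated in the slides, and the std::map only stands in for the BFM's register file.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Stand-in types: sc_register is taken to be an SFR address, sc_data one byte.
using sc_register = std::uint8_t;
using sc_data = std::uint8_t;

// Mock register file; registers that were never written read as 0.
static std::map<sc_register, sc_data> g_sfr;

void bfm_read_reg(sc_register reg, sc_data* out, int nb) {
    assert(out != nullptr && nb == 1);  // this mock handles one byte at a time
    auto it = g_sfr.find(reg);
    *out = (it != g_sfr.end()) ? it->second : 0;
}

void bfm_write_reg(sc_register reg, sc_data value, int nb) {
    assert(nb == 1);
    g_sfr[reg] = value;
}

// Assumed register layout (conventional 8051 TMOD), used for illustration only.
constexpr sc_register kTmod = 0x89;
constexpr sc_data kTimer1ModeMask = 0x30;

// Read-modify-write: changes only the timer 1 mode bits and returns the new value.
sc_data set_timer1_mode(int mode) {
    assert(mode >= 0 && mode <= 3);
    sc_data tmod = 0;
    bfm_read_reg(kTmod, &tmod, 1);
    tmod = static_cast<sc_data>((tmod & ~kTimer1ModeMask) | (mode << 4));
    bfm_write_reg(kTmod, tmod, 1);
    return tmod;
}
```

The read-modify-write keeps the timer 0 bits in the low nibble untouched, which is why the register is read before it is written.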
Performance Estimation Functions
- The BFM keeps track of bus transactions
  - Can report the number of clock cycles spent on each bus transaction
  - Reports can be produced after each transaction or at the end of simulation
  - Tracking is enabled using
      void bfm_enable_tracing(int level)
  - level selects among multiple levels of tracking
    - Even debug information can be produced by the BFM
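One way to picture the tracing levels is the plain C++ sketch below. The meaning given to each level (0 = off, 1 = running totals, 2 = per-transaction log) is an assumption, as are the TracingBfm class, the BusTransaction record, and the record() hook that a real BFM would call internally at the end of each bus transaction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one completed bus transaction.
struct BusTransaction {
    std::string kind;       // e.g. "read" or "write"
    std::uint32_t address;
    int cycles;             // clock cycles the transaction took
};

class TracingBfm {
public:
    // Assumed level semantics: 0 = off, 1 = totals only, 2 = totals plus log.
    void bfm_enable_tracing(int level) { level_ = level; }

    // Hypothetical internal hook, called once per finished transaction.
    void record(const std::string& kind, std::uint32_t address, int cycles) {
        assert(cycles >= 0);
        if (level_ == 0) return;
        total_cycles_ += cycles;
        if (level_ >= 2) log_.push_back({kind, address, cycles});
    }

    // End-of-simulation report: total cycles spent on traced transactions.
    long total_cycles() const { return total_cycles_; }

    // Per-transaction report, populated only at level 2 and above.
    const std::vector<BusTransaction>& log() const { return log_; }

private:
    int level_ = 0;
    long total_cycles_ = 0;
    std::vector<BusTransaction> log_;
};
```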
HW/SW Synchronization
- Normal BFM methods are blocking
  - SW execution is suspended until the bus transaction is done
  - This essentially serializes SW and HW execution
- A flag can be set in the BFM to make SW execute in parallel with HW
  - i.e., BFM methods return immediately
- SW can wait for a specific number of clock cycles by calling
    void bfm_idle_cycle(int)
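The effect of the blocking flag can be illustrated with a small timing model in plain C++. The SyncBfm class, its set_nonblocking and bfm_write methods, and the cycle counts are hypothetical; only bfm_idle_cycle is named on the slide, and here it simply advances the software's notion of time.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical timing-only model: tracks when the SW and the bus are next free.
class SyncBfm {
public:
    void set_nonblocking(bool nonblocking) { nonblocking_ = nonblocking; }

    // Issues a bus transaction that occupies the bus for `cycles` cycles.
    void bfm_write(int cycles) {
        assert(cycles >= 0);
        const long start = std::max(sw_time_, bus_free_at_);
        bus_free_at_ = start + cycles;
        if (!nonblocking_) {
            sw_time_ = bus_free_at_;  // blocking: SW resumes when the bus is done
        }
        // non-blocking: the call returns immediately, SW time is unchanged
    }

    // SW waits for `cycles` clock cycles.
    void bfm_idle_cycle(int cycles) {
        assert(cycles >= 0);
        sw_time_ += cycles;
    }

    // Cycle at which both the SW and the bus have finished.
    long finish_time() const { return std::max(sw_time_, bus_free_at_); }

private:
    bool nonblocking_ = false;
    long sw_time_ = 0;
    long bus_free_at_ = 0;
};

// One 10-cycle transaction followed by 8 cycles on the SW side.
long run_scenario(bool nonblocking) {
    SyncBfm bfm;
    bfm.set_nonblocking(nonblocking);
    bfm.bfm_write(10);
    bfm.bfm_idle_cycle(8);
    return bfm.finish_time();
}
```

With blocking calls the two phases add up (10 + 8 cycles); with non-blocking calls they overlap, so the scenario ends when the longer of the two finishes.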
Processor Model
- Bus Functional Model (BFM)
- Instruction-Set Simulator (ISS)
Instruction-Set Simulator
- ISS: a processor model capable of simulating the execution of instructions
- Different types of ISS for different purposes
  - Usage 1: Verification of applications written in assembly code
    - For fastest speed: translate target assembly instructions into host processor instructions
      - Is not cycle-accurate, especially for pipelined and superscalar architectures
ISS (cont’d)
- Different types of ISS … (cont’d)
  - Usage 2: Verification of timing and interfaces between system components
    - Used in conjunction with a BFM
    - The ISS should be timing-accurate in this usage
      - The ISS often works as an emulator
      - For performance estimation, the ISS must provide accurate cycle counting
      - To obtain certain speed improvements, the ISS should provide the necessary hooks (discussed later)
Integrating an ISS and a BFM
- ISS + BFM => complete processor model
- Cycle-accurate ISS + (already cycle-accurate) BFM => cycle-accurate processor model
- Typical units of an ISS
  - Fetch, Decode, Execute
  - The Execute unit calls the BFM to access memory or configuration registers
  - The Fetch unit calls the BFM to read instructions
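The division of labour between the fetch/execute units and the BFM can be sketched with a toy processor in plain C++. The four-opcode instruction set, the MiniBfm and MiniIss types, and the fixed cost of 2 bus cycles per access are all invented for this illustration and have nothing to do with the 8051.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal BFM stand-in: flat memory plus a bus cycle counter.
// The cost of 2 cycles per access is an arbitrary assumption.
struct MiniBfm {
    std::vector<std::uint8_t> mem = std::vector<std::uint8_t>(256, 0);
    long bus_cycles = 0;
    std::uint8_t read(std::uint8_t addr) { bus_cycles += 2; return mem[addr]; }
    void write(std::uint8_t addr, std::uint8_t v) { bus_cycles += 2; mem[addr] = v; }
};

// Toy instruction set (hypothetical): one opcode byte, then one operand byte.
enum : std::uint8_t { OP_HALT = 0, OP_LOADI = 1, OP_ADDM = 2, OP_STORE = 3 };

struct MiniIss {
    MiniBfm& bfm;
    std::uint8_t pc = 0;
    std::uint8_t acc = 0;
    explicit MiniIss(MiniBfm& b) : bfm(b) {}

    // Executes one instruction; returns false once HALT is reached.
    bool step() {
        const std::uint8_t opcode = bfm.read(pc++);   // fetch unit -> BFM
        if (opcode == OP_HALT) return false;
        const std::uint8_t operand = bfm.read(pc++);  // fetch unit -> BFM
        switch (opcode) {                             // decode
            case OP_LOADI: acc = operand; break;
            case OP_ADDM:                             // execute unit -> BFM (data read)
                acc = static_cast<std::uint8_t>(acc + bfm.read(operand));
                break;
            case OP_STORE:                            // execute unit -> BFM (data write)
                bfm.write(operand, acc);
                break;
            default: assert(false && "unknown opcode");
        }
        return true;
    }
};

// Runs acc = 5 + mem[0x80]; mem[0x81] = acc. Returns the bus cycles consumed.
long run_program(std::uint8_t* result) {
    MiniBfm bfm;
    const std::uint8_t program[] = {OP_LOADI, 5, OP_ADDM, 0x80, OP_STORE, 0x81, OP_HALT};
    for (std::size_t i = 0; i < sizeof(program); ++i) bfm.mem[i] = program[i];
    bfm.mem[0x80] = 7;
    MiniIss iss(bfm);
    while (iss.step()) {}
    *result = bfm.mem[0x81];
    return bfm.bus_cycles;
}
```

Because every fetch and every data access goes through the BFM, the bus cycle count falls out of the simulation for free; that is the property the combined model relies on for cycle accuracy.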
Integrating an ISS and a BFM (cont’d)
- For more complex architectures (pipelined, superscalar)
  - Other units must be modeled
    - Cache, prefetch, re-order buffer, issue, …
    - Many units may need to call BFM functions
- The ISS may need to provide the BFM with certain memory-access functions (discussed later)
Techniques to speed up simulation
- Reduce activity on the memory bus
  - In most applications, 95% of memory traffic is attributed to instruction and data fetches
  - Memory access previously verified? => no need to simulate it again during co-simulation
    - Put instruction memory (and/or data memory) inside the ISS
    - What to do for external devices accessing instruction/data memory?
      - The BFM must be configured to recognize them and call the corresponding ISS method to access instructions/data
      - The ISS must provide these methods
      - The ISS must implement a memory map, where certain addresses are accessed directly while others go through bus cycles
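The memory map described on this slide can be sketched in plain C++ as follows. The MappedMemory class, the split at a single local_limit address, the cost of 4 cycles per bus access, and the external_device_read hook (the method the BFM would call on behalf of an external device) are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Arbitrary cost assumed for one simulated bus access.
constexpr int kCyclesPerBusAccess = 4;

class MappedMemory {
public:
    // Addresses below `local_limit` live inside the ISS; the rest go over the bus.
    explicit MappedMemory(std::uint16_t local_limit)
        : local_limit_(local_limit),
          local_(local_limit, 0),
          external_(65536, 0) {}

    std::uint8_t read(std::uint16_t addr) {
        if (addr < local_limit_) return local_[addr];  // direct access, no bus cycles
        bus_cycles_ += kCyclesPerBusAccess;            // simulated bus transaction
        return external_[addr];
    }

    void write(std::uint16_t addr, std::uint8_t value) {
        if (addr < local_limit_) { local_[addr] = value; return; }
        bus_cycles_ += kCyclesPerBusAccess;
        external_[addr] = value;
    }

    // Hypothetical hook for the BFM: an external device reads ISS-internal memory.
    std::uint8_t external_device_read(std::uint16_t addr) const {
        assert(addr < local_limit_);
        return local_[addr];
    }

    long bus_cycles() const { return bus_cycles_; }

private:
    std::uint16_t local_limit_;
    std::vector<std::uint8_t> local_;     // memory kept inside the ISS
    std::vector<std::uint8_t> external_;  // stand-in for devices reached via the bus
    long bus_cycles_ = 0;
};
```

Only accesses above the limit cost simulated bus cycles, which is where the speed-up comes from when most traffic is instruction and data fetches.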
Techniques to speed up simulation (cont’d)
- Turn off clocks on modules
  - All clocked components are activated by a clock edge
    - Most of the time the component is not addressed => activation and simulation (even of a limited part of each process) is wasteful => turn off clocks when not necessary
  - How to do it?
    - The BFM generates the bus clock only when devices on the bus are addressed
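The saving from gating the bus clock can be shown by counting process activations in plain C++. The Peripheral type and the per-cycle `addressed` trace are hypothetical; a real SystemC model would gate the clock signal itself rather than loop over a trace.

```cpp
#include <cassert>
#include <vector>

// Hypothetical clocked component: counts how often its process is activated.
struct Peripheral {
    long activations = 0;
    void on_clock_edge() { ++activations; }
};

// `addressed[i]` says whether a device on the bus is addressed in cycle i.
// With gating, the clock edge is delivered only in those cycles.
long simulate(const std::vector<bool>& addressed, bool gated) {
    Peripheral p;
    for (bool is_addressed : addressed) {
        if (!gated || is_addressed) p.on_clock_edge();
    }
    // Sanity check: gating can only remove activations, never add them.
    assert(p.activations <= static_cast<long>(addressed.size()));
    return p.activations;
}
```

For a trace of 100 cycles in which the bus is addressed in only 3, the free-running clock triggers 100 activations and the gated clock 3.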
Methodology for HW/SW Co-verification in SystemC: Implementation: 8051
Implementation: 8051
- Implementation of a Synopsys DW8051 BFM and a cycle-accurate ISS
  - Synopsys DW8051:
    - 8-bit microcontroller
    - Configurable, fully synthesizable, reusable macrocell
    - Industry standard for simple embedded applications
      - smartcards, cars, toys, …
    - Many IO modes
    - SFR (Special Function Register) bus
    - Interrupt ports (expandable to 12)
    - Up to 2 serial ports, in 4 different modes of operation
    - Up to 2 timers, in 3 different modes of operation
Implementation: 8051 (cont’d)
- DW8051 BFM
  - Fully developed in SystemC
  - The BFM supports
    - timer 1, modes 0, 1, 2
    - serial port 0, modes 0, 1, 2, 3
    - external interrupts
    - external memory accesses
    - SFR accesses
- DW8051 cycle-accurate model
Experimental Results (BFM)
File        Lines of C++ code
8051 BFM    1944
HW          497
test_sw     1134
Testbench (sim. time) for the SystemC implementation and for co-simulation, over the Memory, SFRs, Serial and Timer testbenches: 274 907 438 1051 405 561 449 451
Experimental Results (Cycle-accurate Model)
Implementation         Simulation time
ISS + BFM              4708
Optimized ISS + BFM    279
C/C++ + BFM            252
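The slide does not state a unit for these simulation times, so they are best compared as ratios. Taking the three figures at face value, the relative speedups over the plain ISS + BFM configuration are:

```latex
\[
\frac{T_{\text{ISS+BFM}}}{T_{\text{optimized ISS+BFM}}} = \frac{4708}{279} \approx 16.9,
\qquad
\frac{T_{\text{ISS+BFM}}}{T_{\text{C/C++ + BFM}}} = \frac{4708}{252} \approx 18.7
\]
```

On these figures the optimized ISS recovers most of the gap to native C/C++ execution: its simulation time is about 11% longer than that of C/C++ + BFM (279 vs. 252).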
What we learned today
- The co-verification strategy of Semeria and Ghosh, using SystemC, was presented
  - C/C++ models are compiled very efficiently on today's architectures
  - No overhead for C-HDL interfacing is introduced
  - Performance estimates can be obtained from the model
  - C++ allows the use of OO techniques to create the BFM and ISS, which enables their re-use for subsequent generations of the processor