Design & Co-design of Embedded Systems
The ODYSSEY Methodology: ASIP-Based Design of Embedded Systems from Object-Oriented System-Level Models
Maziar Goudarzi
Outline
  • Motivation
  • Related Work
  • ODYSSEY: Theory
  • ODYSSEY: Implementation
  • ODYSSEY: Design Automation
  • Summary and Conclusion
Embedded Systems Market
  • Rapidly growing market
    – Compound Annual Growth Rate (CAGR) of 17.3%
  • The future of computing resides in embedded computing
Market Life Cycle
  • A delay in reaching the market window has a huge revenue impact
  Source: Agilent Technologies
Motivation: Design Automation
  • Conclusion:
    – Design automation tools and methodologies are needed for embedded system design
  • Question:
    – At what level of abstraction?
The Design Productivity Gap
Motivation: Electronic System-Level (ESL) Design
  • Solution:
    – Raise the level of abstraction
  • Historical examples:
    – Place & route tools
    – Hardware description languages
    – Hardware synthesis
  • Latest suggestion:
    – ESL
      • Spans SW+HW
  Source: Monterey Design Systems
Motivation
  • Conclusion:
    – The embedded system industry needs ESL design methodologies and supporting design automation tools
  • Question:
    – How to specify, implement, and validate the embedded system?
The First Challenge in ESL: Specification
  • Alternatives:
    – Extend HW modeling (e.g. VHDL) to SW
    – Extend SW modeling (e.g. Java) to HW
    – Use HW/SW-neutral or mathematical models (e.g. Codesign FSM)
  • Observations:
    – Software accounts for 80% of embedded system development cost [ITRS-2003]
    – Technology trend toward SW:
      • Catapult (Mentor Co.)
      • Agility Compiler & DK Design Suite (Celoxica Co.)
      • Cascade (CriticalBlue Co.)
ESL Challenges (cont'd)
  • Conclusion:
    – Object-oriented design methodology is a reasonable answer
  • Questions:
    – What about other ESL challenges?
      • Implementation, verification, automation of the design
      • To be discussed later in the talk
Thesis of this work
  There is scope to raise the abstraction level of processors when designing embedded systems; furthermore, such a raise helps to address modelling, implementation, and reuse challenges in the design and design automation of modern embedded systems.
Related Work
  • OO used for hardware modeling
    – Extensions of VHDL
      • Myriad different proposals
        – Objective-VHDL, several flavours of OO-VHDL, SUAVE
      • Just a few consider synthesis
    – Java
      • HW components viewed as objects
      • Signals travelling among components viewed as objects
    – C++
      • SystemC
      • CynLib from CynApps
Related Work (cont'd)
  • OO used for hardware modeling (cont'd)
    – Modeling is good, but synthesis is the major concern
  • Major approaches to OO synthesis
    – ODETTE
    – OASE
    – Enodia® Architecture
    – Not in our area of work:
      • Wolf's OO Co-synthesis
      • Matisse
      • jHISC
The ODETTE Approach
  • ODETTE proposal:
    – View objects as finite-state machines (FSMs)
    – Object attributes: FSM state variables
    – Method calls: FSM state transitions
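The object-as-FSM view above can be illustrated with a minimal C++ sketch. It is not ODETTE's actual synthesis output; the `CounterFsm` type, its `Method` codes, and `run_example` are hypothetical names chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical method codes: each one names a transition of the object FSM.
enum class Method : std::uint8_t { Reset, Increment };

// Hypothetical object modelled as an FSM.
struct CounterFsm {
    std::uint32_t count = 0;  // object attribute = FSM state variable

    // One step of the FSM: the requested method selects the transition.
    void step(Method m) {
        switch (m) {
            case Method::Reset:     count = 0;  break;
            case Method::Increment: count += 1; break;
        }
    }
};

// Drives the FSM through a fixed sequence of method calls and returns
// the final value of the state variable.
inline std::uint32_t run_example() {
    CounterFsm c;
    c.step(Method::Increment);
    c.step(Method::Increment);
    assert(c.count == 2);     // two transitions taken so far
    c.step(Method::Reset);
    c.step(Method::Increment);
    return c.count;
}
```

Because `step` takes one transition per call, method calls on a single object are serialized, which matches the sequential-method-call observation made in the analysis of ODETTE.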
The ODETTE Object FSM
Polymorphism in ODETTE
Analysis of ODETTE
  • Nice, but very high overhead
    – One FSM per object => high area and power overhead: O(n_o), with n_o the number of objects
    – Polymorphism: replication => high area and power overhead
    + Maximum potential concurrency
    – (Apparently) FSM => sequential method calls inside objects
  • Q: What if a method calls another one?
  • Q: How to extend to HW/SW systems?
The OASE Approach
  • OASE proposal:
    – Reuse and customize behavioural synthesis techniques
    – Static analysis & transformation of the OO code
    – Converts OO constructs to non-OO ones
      • Access to object attributes
      • Non-virtual method calls
      • Virtual method calls (polymorphism)
OASE Transformation Process
  Figure: the transformation process from 'e' to Verilog [Kuhn et al., DAC'01]
  Figure labels: e source, scanner/parser, syntax tree, semantic analysis, control flow analysis, data flow analysis, concurrency analysis, symbol tables, control flow graph, output of intermediate format, Verilog
Polymorphism in OASE
  Class diagram: S1, S2, S3
  Results of static analysis:
    Object reference variable | Set
    x | S1, S2
    y | S2
    z | S1, S2, S3
  switch (z) { case S1: S1_foo(); case S2: S2_foo(); case S3: S3_foo(); }
  An example in the e language [Kuhn et al., DAC'01]
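The switch-based replacement of a virtual call shown on this slide can be sketched in C++ as follows. This is an illustration of the technique, not the output of the OASE tool: the `ClassTag` enum, the stand-in bodies of the `foo` functions, and the names `dispatch_foo_z` and `dispatch_foo_y` are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical class tags stored with each object in place of a vtable pointer.
enum class ClassTag { S1, S2, S3 };

// Non-virtual, per-class implementations of foo() (stand-in bodies).
inline int S1_foo() { return 1; }
inline int S2_foo() { return 2; }
inline int S3_foo() { return 3; }

// Replacement for the virtual call z->foo(): static analysis found that z may
// refer to S1, S2, or S3, so every class gets a case.
inline int dispatch_foo_z(ClassTag z) {
    switch (z) {
        case ClassTag::S1: return S1_foo();
        case ClassTag::S2: return S2_foo();
        case ClassTag::S3: return S3_foo();
    }
    assert(false && "tag outside the statically computed set");
    return -1;
}

// For y the computed set is {S2}, so the call is bound statically
// and no selection logic is needed at all.
inline int dispatch_foo_y() { return S2_foo(); }
```

Each polymorphic call site carries its own selection logic, which is where the area and power overhead discussed on the next slide comes from.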
Analysis of OASE
  • Nice extension of behavioural synthesis to OO, but still high overhead for polymorphism
    – Area/power overhead: O(n_o · n_mc)
The Enodia® Architecture
  • Silicon Infusion Co. (UK startup)
  • Enodia proposal:
    – Bottom-up composition of a variety of their IP cores
    – An Object-Orientated SoC architecture
      • Patented in UK and US
Enodia® E9610 Product
  Figure: internal architecture of the Enodia E9610 chip [Silicon Infusion Co., 2004]
Analysis of Enodia®
  • Patent on high-performance caching
  • Chip architecture very similar to ours, but
    – Uses firmware for polymorphism => performance overhead
    – Bottom-up approach => one manual chip design per application domain
Summary & Comparison
                     | ODETTE | OASE | Enodia
  Impl. style        | ASIC | ASIC | Heterogeneous multiprocessor
  Synthesis approach | Per-object method replication | Static analysis + inlining | Multiple objects per method impl.
  Language           | Objective-VHDL, SystemC-Plus | Java, SystemC, e | N.A.
  Optimization       | Dead-code removal | Object reachability | N.A.
  Polymorphism       | Method replication & multiplexing | Method inlining | Firmware
Summary & Comparison (cont'd)
                         | ODETTE | OASE | Enodia
  HW-SW?                 | Not provided | Stub generation | SW on multiprocessor
  Model of concurrency   | Objects invoked from processes | Multiple processes in modules | N.A.
  Dynamic (de)allocation | Not supported | Not supported | Supported
Summary & Comparison (cont'd)
  • Major shortcomings
    1. Viewing objects as structural components
    2. Overly verbose languages
    3. Unacceptable area/power overhead
    4. No or unclear path toward HW-SW systems
    5. HW designers' reluctance toward OO
  • We propose ODYSSEY
    – Object-oriented Design and sYntheSiS of Embedded sYstems
ASIP vs. ASIC
  • Application-specific instruction-set processors (ASIPs) are replacing ASICs
  Source: K. Keutzer, S. Malik, R. Newton, "From ASIC to ASIP: The Next Design Discontinuity", ICCD, 2002
OO-ASIP: Object-Oriented ASIP
  • Our proposal:
    – Let the methods of a class library be the instruction set of a processor
  Figure: the class library (class A with i: int, f(), g(); class B with c: char, f(), h(); class C with f: float, g(), k()) and the OO-ASIP, whose data memory holds the objects a1, b1, a2 and whose instruction memory holds method calls such as b1.h(), a2.g(), ap->f()
OO-ASIP vs. Traditional Processors
  • OO-ASIP for int/float = a traditional processor
  • Differentiating features
    – OO-ASIP instructions can call one another
    – OO-ASIP instructions can be implemented in software as well as in hardware
    – Big instructions
      => Independent execution units for each HW instruction
      => Dynamic power management by deactivating instructions that are not running, and dynamic area management by caching most-recently-run instructions
    – OO-ASIP implements polymorphism in hardware
OO-ASIP vs. Other ASIPs
  • Typical ASIP-design flow
    Figure labels: applications and design constraints, application analysis, architectural design-space exploration, instruction-set generation, code synthesis, hardware synthesis, object code
    Source: M. K. Jain, M. Balakrishnan, A. Kumar, "ASIP Design Methodologies: Survey and Issues", VLSI-Design Conf., 2001.
  • Disadvantage
    – No guarantee that the ASIP suits future, different (but related) applications
  • OO-ASIP: future related apps shall use today's class lib.
Design-Space Represented by OO-ASIP
  Figure: the design space for a given OO application with N_o objects
  Axes: number of objects per OO-ASIP (1, 2, ..., N_o); style of methods (HW or SW, from all SW to all HW)
  Labelled points: implementation by a traditional processor; ODETTE implementation
Design Flow using OO-ASIPs
  Figure labels, OO-ASIP design flow: disciplined benchmarking, hardware class lib., OO-ASIP synthesis, the OO-ASIP
  Figure labels, OO-ASIP reuse flow: database of (OO-ASIP, HW class lib.), choose suitable class lib., model + verify the app., compile toward the ASIP, data memory, instr. memory
Design Flow using OO-ASIP: Another View
  Figure labels: application SW model (C++), object instantiation, ASIP programming path, software, ASIP ISA: f, g, k, SystemC (C++), ASIP synthesis path, hardware, ASIP hardware
  Figure labels, class libraries: hardware class lib., software class lib., system class lib., with classes A, B, C, D and methods f(), g(), h(), k()
Programming the OO-ASIP
  • Requirements on the OO-ASIP compiler
    – Retargetable to various OO-ASIPs
    – Retargetable to various processor cores
    – Capable of early hardware-software co-validation
  • Our solution:
    – Source-to-source transformation
The ODYSSEY Ultimate Goal
  • The ODYSSEY target chip: FPGA-like array of OO-ASIPs
  • Interconnection:
    – Packet-routing network
    – Motivation:
      • Network-on-Chip viewed as the future paradigm in DSM technologies
  Figure: the ODYSSEY system synthesizer targets an on-chip network of OO-ASIPs (OO-ASIP 1 to OO-ASIP 4 and processors, connected through routers)
A Simple OO-ASIP Architecture
  Figure: the OO-ASIP for classes A (f(), g()) and B (f(), h())
  Figure labels: Functional Units (FUs) with the implementations of A::f(), A::g(), and B::h(); a traditional processor running the B::f() routine; Object Management Unit (OMU), connected to data memory; Method Invocation Unit (MIU), fed from instruction memory, with VMT and OTT
Case Study 1: Traffic-Light Controller
  Class diagram: traffic_light (status: int, elapsed_time: int; open(), close(), timekeeper()) with the classes farmroad_light and highway_light; further labels: fixed_green: int, min_green: int, open(), close()
  • All methods implemented in hardware
Case Study 1: Traffic-Light Controller
  Values reported by the LeonardoSpectrum tool over a sample 0.5 um process
Case Study 1: Traffic-Light Controller
  Figure annotations: 15% reduction, 20% reduction
  Values estimated by the Synopsys PowerCompiler tool over a 1 um process with 5 V operating voltage
Analysis of the Architecture
  • Area/power management
    – Static (application-specific) policy
    – Dynamic (application-independent) policy
  • Polymorphism overhead
    – Performance improved by HW MIU
    – Area/power overhead still present
Our Solution: Network-on-Chip Architecture
  • Dispatch virtual methods at the same time that packets are routed on an on-chip network
  Figure: the OO-ASIP for classes A (f(), g()) and B (f(), h()), with the processor, the Object Management Unit (OMU, connected to data memory), and the functional units (FUs) A::f(), A::g(), B::h() attached to an on-chip network; instructions come from instruction memory
NoC: Network-on-Chip
  • NoC emergence:
    – Fully synchronous designs are no longer feasible
    – Unreliable communication in very deep submicron technologies (90 nm and beyond)
    – Solution: leverage computer networks and protocols for communication inside chips
    – NoC seems unavoidable
  Reference: L. Benini, G. De Micheli, "Networks on Chips: A New SoC Paradigm," IEEE Computer, 35(1):70-78, 2002.
Ordinary-Method Dispatch by Network Routing
  • FU-identifier: FU = <method.class>
  • Object-identifier: object = <class.num>
  • Method call = invoke a method on an object
    <method.object> = <method.<class.num>> = <<method.class>.num> = <FU.num> = packet destined to the node addressed FU
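The regrouping of identifiers on this slide can be sketched in C++ as follows. The 8-bit field widths, the struct names, and the `node_address` flattening are assumptions made for illustration; the slides do not specify an encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding: 8 bits each for method, class, and object number.
struct ObjectId { std::uint8_t cls; std::uint8_t num; };     // object = <class.num>
struct FuId     { std::uint8_t method; std::uint8_t cls; };  // FU = <method.class>

struct PacketHeader {
    FuId fu;           // network address of the destination node
    std::uint8_t num;  // which object of that class the FU operates on
};

// <method.<class.num>> regrouped as <<method.class>.num> = <FU.num>.
inline PacketHeader make_header(std::uint8_t method, ObjectId obj) {
    return PacketHeader{FuId{method, obj.cls}, obj.num};
}

// Flattens the FU identifier into one node address (method in the high byte).
inline std::uint16_t node_address(FuId fu) {
    return static_cast<std::uint16_t>((fu.method << 8) | fu.cls);
}
```

No lookup is involved: the destination address is obtained purely by regrouping fields that the method call already carries.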
Virtual-Method Dispatch by Network Routing
  • To dynamically bind a method call (e.g. objp->method(params) in C++):
    1. Assemble a packet as <method, objp, params>
    2. Send it over the on-chip network
    3. The return value (if any) is sent back as another packet
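The three steps above can be mimicked in software with a small C++ model. It is a sketch under stated assumptions, not the ODYSSEY implementation: the `Network` class, its `attach` and `send` members, the single `int` parameter, and the stand-in method bodies are hypothetical, and the reply packet is modelled simply as the return value of `send`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using MethodId = std::uint8_t;
using ClassId  = std::uint8_t;

struct ObjRef { ClassId cls; std::uint8_t num; };            // objp = <class.num>
struct Packet { MethodId method; ObjRef objp; int param; };  // <method, objp, params>

// Hypothetical model of the on-chip network: nodes are addressed by <method.class>.
class Network {
public:
    using Fu = std::function<int(const Packet&)>;

    // Attach a functional unit at node address <method.class>.
    void attach(MethodId m, ClassId c, Fu fu) {
        nodes_[std::make_pair(m, c)] = std::move(fu);
    }

    // Routing performs the dynamic binding: the class field carried in objp
    // selects which implementation of the method receives the packet.
    int send(const Packet& p) const {
        auto it = nodes_.find(std::make_pair(p.method, p.objp.cls));
        assert(it != nodes_.end() && "no FU at this address");
        return it->second(p);  // stands in for the reply packet
    }

private:
    std::map<std::pair<MethodId, ClassId>, Fu> nodes_;
};

// Calls f() through a reference whose dynamic class is B.
inline int run_example() {
    constexpr MethodId F = 0;
    constexpr ClassId A = 0, B = 1;
    Network net;
    net.attach(F, A, [](const Packet& p) { return p.param + 1; });  // stand-in for A::f
    net.attach(F, B, [](const Packet& p) { return p.param * 2; });  // stand-in for B::f
    ObjRef objp{B, 0};
    return net.send(Packet{F, objp, 10});
}
```

The caller never inspects the class of `objp`; the choice between the two implementations of `f` is made entirely by the address lookup inside `send`.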
Case Study 2: A Codec Engine
  Class diagram: data_block (data[20]: byte; print(), encode(), decode()); xor_encoded_data (cypher: byte; convert_char(byte), encode(), decode()); swap_encoded_data (swap(byte, byte), encode(), decode())
  Legend: hardware methods, software methods
Case Study 2: Implementation in SystemC
Input-Output Correspondence
  Figure labels, system model (C++): class definition (attributes, HW-methods, SW-methods), main() function
  Figure labels, system implementation (SystemC), the OO-ASIP: Object-Management Unit (OMU), processor module with thread__main(), HW-method implementation, SW-method implementation, on-chip network
Big Picture of Tool Flow
  Figure labels, system-level synthesis: system model (C++), parsing + analysis, partitioning, OO-ASIP synthesis (HW-method transformations, HW-structure generator), OO-ASIP compilation (SW-method transformations, SW-structure generator)
  Figure labels, intermediate results: hardware (SystemC), instr-set extensions, software (C++)
  Figure labels, downstream synthesis: SystemC synthesis, traditional processor, C++ compiler, gate-level HW, binary SW, final system
HW-SW Co-simulation Model
  Figure labels, system-level synthesis: system model (C++), parsing + analysis, partitioning, HW-method transformations, HW-structure generator, SW-method transformations, SW-structure generator
  Figure labels, intermediate results: hardware (SystemC), instr-set extensions, software (C++), together forming the co-simulation model
  Figure labels, downstream synthesis: SystemC synthesis, traditional processor, C++ compiler, gate-level HW, binary SW, final system
Experiments on Co-simulation Performance*
  * All experiments were done on a Celeron 2.0 GHz processor with 256 MB of RAM
  ** Worst case assumed: all methods are implemented in hardware
Analysis of Experimental Results
  • High MC/sec. = high communication/computation ratio = most of the time spent in communication instead of computation = potentially low performance in the final implementation
  • Conclusion:
    – Low co-simulation performance ~ potentially low final performance
      => Hint to the designer: decrease the comm./comp. time ratio (e.g. by combining methods)
Summary
  • An ESL design methodology for embedded systems was
    – developed
    – implemented
    – automated
  • The main thrusts:
    – The design methodology
    – The raise in abstraction level of the processor ISA
    – The OO-ASIP processor
Further Research
  • Currently ongoing:
    – Case studies on real-life industrial apps
      • JPEG codec (Morteza NajafVand)
      • MPEG decoder (Naser MohammadZadeh)
    – Object-aware cache
      • Application-specific data prefetching in hardware (Mehdi Modarressi)
    – Synthesis of a multiprocessor OO-ASIP (Hani JavanHemmat)
    – RT-level co-simulation (Ms. Zeinolabedini)
    – Using IP cores in OO-ASIPs (Ms. Hashemi)
    – Fault tolerance by software standby sparing
    – Assertion-based verification
  • A few others
    – Application-specific memory synthesis for OO-ASIP
    – Fault tolerance by dynamic reconfiguration using polymorphism
    – Multithreaded OO-ASIP
Conclusion
  There is scope to raise the abstraction level of processors when designing embedded systems; furthermore, such a raise helps to address modelling, implementation, and reuse challenges in the design and design automation of modern embedded systems.
Supplementary Material
Supplements
  • FDL'03 poster
  • Presentation at Oldenburg
  • Progress Report 1 at the Department of High-Tech. Industries, Ministry of Industries and Mines