MSc - Microprocessors
Dr. Konstantinos Tatas
com.tk@fit.ac.cy

Useful Information
• Instructor: Lecturer K. Tatas
  – Office hours: TBA
  – E-mail: com.tk@fit.ac.cy
  – http://staff.fit.ac.cy/com.tk
• Lecture periods/week: 3
• Duration: 10 weeks
• ECTS: 7 (175 hours)

Course Objectives
• By the end of the course students should be able to:
  – Evaluate the complex trade-offs involved in embedded system design
  – Write detailed embedded system requirements and specification documents
  – Write executable specifications using UML/SystemC
  – Develop applications using ARM Developer Suite
  – Write efficient ARM assembly and C programs in ARM and Thumb mode
  – Analyze program performance using traces
  – Use code transformations to improve performance/code size/power consumption

Course Outline (1/2)
• Week 1: Introduction to embedded systems
  – Embedded microprocessor evolution
  – Design metrics and constraints (performance, power, cost, time-to-market) and design optimization challenges
  – Distributed and real-time systems
• Week 2: Key embedded system technologies
  – Integrated circuit technology
  – Microprocessor technology
  – CAD tool technology
  – Sensor technology
• Week 3: Embedded system specification and modeling
  – Object-oriented specification (UML/C++/SystemC)
  – Assignment 1
• Week 4: Computer architecture
  – Instruction sets, RISC vs. CISC, pipelining
  – The ARM microprocessor architecture
  – ARM assembly, ARM mode, Thumb mode
  – ARM and Thumb instruction set
  – ARM conditional execution
• Week 5: Processor I/O
  – Serial I/O, busy/wait I/O
  – Interrupts, exceptions, traps
  – ARM memory-mapped I/O
  – Caches, memory management units, protection units
  – ARM cache and MMU
  – Assignment 2

Course Outline (2/2)
• Week 6: Assignment 1
• Week 7: Programme design and analysis
  – DFGs
  – Compilers, assemblers, linkers
  – Basic compiler optimizations/code transformations
  – Measuring programme speed
  – Trace-driven performance analysis
  – Energy optimization
  – Programme size optimization
• Week 8: Code transformations
  – Loop unrolling
  – Loop merging
  – Loop tiling
  – Performance-optimizing transformations
• Week 9: Test
• Week 10: Assignment 2

Course Assessment
• Final exam: 40%
• Coursework: 60%
  – Assignment 1: 15%
  – Assignment 2: 15%
  – Quizzes: 10%
  – Test: 10%
  – Lab exercises: 10%

References
• Books
  – W. Wolf, “Computers as Components”
  – W. Wolf, “High-Performance Embedded Computing”
  – H. Kopetz, “Real-Time Systems: Design Principles for Distributed Embedded Applications”
  – S. Furber, “ARM System-on-Chip Architecture”
  – P. Panda, “Memory Issues in Embedded Systems-on-Chip”
  – F. Vahid and T. Givargis, “Embedded System Design: A Unified Hardware/Software Introduction”
  – F. Catthoor, “Data Access and Storage Management for Embedded Programmable Processors”

Microprocessors for Embedded systems
• Computing systems are everywhere
• Most of us think of “desktop” computers
  – PCs
  – Laptops
  – Mainframes
  – Servers
• But there’s another type of computing system
  – Far more common...

Embedded systems overview
• Embedded computing systems
  – Computing systems embedded within electronic devices
  – Hard to define. Nearly any computing system other than a desktop computer
  – Billions of units produced yearly, versus millions of desktop units
  – Perhaps 50 per household and per automobile
[Figure: photos of everyday devices, captioned “Computers are in here... and here... and even here...”]
Lots more of these, though they cost a lot less each.

A “short list” of embedded systems
Anti-lock brakes; Auto-focus cameras; Automatic teller machines; Automatic toll systems; Automatic transmission; Avionic systems; Battery chargers; Camcorders; Cell phones; Cell-phone base stations; Cordless phones; Cruise control; Curbside check-in systems; Digital cameras; Disk drives; Electronic card readers; Electronic instruments; Electronic toys/games; Factory control; Fax machines; Fingerprint identifiers; Home security systems; Life-support systems; Medical testing systems; Modems; MPEG decoders; Network cards; Network switches/routers; On-board navigation; Pagers; Photocopiers; Point-of-sale systems; Portable video games; Printers; Satellite phones; Scanners; Smart ovens/dishwashers; Speech recognizers; Stereo systems; Teleconferencing systems; Televisions; Temperature controllers; Theft tracking systems; TV set-top boxes; VCRs, DVD players; Video game consoles; Video phones; Washers and dryers
And the list goes on and on

Some common characteristics of embedded systems
• Single-functioned
  – Executes a single program, repeatedly
• Tightly-constrained
  – Low cost, low power, small, fast, etc.
• Reactive and real-time
  – Continually reacts to changes in the system’s environment
  – Must compute certain results in real-time without delay

An embedded system example – Digital camera
[Figure: digital camera chip block diagram: lens, CCD, A2D, CCD preprocessor, Pixel coprocessor, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Memory controller, Display ctrl, ISA bus interface, UART, LCD ctrl]
• Single-functioned -- always a digital camera
• Tightly-constrained -- Low cost, low power, small, fast
• Reactive and real-time -- only to a small extent

Embedded Software Development Requires as Much/More Design Effort Than Hardware

A System-on-a-Chip: Example
[Figure courtesy: Philips]

Design at a crossroad
[Figure: System-on-a-Chip example: Multi-Spectral Imager; 500 k Gates FPGA + 1 Gbit DRAM (Preprocessing); 64 SIMD Processor Array + SRAM (Image Conditioning, 100 GOPS); Analog; μC system + 2 Gbit DRAM (Recognition)]
• Embedded applications where cost, performance, and energy are the real issues!
• DSP and control intensive
• Mixed-mode
• Combines programmable and application-specific modules
• Software plays crucial role

Disciplines involved in Embedded System Design
• Digital System Design
• Software Design
• Analog/Mixed-Signal/RF System Design
• Operating Systems
• Microprocessors/Computer Architecture
• Verification
• Testing
• etc.

Languages traditionally used in Embedded System Design
• Specification/modeling
  – UML
  – SDL
  – C/C++
• Hardware design
  – VHDL
  – Verilog
• Software design
  – C/C++
  – Java
  – Assembly
• Verification
  – VHDL/Verilog
  – SystemVerilog
  – Tcl/Tk
  – Vera

Design challenge – optimizing design metrics
• Obvious design goal:
  – Construct an implementation with desired functionality
• Key design challenge:
  – Simultaneously optimize numerous design metrics
• Design metric
  – A measurable feature of a system’s implementation
  – Optimizing design metrics is a key challenge

Design challenge – optimizing design metrics
• Common metrics
  – Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  – NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  – Size: the physical space required by the system
  – Performance: the execution time or throughput of the system
  – Power: the amount of power consumed by the system
  – Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost

Design challenge – optimizing design metrics
• Common metrics (continued)
  – Time-to-prototype: the time needed to build a working version of the system
  – Time-to-market: the time required to develop a system to the point that it can be released and sold to customers
  – Maintainability: the ability to modify the system after its initial release
  – Correctness, safety, many more

Design metric competition -- improving one may worsen others
[Figure: competing metrics (Power, Performance, Size, NRE cost) shown next to the digital camera chip block diagram]
• Expertise with both software and hardware is needed to optimize design metrics
  – Not just a hardware or software expert, as is common
  – A designer must be comfortable with various technologies in order to choose the best for a given application and constraints

Time-to-market: a demanding design metric
[Figure: Revenues ($) versus Time (months)]
• Time required to develop a product to the point it can be sold to customers
• Market window
  – Period during which the product would have highest sales
• Average time-to-market constraint is about 8 months
• Delays can be costly

Losses due to delayed market entry
[Figure: Revenues ($) versus Time, showing market rise and market fall, on-time entry at 0, delayed entry at D, peak at W, end of life at 2W, with the peak revenue for on-time entry and the lower peak revenue from delayed entry]
• Simplified revenue model
  – Product life = 2W, peak at W
  – Time of market entry defines a triangle, representing market penetration
  – Triangle area equals revenue
• Loss
  – The difference between the on-time and delayed triangle areas

Losses due to delayed market entry (cont.)
[Figure: same Revenues ($) versus Time triangles as on the previous slide]
• Area = 1/2 * base * height
  – On-time = 1/2 * 2W * W
  – Delayed = 1/2 * (W - D + W) * (W - D)
• Percentage revenue loss = (D(3W - D) / 2W^2) * 100%
• Try some examples
  – Lifetime 2W = 52 wks, delay D = 4 wks: (4 * (3*26 - 4) / (2 * 26^2)) = 22%
  – Lifetime 2W = 52 wks, delay D = 10 wks: (10 * (3*26 - 10) / (2 * 26^2)) = 50%
  – Delays are costly!
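The triangle model can be checked numerically. The following is a minimal sketch, assuming only the simplified revenue model from the slide (product life 2W, peak at W, entry delayed by D, all in the same time unit); the function name is hypothetical.

```python
def revenue_loss_percent(w: float, d: float) -> float:
    """Percentage revenue loss for a delay d under the triangle model.

    w is half the product lifetime (the market peaks at w), d is the delay.
    """
    if not 0 <= d <= w:
        raise ValueError("the delay must lie between 0 and W")
    on_time = 0.5 * (2 * w) * w            # area of the on-time triangle
    delayed = 0.5 * (2 * w - d) * (w - d)  # area of the delayed triangle
    return (on_time - delayed) / on_time * 100.0


# The two cases from the slide: lifetime 2W = 52 weeks, delays of 4 and 10 weeks.
for delay in (4, 10):
    print(f"D = {delay} weeks: {revenue_loss_percent(26, delay):.0f}% loss")
```

Computing the loss from the two areas, rather than from the closed form D(3W - D) / 2W^2, makes it easy to confirm that both give the same result.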

The performance design metric
• Widely-used measure of system, widely-abused
  – Clock frequency, instructions per second: not good measures
  – Digital camera example: a user cares about how fast it processes images, not clock speed or instructions per second
• Latency (response time)
  – Time between task start and end
  – e.g., Cameras A and B process images in 0.25 seconds
• Throughput
  – Tasks per second, e.g. Camera A processes 4 images per second
  – Throughput can be more than latency seems to imply due to concurrency, e.g. Camera B may process 8 images per second (by capturing a new image while the previous image is being stored)
• Speedup of B over A = B’s performance / A’s performance
  – Throughput speedup = 8/4 = 2

Three key embedded system technologies
• Technology
  – A manner of accomplishing a task, especially using technical processes, methods, or knowledge
• Three key technologies for embedded systems
  – Processor technology
  – IC technology
  – Design technology

Processor technology
• The architecture of the computation engine used to implement a system’s desired functionality
• Processor does not have to be programmable
  – “Processor” not equal to general-purpose processor
[Figure: three processor architectures side by side. General-purpose (“software”): controller (control logic and state register, IR, PC), datapath (register file, general ALU), program memory holding the assembly code for “total = 0, for i = 1 to …”, data memory. Application-specific: controller (control logic and state register, IR, PC), datapath (registers, custom ALU), program memory, data memory. Single-purpose (“hardware”): controller (control logic, state register), datapath (index, total, +), data memory.]

Processor technology
• Processors vary in their customization for the problem at hand
Desired functionality:
    total = 0
    for i = 1 to N loop
        total += M[i]
    end loop
[Figure: the desired functionality mapped onto a general-purpose processor, an application-specific processor, and a single-purpose processor]

General-purpose processors
• Programmable device used in a variety of applications
  – Also known as “microprocessor”
• Features
  – Program memory
  – General datapath with large register file and general ALU
• User benefits
  – Low time-to-market and NRE costs
  – High flexibility
• “Pentium” the most well-known, but there are hundreds of others
[Figure: controller (control logic and state register, IR, PC), datapath (register file, general ALU), program memory with the assembly code for “total = 0, for i = 1 to …”, data memory]

Single-purpose processors
• Digital circuit designed to execute exactly one program
  – a.k.a. coprocessor, accelerator or peripheral
• Features
  – Contains only the components needed to execute a single program
  – No program memory
• Benefits
  – Fast
  – Low power
  – Small size
[Figure: controller (control logic, state register), datapath (index, total, +), data memory]

Application-specific processors
• Programmable processor optimized for a particular class of applications having common characteristics
  – Compromise between general-purpose and single-purpose processors
• Features
  – Program memory
  – Optimized datapath
  – Special functional units
• Benefits
  – Some flexibility, good performance, size and power
[Figure: controller (control logic and state register, IR, PC), datapath (registers, custom ALU), program memory with the assembly code for “total = 0, for i = 1 to …”, data memory]

IC technology
• The manner in which a digital (gate-level) implementation is mapped onto an IC
  – IC: Integrated circuit, or “chip”
  – IC technologies differ in their customization to a design
  – ICs consist of numerous layers (perhaps 10 or more)
• IC technologies differ with respect to who builds each layer and when
[Figure: IC package and IC, with a transistor cross-section showing gate, oxide, source, channel, drain, and silicon substrate]

IC technology Design Approaches
IC Technology Implementation Approaches
• Custom
• Semicustom
  – Cell-based: Standard Cells, Compiled Cells, Macro Cells
  – Array-based: Pre-diffused (Gate Arrays), Pre-wired (FPGAs)

Full-custom design
• All layers are optimized for an embedded system’s particular digital implementation
  – Placing transistors
  – Sizing transistors
  – Routing wires
• Benefits
  – Excellent performance, small size, low power
• Drawbacks
  – High NRE cost (e.g., $300k), long time-to-market

The Custom Approach
[Figure: Intel 4004. Courtesy Intel]

Transition to Automation and Regular Structures
[Figure: die photos of Intel 4004 (’71), Intel 8080, Intel 8085, Intel 8286, Intel 8486. Courtesy Intel]



Semi-custom
• Lower layers are fully or partially built
  – Designers are left with routing of wires and maybe placing some blocks
• Benefits
  – Good performance, good size, less NRE cost than a full-custom implementation (perhaps $10k to $100k)
• Drawbacks
  – Still require weeks to months to develop

Cell-based Design (or standard cells)
Routing channel requirements are reduced by the presence of more interconnect layers

Standard Cell: Example
[Figure from Brodersen 92]

Standard Cell - Example
3-input NAND cell (from ST Microelectronics):
C = load capacitance
T = input rise/fall time


Programmable Logic Devices
• All layers (diffusion, polysilicon, [multi-]metal) may exist
  – Designers can purchase an IC
  – Connections on the IC are either created or destroyed to implement desired functionality
  – Field-Programmable Gate Arrays (FPGAs) and, recently, Gate Arrays are very popular
• Benefits
  – Low NRE costs, almost instant IC availability
• Drawbacks
  – Bigger, expensive (perhaps $30 per unit), power hungry, slower

Gate Array: Sea-of-gates
[Figure: uncommitted cell and committed cell (4-input NOR)]

Sea-of-gate Primitive Cells
[Figure: primitive cells using oxide-isolation and using gate-isolation]

Sea-of-gates
[Figure: die photo showing random logic and memory subsystem. LSI Logic LEA300K (0.6 μm CMOS)]

Prewired Arrays
Classification of prewired arrays (or field-programmable devices):
• Based on Programming Technique
  – Fuse-based (program-once)
  – Non-volatile EPROM based
  – RAM based
• Programmable Logic Style
  – Array-Based
  – Look-up Table
• Programmable Interconnect Style
  – Channel-routing
  – Mesh networks

Altera MAX
[Figure from Smith 97]

Altera MAX Interconnect Architecture
[Figure: two interconnect styles. Array-based (MAX 3000-7000): LAB1, LAB2, … LAB6 connected through the PIA, with delay tPIA. Mesh-based (MAX 9000): LABs connected through row channels and column channels.]

LUT-Based Logic Cell
[Figure: Xilinx 4000 Series logic cell: inputs C1..C4, F1..F4 and D1..D4 feed look-up tables implementing logic functions; configuration bits control the multiplexers (multiplexer controlled by configuration program)]

Array-Based Programmable Wiring
[Figure: cells connected by horizontal tracks and vertical tracks, with interconnect points, programmed interconnections, and input/output pins]

Transistor Implementation of Mesh
[Figure courtesy DeHon and Wawrzynek]

RAM-based FPGA
[Figure: Xilinx XC4000ex]

Design Technology
• The manner in which we convert our concept of desired system functionality into an implementation
• Compilation/Synthesis: automates exploration and insertion of implementation details for the lower level
• Libraries/IP: incorporates predesigned implementation from a lower abstraction level into a higher level
• Test/Verification: ensures correct functionality at each level, thus reducing costly iterations between levels
Abstraction levels, each with its compilation/synthesis, libraries/IP, and test/verification technology:
  – System specification: system synthesis; Hw/Sw/OS; model simulators/checkers
  – Behavioral specification: behavior synthesis; cores; Hw-Sw cosimulators
  – RT specification: RT synthesis; RT components; HDL simulators
  – Logic specification: logic synthesis; gates/cells; gate simulators
  – To final implementation

The co-design ladder
• In the past:
  – Hardware and software design technologies were very different
  – Recent maturation of synthesis enables a unified view of hardware and software
• Hardware/software “codesign”
[Figure: the co-design ladder, starting from sequential program code (e.g., C, VHDL). Software side: compilers (1960s, 1970s), assembly instructions, assemblers and linkers (1950s, 1960s), machine instructions, microprocessor plus program bits: “software”. Hardware side: behavioral synthesis (1990s), register transfers, RT synthesis (1980s, 1990s), logic equations / FSMs, logic synthesis (1970s, 1980s), logic gates, VLSI, ASIC, or PLD implementation: “hardware”.]
The choice of hardware versus software for a particular function is simply a tradeoff among various design metrics, like performance, power, size, NRE cost, and especially flexibility; there is no fundamental difference between what hardware or software can implement.

Independence of processor and IC technologies
• Basic tradeoff
  – General vs. custom
  – With respect to processor technology or IC technology
  – The two technologies are independent
• Processor technology, from general to customized: general-purpose processor, ASIP, single-purpose processor
• IC technology, from general to customized: PLD, semi-custom, full-custom
• General, providing improved: flexibility, maintainability, NRE cost, time-to-prototype, time-to-market, cost (low volume)
• Customized, providing improved: power efficiency, performance, size, cost (high volume)

Design Decision Trade-offs

Generalised Design Flow

Architecture ReUse
• Silicon System Platform
  – Flexible architecture for hardware and software
  – Specific (programmable) components
  – Network architecture
  – Software modules
  – Rules and guidelines for design of HW and SW
• Has been successful in PCs
  – Dominance of a few players who specify and control architecture
• Application-domain specific (difference in constraints)
  – Speed (compute power)
  – Dissipation
  – Costs
  – Real / non-real time data

Platform-Based Design
• “Only the consumer gets freedom of choice; designers need freedom from choice” (Orfali, et al, 1996, p. 522)
• A platform is a restriction on the space of possible implementation choices, providing a well-defined abstraction of the underlying technology for the application developer
• New platforms will be defined at the architecture-micro-architecture boundary
• They will be component-based, and will provide a range of choices from structured-custom to fully programmable implementations
• Key to such approaches is the representation of communication in the platform model
Source: R. Newton

Platform-based Design – System-on-Chip
• Use of predefined Intellectual Property (IP)
• A platform-based system consists of a RISC processor, memories, busses and a common language
• Platform-based design poses the problem of partitioning a solution between hardware (HDL) and software (programming processors)

Platforms Enable Simplified SoC Design
[Figure: platform structure: Core, Near Peripherals, Far Peripherals]
• Customer demands
  – Fast turn-around time
  – Easy access to pre-qualified building blocks
  – Web enabled
• Design technology
  – Core platforms
  – ‘Big’ IP
  – Emerging SoC bus standards
  – Embedded software
  – HW/SW co-verification

And Automation of IP Selection & Integration

Heterogeneous Programmable Platforms
[Figure: Xilinx Virtex-II Pro: FPGA fabric, embedded memories, embedded PowerPC, hardwired multipliers, high-speed I/O]

Xilinx’s products

Xilinx’s products (cont.)

Comparison of CMOS design methods
Columns: NRE; Unit Cost; Power Dissipation; Complexity of Implementation; Time-to-Market; Performance; Flexibility
• μProcessor/DSP: low, medium, high, low, low, high
• PLA: low, medium, low
• FPGA: low, high, medium, medium
• Gate Array: medium, low, medium
• Cell Based: high, low, high, low
• Custom Design: high, low, high, very high, low
• Platform Based: high, low/medium, low, high, medium/low, high, medium

Impact of Implementation Choices
[Figure: Energy Efficiency (in MOPS/mW) versus Flexibility (or application scope), from “None” through “Somewhat flexible” to “Fully flexible”]
• Hardwired custom: 100-1000
• Configurable/Parameterizable: 10-100
• Domain-specific processor (e.g. DSP): 1-10
• Embedded microprocessor: 0.1-1

Design Economics (1)
• The selling price of an IC: Stotal = Ctotal / (1 - m), where Ctotal is the manufacturing cost for a single IC and m is the desired profit margin
• Costs to produce an IC
  – Non-recurring engineering costs (NREs)
  – Recurring engineering costs
  – Fixed costs

Design Economics (2)
• Non-recurring engineering costs (NREs)
  – Engineering design cost
  – Prototype manufacturing cost
• Recurring costs
  – Process
  – Package
  – Test

NRE and unit cost metrics
• Costs:
  – Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  – NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  – total cost = NRE cost + unit cost * # of units
  – per-product cost = total cost / # of units = (NRE cost / # of units) + unit cost
• Example
  – NRE = $2000, unit = $100
  – For 10 units
  – total cost = $2000 + 10 * $100 = $3000
  – per-product cost = $2000/10 + $100 = $300
Amortizing NRE cost over the units results in an additional $200 per unit
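The two cost formulas translate directly into code. This is a minimal sketch using the numbers from the example; the function names are hypothetical.

```python
def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """total cost = NRE cost + unit cost * number of units."""
    return nre + unit_cost * units


def per_product_cost(nre: float, unit_cost: float, units: int) -> float:
    """per-product cost = (NRE cost / number of units) + unit cost."""
    if units <= 0:
        raise ValueError("at least one unit must be produced")
    return nre / units + unit_cost


# Example from the slide: NRE = $2000, unit cost = $100, 10 units.
print(total_cost(2000, 100, 10))        # total cost in dollars
print(per_product_cost(2000, 100, 10))  # per-product cost in dollars
```

As the number of units grows, the NRE share per product shrinks and the per-product cost approaches the unit cost.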

NRE and unit cost metrics
• Compare technologies by costs -- best depends on quantity
  – Technology A: NRE = $2,000, unit = $100
  – Technology B: NRE = $30,000, unit = $30
  – Technology C: NRE = $100,000, unit = $2
• But, must also consider time-to-market
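To see how the best choice depends on quantity, the three technologies can be compared at different volumes. The sketch below assumes the linear cost model total cost = NRE + unit cost * units and ignores time-to-market; the helper names are hypothetical.

```python
# (NRE cost, unit cost) in dollars for each technology on the slide.
TECHNOLOGIES = {"A": (2_000, 100), "B": (30_000, 30), "C": (100_000, 2)}


def total_cost(name: str, units: int) -> int:
    nre, unit_cost = TECHNOLOGIES[name]
    return nre + unit_cost * units


def cheapest(units: int) -> str:
    """Technology with the lowest total cost at the given volume."""
    return min(TECHNOLOGIES, key=lambda name: total_cost(name, units))


def break_even(first: str, second: str) -> float:
    """Volume at which the two technologies cost the same in total."""
    nre_1, unit_1 = TECHNOLOGIES[first]
    nre_2, unit_2 = TECHNOLOGIES[second]
    return (nre_2 - nre_1) / (unit_1 - unit_2)


for volume in (100, 1_000, 5_000):
    print(volume, cheapest(volume))
print(break_even("A", "B"), break_even("B", "C"))
```

Under this model A is cheapest up to 400 units, B between 400 and 2,500 units, and C beyond that.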

Wafer and die cost
Die yield: number of good dies / total number of dies

Example
• Calculate the minimum shelf price of the chip
• Assuming:
  – 20 engineers are employed full-time for a year with a $50,000/year average salary
  – Additional $200,000 overhead costs, of which $100,000 is for total testing
  – A wafer cost of $200 per wafer
  – A $2 packaging cost per chip
  – 10 dies/wafer
  – 70% die yield
  – 98% final test yield
  – A market for 100,000 items
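One possible way to work this example is sketched below. It rests on assumptions that the slide does not state: the $100,000 testing cost is already part of the $200,000 overhead, every good die is packaged before the final test (so chips failing the final test still incur die and packaging cost), and the minimum price corresponds to a profit margin of zero. The function and parameter names are hypothetical.

```python
def minimum_chip_price(
    engineers: int = 20,
    salary: float = 50_000.0,
    overhead: float = 200_000.0,   # assumed to include the testing cost
    wafer_cost: float = 200.0,
    dies_per_wafer: int = 10,
    die_yield: float = 0.70,
    packaging_cost: float = 2.0,
    test_yield: float = 0.98,
    volume: int = 100_000,
) -> float:
    """Break-even price per sellable chip (zero profit margin)."""
    nre = engineers * salary + overhead
    good_die_cost = wafer_cost / (dies_per_wafer * die_yield)
    # Packaged chips that fail the final test are paid for by the good ones.
    recurring = (good_die_cost + packaging_cost) / test_yield
    return nre / volume + recurring


print(f"${minimum_chip_price():.2f} per chip")
```

With these assumptions the NRE share is $12 per chip and the recurring cost is a little over $31 per chip. A different reading of the overhead or of the test flow would change the result.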

Design productivity exponential increase
[Figure: Productivity (K) Trans./Staff-Mo. on a logarithmic axis from 0.01 to 100,000, plotted for the years 1981 to 2009]
• Exponential increase over the past few decades

The growing design-productivity gap
Moore’s Law: Design Productivity Crisis (SRC 1997)
[Figure, two panels. “Standard cell density and speed”: Density (Kgates/mm2) and ASIC clock (MHz) versus year, 2001 to 2015. “Potential Design Complexity and Designer Productivity”: Logic Transistors per Chip (M) and Productivity (K) Trans./Staff-Mo. versus year, 1981 to 2009, with a 58%/yr compounded complexity growth rate and a 21%/yr compounded productivity growth rate.]

Design productivity gap
• 1981 leading edge chip required 100 designer months
  – 10,000 transistors / 100 transistors/month
• 2002 leading edge chip requires 30,000 designer months
  – 150,000,000 transistors / 5000 transistors/month
• Designer cost increase from $1M to $300M
• While designer productivity has grown at an impressive rate over the past decades, the rate of improvement has not kept pace with chip capacity

The mythical man-month
• The situation is even worse than the productivity gap indicates
• In theory, adding designers to a team reduces project completion time
• In reality, productivity per designer decreases due to complexities of team management and communication
• In the software community, known as “the mythical man-month” (Brooks 1975)
• At some point, can actually lengthen project completion time! (“Too many cooks”)
• Example
  – 1M transistors, 1 designer = 5000 trans/month
  – Each additional designer reduces productivity by 100 trans/month per designer
  – So 2 designers produce 4900 trans/month each
[Figure: team and individual productivity (transistors/month) versus number of designers (0 to 40), annotated with months until completion: 43, 24, 19, 16, 15, 16, 18, 23]
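The team-size example can be explored with a short script. This is a sketch only: it assumes a base productivity of 5000 transistors/month for a single designer (the value consistent with two designers producing 4900 each) and a linear penalty of 100 transistors/month per additional designer. The function name is hypothetical.

```python
def months_to_complete(
    designers: int,
    transistors: int = 1_000_000,
    base_rate: int = 5_000,  # transistors/month for a lone designer (assumed)
    penalty: int = 100,      # rate lost per designer for each extra team member
) -> float:
    """Project duration under the simple linear productivity-loss model."""
    rate_per_designer = base_rate - penalty * (designers - 1)
    if designers < 1 or rate_per_designer <= 0:
        raise ValueError("team size outside the range the model supports")
    return transistors / (designers * rate_per_designer)


for team in (1, 2, 10, 25, 40):
    print(f"{team:2d} designers: {months_to_complete(team):6.1f} months")

# Team size with the shortest schedule, searched over 1 to 50 designers.
best = min(range(1, 51), key=months_to_complete)
print("shortest schedule with", best, "designers")
```

In this model the schedule improves up to about 25 designers and then lengthens again, which is the “too many cooks” effect described on the slide.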

Summary
• Embedded systems are everywhere
• Key challenge: optimization of design metrics
  – Design metrics compete with one another
• A unified view of hardware and software is necessary to improve productivity
• Three key technologies
  – Processor: general-purpose, application-specific, single-purpose
  – IC: full-custom, semi-custom, PLD
  – Design: compilation/synthesis, libraries/IP, test/verification

Real-time and distributed systems
Dr. Konstantinos Tatas

What is real-time? Is there any other kind?
• A real-time computer system is a computer system where the correctness of the system behavior depends not only on the logical results of the computations, but also on the physical time when these results are produced.
• By system behavior we mean the sequence of outputs in time of a system.


Real-time means reactive
• A real-time computer system must react to stimuli from its environment.
• The instant when a result must be produced is called a deadline.
• If a result has utility even after the deadline has passed, the deadline is classified as soft; otherwise it is firm.
• If severe consequences could result if a firm deadline is missed, the deadline is called hard.
• Example: Consider a traffic signal at a road before a railway crossing. If the traffic signal does not change to red before the train arrives, an accident could result.
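The three deadline classes above amount to a two-question decision. The following is a minimal sketch of that decision; the names `Deadline` and `classify_deadline` are hypothetical and only illustrate the definitions on this slide.

```python
from enum import Enum


class Deadline(Enum):
    SOFT = "soft"
    FIRM = "firm"
    HARD = "hard"


def classify_deadline(useful_after_deadline, severe_if_missed):
    """Classify a deadline using the two criteria from the slide."""
    if useful_after_deadline:
        # A late result still has some utility.
        return Deadline.SOFT
    # A late result is useless, so the deadline is at least firm;
    # it is hard if missing it could have severe consequences.
    return Deadline.HARD if severe_if_missed else Deadline.FIRM


if __name__ == "__main__":
    # Railway-crossing signal: a late result is useless and dangerous.
    print(classify_deadline(useful_after_deadline=False, severe_if_missed=True))
```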


Reliability
• The reliability R(t) of a system is the probability that the system will provide the specified service until time t, given that it was operational at the beginning (t = t_0).
• The probability that a system will fail in a given interval of time is expressed by the failure rate, measured in FITs (Failure In Time).
• A failure rate of 1 FIT means that the mean time to failure (MTTF) of a device is 10^9 h, i.e., one failure occurs in about 115,000 years.
• If a system has a constant failure rate of λ failures/h, then the reliability at time t is given by
R(t) = exp(-λ(t - t_0))
MTTF = 1/λ
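As a quick check of these definitions, the sketch below converts a failure rate in FIT to λ in failures/h and evaluates R(t) and the MTTF. The helper names are illustrative, and a year is assumed to be 8,760 hours.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: 8,760 h, leap years ignored


def fit_to_lambda(fit):
    """Convert a failure rate in FIT (failures per 10^9 h) to failures/h."""
    return fit * 1e-9


def mttf_hours(lam):
    """Mean time to failure for a constant failure rate lam (failures/h)."""
    return 1.0 / lam


def reliability(t, lam, t0=0.0):
    """R(t) = exp(-lam * (t - t0)) for a constant failure rate."""
    return math.exp(-lam * (t - t0))


if __name__ == "__main__":
    lam = fit_to_lambda(1)
    print(f"MTTF at 1 FIT: {mttf_hours(lam) / HOURS_PER_YEAR:,.0f} years")
    print(f"R(MTTF) = {reliability(mttf_hours(lam), lam):.3f}")
```

Note that with a constant failure rate the reliability at t = MTTF is exp(-1), about 0.37, so most devices have already failed by their MTTF.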


Example
• What must the system failure rate be so that 99% of the systems in the field work reliably for the first 100,000 hours?
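One way to approach this example, assuming a constant failure rate and t_0 = 0, is to solve R(t) = exp(-λt) for λ, which gives λ = -ln(R)/t. The sketch below does this; the function name is hypothetical. For R = 0.99 and t = 100,000 h the result is about 1.0 × 10^-7 failures/h, i.e. roughly 100 FIT.

```python
import math


def required_failure_rate(target_reliability, mission_hours):
    """Largest constant failure rate (failures/h) that still gives
    R(mission_hours) >= target_reliability, assuming t0 = 0."""
    if not 0.0 < target_reliability < 1.0:
        raise ValueError("target_reliability must lie strictly between 0 and 1")
    if mission_hours <= 0:
        raise ValueError("mission_hours must be positive")
    return -math.log(target_reliability) / mission_hours


if __name__ == "__main__":
    lam = required_failure_rate(0.99, 100_000)
    print(f"lambda = {lam:.4e} failures/h")
    print(f"       = {lam * 1e9:.1f} FIT")
    print(f"MTTF   = {1.0 / lam:,.0f} h")
```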

Safety


Maintainability


Name some hard, firm and soft deadline embedded systems



Example
• An automotive company produces 2,000,000 electronic engine controllers of a special type. The following design alternatives are discussed:
(a) Construct the engine control unit as a single smallest replaceable unit (SRU) with the application software in Read Only Memory (ROM). The production cost of such a unit is $250. In case of an error, the complete unit has to be replaced.
(b) Construct the engine control unit such that the software is contained in a ROM that is placed on a socket and can be replaced in case of a software error. The production cost of the unit without the ROM is $248. The cost of the ROM is $5.
(c) Construct the engine control unit as a single SRU where the software is loaded in a Flash EPROM that can be reloaded. The production cost of such a unit is $255.
• The labor cost of repair is assumed to be $50 for each vehicle (the same for each of the three alternatives).
• Calculate the cost of a software error for each of the three alternative designs if 300,000 cars have to be recalled because of the software error (example in Sect. 1.6.1).
• Which one is the lowest-cost alternative if only 1,000 cars are affected by a recall?
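A possible way to work through this exercise is sketched below. It rests on several assumptions that the slide does not state explicitly: a production run of 2,000,000 units, a recall that replaces the whole unit in (a), only the ROM in (b) and no parts in (c), and a comparison based on production cost plus recall cost. All names are illustrative.

```python
PRODUCED = 2_000_000  # assumed production volume (units)
LABOR = 50            # labor cost of repair per recalled vehicle ($)

# design -> (production cost per unit, parts cost per recalled vehicle), in $
DESIGNS = {
    "a": (250, 250),    # whole unit is replaced
    "b": (248 + 5, 5),  # only the socketed ROM is replaced
    "c": (255, 0),      # Flash EPROM is reloaded, no parts needed
}


def recall_cost(design, recalled):
    """Cost of the software error alone: parts plus labor per recalled car."""
    _, parts = DESIGNS[design]
    return recalled * (parts + LABOR)


def total_cost(design, recalled, produced=PRODUCED):
    """Production cost of the whole run plus the cost of the recall."""
    unit, _ = DESIGNS[design]
    return produced * unit + recall_cost(design, recalled)


def cheapest(recalled, produced=PRODUCED):
    """Design with the lowest total cost for a given recall size."""
    return min(DESIGNS, key=lambda d: total_cost(d, recalled, produced))


if __name__ == "__main__":
    for recalled in (300_000, 1_000):
        print(f"Recall of {recalled:,} cars:")
        for d in DESIGNS:
            print(f"  ({d}) error cost ${recall_cost(d, recalled):,}, "
                  f"total ${total_cost(d, recalled):,}")
        print(f"  lowest total cost: ({cheapest(recalled)})")
```

Under these assumptions the cheaper-to-build design (a) wins when recalls are small, while the socketed ROM of design (b) pays off for a large recall; the trade-off is between production cost and maintainability.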


Distributed RT system model
• From the point of view of an outside observer, a real-time (RT) system can be decomposed into three communicating subsystems:
– a controlled object (the physical subsystem, whose behavior is governed by the laws of physics)
– a “distributed” computer subsystem (the cyber system, whose behavior is governed by the programs that are executed on digital computers)
– a human user or operator
• The distributed computer system consists of computational nodes that interact by the exchange of messages.
• A computational node can host one or more computational components.

Event-Triggered Control Versus Time-Triggered Control

