
Computing Platforms
Chapter 4
COE 306: Introduction to Embedded Systems
Dr. Aiman El-Maleh
Computer Engineering Department
College of Computer Sciences and Engineering
King Fahd University of Petroleum and Minerals

Next...
• Basic Computing Platforms
• The CPU bus
• Direct Memory Access (DMA)
• System Bus Configurations
• ARM Bus: AMBA 2.0
• Memory Components
• Embedded Platforms
• Platform-Level Performance
Computing Platforms COE 306– Introduction to Embedded System– KFUPM slide 2

Computing Platforms
• Computing platforms are created using microprocessors, I/O devices, and memory components
• A CPU bus is required to connect the CPU to other devices
• Software is required to implement an application
• Embedded system software is closely tied to the hardware
• Computing platform: hardware and software

Platform Hardware Components
• A DMA controller provides direct memory access
• Timers are used by the operating system
• A high-speed bus, connected to the CPU bus through a bridge, allows fast devices to communicate efficiently
• A low-speed bus provides an inexpensive way to connect simpler devices

Example: PIC16F882
• Harvard architecture: flash memory is programmed separately
• Multiple I/O devices

Example: Intel StrongARM SA-1100
• The system control module contains:
  ◦ a real-time clock
  ◦ an operating system timer
  ◦ 28 general-purpose I/Os
  ◦ an interrupt controller
  ◦ a power manager controller
  ◦ a reset controller that handles resetting the processor
• The SA-1111 is a companion chip that provides a suite of I/O functions: USB host controller, PS/2 ports, PCMCIA interface, SSP serial port

Platform Software Components
• The hardware abstraction layer (HAL) provides a basic level of abstraction from the hardware
• The operating system and file system provide the basic abstractions required to build complex applications
• Library routines are used to perform complex kernel functions
• The application makes use of all these layers, either directly or indirectly

Embedded Software Stack
• A HAL (Hardware Abstraction Layer)
  ◦ defines a set of routines, protocols, and tools for interacting with the hardware
  ◦ focuses on high-level functions that make the hardware do something without requiring detailed knowledge of how it does it
  ◦ allows the hardware to change without changing the application
  ◦ Example: Cortex Microcontroller Software Interface Standard (CMSIS)

Embedded Software Stack
• An API (Application Programming Interface)
  ◦ defines a set of routines, protocols, and tools for creating an application
  ◦ defines the high-level interface to a component: its behavior, capabilities, inputs, and outputs
  ◦ acts as a toolkit that helps high-level developers generate application code quickly
  ◦ provides common interface code for controlling the real-time behavior of the system and for accessing common components such as serial communication and file access
• Using a layered software architecture can dramatically increase the reusability of embedded software

The CPU bus
• The CPU bus is the collection of wires, together with the protocol, by which the CPU communicates with memory and devices
• The CPU is the bus master: it initiates all transfers
• Control signals: e.g. data ready, read/write

Bus Protocols
• The bus protocol determines how devices communicate
• The basic building block of most bus protocols is the four-cycle handshake:
  1. Device 1 raises enq
  2. Device 2 responds with ack
  3. Device 1 lowers enq once it has finished
  4. Device 2 lowers ack
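As a rough illustration of the four-cycle handshake, the sketch below records the levels of the two lines after each step. The function name and the tuple-based trace are illustrative choices, not part of the slides, and real buses add timing constraints that this toy model ignores.

```python
def four_cycle_handshake():
    """Return the (enq, ack) levels at idle and after each handshake step."""
    enq, ack = 0, 0
    trace = [(enq, ack)]       # both lines idle low
    enq = 1                    # 1. device 1 raises enq
    trace.append((enq, ack))
    ack = 1                    # 2. device 2 responds with ack
    trace.append((enq, ack))
    enq = 0                    # 3. device 1 lowers enq once it has finished
    trace.append((enq, ack))
    ack = 0                    # 4. device 2 lowers ack
    trace.append((enq, ack))
    return trace


if __name__ == "__main__":
    for step, (enq, ack) in enumerate(four_cycle_handshake()):
        print(f"step {step}: enq={enq} ack={ack}")
```

Both lines end where they started, so the next handshake can begin from the same idle state.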

Timing Diagrams
• The behavior of a bus is most often specified as a timing diagram

Read Followed by Write [figure]

Reading From a Slow Device [figure]

Burst Read [figure]

Direct Memory Access (DMA)
• Direct memory access performs data transfers without executing instructions
  ◦ The CPU sets up the transfer
  ◦ The DMA controller fetches and writes the data
• Allows hardware subsystems to access main memory without involving the CPU

DMA Controller
• The CPU controls the DMA controller by setting 3 registers:
  ◦ Starting address: where the transfer begins
  ◦ Length: number of words to be transferred
  ◦ Status: used to operate the DMA controller
• To start a transfer, the CPU sets the 3 registers
• Once done, the DMA controller interrupts the CPU
• During a DMA transfer, the CPU cannot use the bus
  ◦ It can still use the cache and its registers
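The register-level interaction described above can be sketched as a toy software model. Everything here is hypothetical: the class name, the string-valued status register, and the list standing in for main memory are illustrative assumptions, not a real controller's programming interface.

```python
class DMAController:
    """Toy model: three CPU-visible registers plus a completion interrupt flag."""

    def __init__(self, memory):
        self.memory = memory            # list of words standing in for main memory
        self.start_address = 0          # starting address register
        self.length = 0                 # length register (number of words)
        self.status = "idle"            # status register (hypothetical encoding)
        self.interrupt_pending = False

    def start(self, start_address, length):
        """CPU side: program the three registers to launch a transfer."""
        self.start_address = start_address
        self.length = length
        self.status = "busy"
        self.interrupt_pending = False

    def run(self, source_words):
        """Controller side: move the words without CPU instructions, then interrupt."""
        for offset in range(self.length):
            self.memory[self.start_address + offset] = source_words[offset]
        self.status = "idle"
        self.interrupt_pending = True   # signals completion to the CPU
```

A short usage example: after `dma.start(2, 3)` and `dma.run([7, 8, 9])` on an eight-word memory of zeros, words 2 through 4 hold the new data and `interrupt_pending` is set.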

DMA Controller
• Once the DMA controller is bus master, it transfers automatically
  ◦ May run continuously until complete
  ◦ May use every nth bus cycle
• To prevent the CPU from idling for too long, most DMA controllers return control to the CPU after transferring a preset number of words, e.g. 4, 8, or 16

System Bus Configurations
• A microprocessor system often has more than one bus
  ◦ High-speed devices are connected to a high-performance bus
  ◦ Lower-speed devices are connected to a different bus
  ◦ A bridge allows the buses to connect to each other

Multiple Buses
• Reasons for using multiple system buses:
  ◦ Higher-speed buses may use wider data connections
  ◦ Higher-speed buses require more expensive circuits and connectors
  ◦ Lower-speed devices can use lower-speed circuits and connectors, lowering their prices
  ◦ Bridges connecting two buses may allow them to operate independently
    ▪ I/O parallelism

Bus Bridge
• Slave on the fast bus
• Master on the slow bus
• Protocol translator

Standard Bus Architectures
• AMBA (ARM)
• CoreConnect (IBM)
• Sonics Smart Interconnect (Sonics) [widely used]
• STBus (STMicroelectronics)
• Wishbone (OpenCores)
• Avalon (Altera)
• PI Bus (OMI)
• MARBLE (Univ. of Manchester)
• CoreFrame (PalmChip)

ARM Bus: AMBA 2.0
• Advanced Microcontroller Bus Architecture (AMBA)
  ◦ Open standard specification for the connection and management of functional blocks in a System-on-Chip (SoC)
  ◦ Supports CPUs, memories, and peripherals in an SoC
  ◦ Defines multiple buses, e.g. AHB, ASB, APB, etc.
  ◦ Features: pipelining, burst transfers, split transactions, multiple masters, etc.

AMBA Example: LPC1768 [figure]

Advanced High-Performance Bus (AHB)
• High performance, pipelined operation, burst transfers, multiple bus masters, split transactions

AHB Arbitration
• The arbitration protocol is specified, but not the arbitration policy

AHB Signals [table not reproduced]

AHB Signals, continued [table not reproduced]

Overview of AMBA AHB operation
• Every transfer consists of
  ◦ an address and control cycle
  ◦ one or more cycles for the data
• The data phase can be extended using the HREADY signal
  ◦ When LOW, this signal causes wait states to be inserted
• During a transfer the slave shows the status using the response signals, HRESP[1:0]
  ◦ OKAY: the transfer is progressing normally
  ◦ ERROR: a transfer error has occurred
  ◦ RETRY and SPLIT: the transfer cannot complete immediately, but the bus master should continue to attempt the transfer

AHB Basic Transfer
• An AHB transfer consists of two distinct sections:
  ◦ The address phase, which lasts only a single cycle
  ◦ The data phase, which may require several cycles; this is achieved using the HREADY signal
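To make the two phases concrete, the sketch below counts the cycles of a single, isolated transfer from a list of HREADY samples taken during the data phase. The function names are illustrative, and the model deliberately ignores pipelining, in which the address phase of one transfer overlaps the data phase of the previous one.

```python
def data_phase_cycles(hready_samples):
    """The data phase ends on the first cycle in which HREADY is sampled HIGH."""
    for cycle, hready in enumerate(hready_samples, start=1):
        if hready:
            return cycle
    raise ValueError("HREADY never went HIGH; the transfer did not complete")


def transfer_cycles(hready_samples):
    """One address-phase cycle plus however long the data phase lasts."""
    return 1 + data_phase_cycles(hready_samples)


if __name__ == "__main__":
    print(transfer_cycles([1]))        # no wait states
    print(transfer_cycles([0, 0, 1]))  # slave inserts two wait states
```

Each LOW sample of HREADY adds one wait state, so a slave that needs two extra cycles stretches the transfer from two cycles to four.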

AHB Basic Transfer
• Data transfer with slave wait states [figure]

AHB Pipelining
• Transaction pipelining increases bus bandwidth

Cost of Arbitration in AHB
• A master gains ownership of the address bus when HGRANTx is HIGH and HREADY is HIGH at the rising edge of HCLK

AHB Pipelined Burst Transfers
• Bursts cut down on arbitration and handshaking time, improving performance

AHB Burst Types
• Incremental bursts access sequential locations
  ◦ e.g. 0x64, 0x68, 0x6C, 0x70 for INCR4, transferring 4-byte data
• Wrapping bursts "wrap around" the address if the starting address is not aligned to the total number of bytes in the transfer
  ◦ e.g. 0x64, 0x68, 0x6C, 0x60 for WRAP4, transferring 4-byte data
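The two address sequences can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative, and the wrap boundary is taken to be the burst's total byte count (beats times transfer size), which matches the WRAP4 example above.

```python
def incr_burst(start, beats, size_bytes):
    """Incrementing burst: each beat's address is the previous one plus the size."""
    return [start + i * size_bytes for i in range(beats)]


def wrap_burst(start, beats, size_bytes):
    """Wrapping burst: addresses wrap at a boundary of beats * size_bytes."""
    total = beats * size_bytes
    base = (start // total) * total          # aligned start of the wrap window
    return [base + (start - base + i * size_bytes) % total for i in range(beats)]


if __name__ == "__main__":
    print([hex(a) for a in incr_burst(0x64, 4, 4)])
    print([hex(a) for a in wrap_burst(0x64, 4, 4)])
```

For a start address of 0x64 with four 4-byte beats, the wrap window is 0x60 to 0x6F, which is why the last beat lands on 0x60 instead of 0x70.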

AHB Control Signals
• Transfer direction
  ◦ HWRITE: write transfer when HIGH, read transfer when LOW
• Transfer size
  ◦ HSIZE[2:0] indicates the size of the transfer

AHB Transfer Type [table not reproduced]

Different Transfer Types [figure]

Undefined-Length Bursts [figure]

AHB Transfer Response [table not reproduced]

AHB Transfer with Error Response [figure]

AHB Transfer with Retry Response [figure]

AHB Split Transfers
• The SPLIT response provides a mechanism for slaves to release the bus when they are unable to supply data for a transfer

Retry and Split Transfers
• The SPLIT and RETRY responses allow slaves to delay the completion of a transfer while freeing up the bus for use by other masters.
• For RETRY, the arbiter continues to use the normal priority scheme, so only masters with a higher priority gain access to the bus.
• For SPLIT, the arbiter adjusts the priority scheme so that any other master requesting the bus gets access, even if it has a lower priority.
• For a SPLIT transfer to complete, the arbiter must be informed when the slave has the data available.

AHB Bus Master Interface [figure]

AHB Bus Slave Interface [figure]

AHB Arbiter Interface [figure]

AMBA Advanced Peripheral Bus (APB)
• Low power, latched address and control, simple interface, suitable for many peripherals
• No (multi-cycle) bursts, no pipelined transfers
• Bus activity is described by a state diagram
  ◦ IDLE: the default state for the peripheral bus
  ◦ SETUP: when a transfer is required, the bus moves into the SETUP state and the select signal, PSELx, is asserted
  ◦ ENABLE: PENABLE is asserted; the address, write, and select signals remain stable during the transition from SETUP to ENABLE
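The three-state behavior can be sketched as a small state machine. The slide only names the states; the transitions out of ENABLE used here (back to SETUP if another transfer is waiting, otherwise to IDLE) follow the AMBA 2.0 APB state diagram as commonly described, and the function names and boolean request input are illustrative assumptions.

```python
def apb_next_state(state, transfer_required):
    """Return the next APB bus state after one clock edge."""
    if state == "IDLE":
        return "SETUP" if transfer_required else "IDLE"
    if state == "SETUP":
        return "ENABLE"                     # SETUP always lasts a single cycle
    if state == "ENABLE":
        # Assumed from the AMBA 2.0 state diagram: chain into the next
        # transfer if one is waiting, otherwise return to IDLE.
        return "SETUP" if transfer_required else "IDLE"
    raise ValueError(f"unknown state: {state}")


def apb_trace(requests):
    """Apply one request flag per clock edge, starting from IDLE."""
    state = "IDLE"
    trace = [state]
    for required in requests:
        state = apb_next_state(state, required)
        trace.append(state)
    return trace


if __name__ == "__main__":
    print(apb_trace([True, False, False]))
```

Because every transfer passes through exactly one SETUP cycle and one ENABLE cycle, a single APB access in this model occupies two bus cycles.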

APB Write Transfer
• To reduce power consumption, the address and write signals do not change after a transfer until the next access occurs

APB Read Transfer [figure]

AHB-APB Bridge
• The bridge converts system bus transfers into APB transfers and performs the following functions:
  ◦ Latches the address and holds it valid throughout the transfer
  ◦ Decodes the address and generates a peripheral select, PSELx
  ◦ Drives the data onto the APB for a write transfer
  ◦ Drives the APB data onto the system bus for a read transfer
  ◦ Generates a timing strobe, PENABLE, for the transfer

APB Slave Interface
• The APB slave interface is very flexible.
• For a write transfer the data can be latched at the following points:
  ◦ on either rising edge of PCLK, when PSEL is HIGH
  ◦ on the rising edge of PENABLE, when PSEL is HIGH
• For read transfers the data can be driven onto the data bus when PWRITE is LOW and both PSELx and PENABLE are HIGH.

Interfacing APB to AHB: Read Transfer [figure]

Interfacing APB to AHB: Burst of Read Transfers [figure]

Interfacing APB to AHB: Write Transfer [figure]

Memory Components
• Several different types of memory:
  ◦ RAM: DRAM, SRAM
  ◦ ROM: EEPROM, Flash
• 2-D array: row address and column address
• Each type of memory comes in varying:
  ◦ Capacities
  ◦ Widths
• Packaging: single in-line memory modules (SIMMs) vs. dual in-line memory modules (DIMMs)
  ◦ SIMM (32-bit data bus), DIMM (64-bit data bus)

Random-Access Memory
• Dynamic RAM is dense and refreshed periodically; it is inaccessible during refresh
• Static RAM is faster, less dense, and consumes more power
• SDRAM: synchronous, command pipelining, interleaved banks

  Type   Words/cycle   Frequency   Voltage
  SDR    1             133 MHz     3.3 V
  DDR    2             200 MHz     2.5 V
  DDR2   4             533 MHz     1.8 V
  DDR3   8             1066 MHz    1.5 V
  DDR4   8             1600 MHz    1.2 V

• Burst access: perform several accesses in sequence using a single address

SDRAM Operation [figure]

Memory Controllers
• Memory has a complex internal organization
• The memory controller hides the details of the memory interface and schedules transfers to maximize performance
• Controllers provide additional features: multiple requests, additional burst access

Memory Channels and Banks
• Channels and banks are two ways to add parallelism to the memory system
• Each channel has its own memory components and its own connection to the processor
• The CPU can perform multiple independent accesses using different channels
• Banks are separate memory arrays that can perform accesses in parallel

Example Embedded Platforms: ARDUINO
• HTTP://ARDUINO.CC/
  ◦ Open-source electronics prototyping platform
  ◦ Intended for artists, designers, hobbyists
  ◦ Uses Atmel's ATmega microcontrollers
  ◦ AVR architecture: modified Harvard 8-bit RISC

Example Embedded Platforms: BEAGLEBOARD
• HTTP://BEAGLEBOARD.ORG
  ◦ Credit-card sized, low-power, open-hardware computers
  ◦ Uses an ARM Cortex-A8 by TI; started by a TI engineer
  ◦ 512 MB DDR3 RAM, 3D graphics, USB, Ethernet, HDMI
  ◦ Runs Ubuntu, Android, Ångström Linux, and others

Example Embedded Platforms: LPCXPRESSO
• HTTP://WWW.LPCWARE.COM/LPCXPRESSO
  ◦ Uses LPC microcontrollers, by NXP
  ◦ Free version of development software
  ◦ Includes debugging module

Choosing a Platform
• CPU: architecture, clock speed, integrated peripherals
• Bus: data bandwidth; may affect the choice of CPU
• Memory: size, speed, RAM, ROM, on-chip, off-chip
• I/O devices: integrated devices vs. custom PCB
• Run-time software: operating system, libraries
• Support software: development environment, debugging tools

Development Environment
• Host and target are connected by a USB or Ethernet link
  ◦ Host: where development happens
  ◦ Target: where the code will finally run
• The target must support host communication
  ◦ Small software, interrupt vectors
• The host should be able to
  ◦ Load programs into the target
  ◦ Start and stop program execution on the target
  ◦ Examine memory and CPU registers
• A cross-compiler runs on the host and generates code that runs on the target

Debugging
• Cross debugger
  ◦ Displays target state and allows the target system to be controlled
• Software debugger
  ◦ A monitor program residing on the target provides basic debugger functions
  ◦ The debugger should have a minimal footprint in memory
• Breakpoints
  ◦ A breakpoint allows the user to stop execution, examine system state, and change state
  ◦ Implemented by replacing the breakpointed instruction with a subroutine call to the monitor program

Debugging
• Debugging tools
  ◦ I/O: LEDs, UART, other peripherals
    ▪ The embedded alternative to the PC's print
    ▪ USB: debugging, diagnosis, upgrades
  ◦ ICE: in-circuit emulators
    ▪ Specialized hardware that allows inspecting and modifying CPU state
  ◦ Logic analyzer (oscilloscope)
• Debugging challenges
  ◦ May be hard to generate realistic inputs
  ◦ Timing errors in real-time code

Platform-Level Performance
• Performance depends on all the elements of the system
  ◦ CPU (provides an upper bound on performance)
  ◦ Cache
  ◦ Bus
  ◦ Main memory
  ◦ I/O device
• Bandwidth is the rate of data movement (e.g. in bytes per second)
• Bandwidth captures the performance of several components
  ◦ Memory
  ◦ Bus
  ◦ CPU

Bandwidth as Performance
• Different parts of the system run at different clock rates
• Different components may have different widths (bus, memory)
• Example: video transfer
  ◦ Frame rate: 30 fps, frame size: 320 x 240, 3 bytes/pixel
  ◦ What is the required bandwidth?
• Required bandwidth: 320 x 240 x 3 x 30 = 6,912,000 B/s
  ◦ A 1 MHz, 8-bit wide bus is too slow
  ◦ How can we make the bus satisfy the bandwidth requirement?
    ▪ Increase the bus speed to 7 MHz with an 8-bit wide bus
    ▪ Increase the bus width to 32 bits and the bus clock rate to 2 MHz
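The arithmetic behind this example can be checked with a short sketch. It assumes an idealized bus that moves one full-width word on every clock cycle with no protocol overhead; the function names are illustrative.

```python
def required_bandwidth(width_px, height_px, bytes_per_pixel, fps):
    """Bytes per second needed to move uncompressed video."""
    return width_px * height_px * bytes_per_pixel * fps


def ideal_bus_bandwidth(clock_hz, width_bits):
    """Bytes per second, assuming one full-width transfer per clock cycle."""
    return clock_hz * width_bits // 8


if __name__ == "__main__":
    need = required_bandwidth(320, 240, 3, 30)
    print("required:", need, "B/s")
    for clock_hz, width_bits in [(1_000_000, 8), (7_000_000, 8), (2_000_000, 32)]:
        have = ideal_bus_bandwidth(clock_hz, width_bits)
        verdict = "ok" if have >= need else "too slow"
        print(f"{clock_hz} Hz, {width_bits}-bit: {have} B/s ({verdict})")
```

Note that the 3 bytes per pixel must be included to reach 6,912,000 B/s; 320 x 240 x 30 alone gives only 2,304,000.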

Bus Bandwidth Modeling [equations not reproduced]

Bus Burst Transfer Bandwidth [equations not reproduced]
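The worked examples later in this chapter use a transfer-time model with N bytes to move, a bus width of W bytes, D data cycles and O overhead cycles per transfer, and a burst length B. The sketch below is inferred from those examples, not copied from these two slides: it assumes T_basic(N) = (D + O) N / W cycles for non-burst transfers and T_burst(N) = (B D + O) N / (B W) cycles for bursts, with wall-clock time t = T x P for a clock period P.

```python
def basic_transfer_cycles(n_bytes, width_bytes, data_cycles, overhead_cycles):
    """T_basic(N) = (D + O) * N / W, as used in the later worked examples."""
    return (data_cycles + overhead_cycles) * n_bytes / width_bytes


def burst_transfer_cycles(n_bytes, width_bytes, data_cycles, overhead_cycles, burst_len):
    """T_burst(N) = (B*D + O) * N / (B*W): overhead is paid once per burst."""
    return (burst_len * data_cycles + overhead_cycles) * n_bytes / (burst_len * width_bytes)


def transfer_seconds(cycles, clock_hz):
    """t = T * P, where the clock period P is 1 / clock_hz."""
    return cycles / clock_hz


if __name__ == "__main__":
    # Hypothetical numbers: 1024 bytes over a 4-byte bus, D = 1, O = 2.
    print(basic_transfer_cycles(1024, 4, 1, 2))
    print(burst_transfer_cycles(1024, 4, 1, 2, burst_len=4))
```

With a burst length of 1 the burst formula reduces to the basic one, which is a quick sanity check on the reconstruction.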

Memory Performance
• Memory bandwidth is determined by memory width
  ◦ 64 Mb: 64 M x 1 bit, 8 M x 1 byte, 2 M x 32 bits
• Multiple memories can be used to build a wider memory
• Memory modules can determine memory width, e.g. SIMMs, DIMMs
• The preferred width depends on data format and required speed
• Data width and memory width may not align
  ◦ Pixel: 3 bytes, W = 4 bytes

Memory Performance [equations not reproduced]

Bus Performance Bottlenecks
• Objective: read 30-fps 320 x 240 video from memory to the CPU
  ◦ Bus: 1 MHz, W = 2 B, D = 1, O = 2
  ◦ Memory: 10 MHz, B = 8, W = 1 B, D = 1, O = 3
  ◦ Is the performance bottleneck the bus or the memory?
• N = 320 x 240 x 3 x 30 = 6,912,000 bytes (one second of video)
• T_bus = (1 + 2) (6,912,000 / 2) = 10,368,000 cycles
• t_bus = 10,368,000 x 10^-6 = 10.368 seconds
• T_mem = (8 x 1 + 3) (6,912,000 / (8 x 1)) = 9,504,000 cycles
• t_mem = 9,504,000 x 10^-7 = 0.9504 seconds
• How do we make it work?
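The comparison can be reproduced numerically. In the sketch below, N is the number of bytes in one second of video (so it includes the 30 fps factor), the helper function is illustrative, and the final "wider, faster bus" configuration is a hypothetical option chosen only to show one way the one-second budget could be met; it is not taken from the slides.

```python
def transfer_cycles(n_bytes, width_bytes, data_cycles, overhead_cycles, burst_len=1):
    """(B*D + O) * N / (B*W); burst_len=1 gives the non-burst case."""
    return (burst_len * data_cycles + overhead_cycles) * n_bytes / (burst_len * width_bytes)


N = 320 * 240 * 3 * 30                     # bytes in one second of video

bus_cycles = transfer_cycles(N, width_bytes=2, data_cycles=1, overhead_cycles=2)
mem_cycles = transfer_cycles(N, width_bytes=1, data_cycles=1, overhead_cycles=3, burst_len=8)

bus_seconds = bus_cycles / 1_000_000       # 1 MHz bus
mem_seconds = mem_cycles / 10_000_000      # 10 MHz memory

bottleneck = "bus" if bus_seconds > mem_seconds else "memory"

# Hypothetical fix (an assumption, not from the slides): 4-byte bus at 10 MHz.
wider_bus_seconds = transfer_cycles(N, 4, 1, 2) / 10_000_000

if __name__ == "__main__":
    print(f"bus: {bus_seconds:.4f} s, memory: {mem_seconds:.4f} s, bottleneck: {bottleneck}")
    print(f"hypothetical 4-byte, 10 MHz bus: {wider_bus_seconds:.4f} s")
```

The memory just meets the one-second deadline, while the bus needs roughly ten times longer, so any fix has to raise the bus's clock rate, its width, or both.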

Bus Performance Examples
• We would like to investigate the performance of a bus system in relation to a digital audio application. Digital audio is specified by three main parameters:
  ◦ Number of channels, e.g. stereo audio uses two channels.
  ◦ Sampling rate: number of digital samples per second.
  ◦ Sample size (or bit depth): number of bits per sample.
• Assume a system bus that runs at 1.5 MHz and requires a total of 6 cycles to complete a single 16-bit transfer. Assuming uncompressed 6-channel audio (5.1 speaker configuration), what is the best combination of sampling rate and sample size that can be handled by this system bus? Justify your choice.

Bus Performance Examples [content not reproduced]

Bus Performance Examples
• We can reduce either the sampling rate or the sample size to lower the required bandwidth to match the system bus capacity:
  ◦ N_audio1 = 6 x 32,000 x 16 = 3,072,000 bit/second < 4 x 10^6
  ◦ N_audio2 = 6 x 44,100 x 8 = 2,116,800 bit/second < 4 x 10^6
• Since N_audio2 halves the sample size, it is expected to have a greater impact on audio quality. Hence, reducing the sampling rate is preferred, resulting in a sampling rate of 32 kHz and a sample size of 16 bits.
• Furthermore, we can achieve better audio quality by increasing the sample size of N_audio1 from 16 to 20 bits:
  ◦ N_audio3 = 6 x 32,000 x 20 = 3,840,000 bit/second < 4 x 10^6
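The 4 x 10^6 bit/second limit used in this comparison follows from the bus parameters in the problem statement: 1.5 MHz divided by 6 cycles per transfer gives 250,000 transfers per second, each carrying 16 bits. The sketch below checks that figure and tests a few combinations; the constant and function names are illustrative, and the list of candidate rates and sizes is an assumption made for the example.

```python
BUS_CLOCK_HZ = 1_500_000
CYCLES_PER_TRANSFER = 6
BITS_PER_TRANSFER = 16
CHANNELS = 6


def bus_capacity_bits_per_second():
    """Transfers per second times bits moved by each transfer."""
    return BUS_CLOCK_HZ // CYCLES_PER_TRANSFER * BITS_PER_TRANSFER


def audio_bit_rate(sampling_rate_hz, sample_bits, channels=CHANNELS):
    """Uncompressed audio bit rate in bits per second."""
    return channels * sampling_rate_hz * sample_bits


def fits_on_bus(sampling_rate_hz, sample_bits):
    return audio_bit_rate(sampling_rate_hz, sample_bits) <= bus_capacity_bits_per_second()


if __name__ == "__main__":
    print("bus capacity:", bus_capacity_bits_per_second(), "bit/s")
    # Candidate combinations are illustrative assumptions.
    for rate, bits in [(44_100, 16), (32_000, 16), (44_100, 8), (32_000, 20)]:
        print(rate, bits, audio_bit_rate(rate, bits), fits_on_bus(rate, bits))
```

Under these numbers, 44.1 kHz at 16 bits needs 4,233,600 bit/second and does not fit, which is why one of the two parameters has to be reduced.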

Bus Performance Examples
• A real-time system receives data through an I/O device, the CPU processes the data, then the results of the processing are transferred to system memory.
• The I/O device, the CPU, and the memory controller are all on the same system bus, which runs at 1 MHz. The CPU runs at 10 MHz.
• Each bus transaction (transfer) between any two devices on the bus takes 5 bus cycles, 1 of which is used to transfer data; the remaining cycles are used by the bus protocol. The bus has 32 data lines, transferring 32 bits per data-transfer cycle.
• The I/O device receives 512 bytes at a time.

Bus Performance Examples
• While processing the received data, the CPU generates 4 bytes for each received byte. Only generated data is transferred from the CPU to system memory.
• If the I/O device receives new data at a rate of 200 times per second (512 bytes each), how many CPU cycles can be spent processing each byte without violating the real-time requirements?
• Assume that the memory is fast enough to handle any requests received by the memory controller.

Bus Performance Examples
• For each second:
  ◦ N_I/O = 512 B x 200 = 102,400 B
  ◦ T_I/O(N) = (D + O) N / W = 5 x 102,400 / 4 = 128,000 cycles
  ◦ N_mem = 512 B x 200 x 4 = 409,600 B
  ◦ T_mem(N) = 5 x 409,600 / 4 = 512,000 cycles
  ◦ T_bus = 128,000 + 512,000 = 640,000 cycles
• t_bus = T_bus x P = 640,000 x 10^-6 = 0.64 s
• t_CPU = 1 - 0.64 = 0.36 s
• T_CPU = t_CPU x f_CPU = 0.36 x 10^7 = 3,600,000 cycles
• Number of CPU cycles that can be spent processing each byte without violating the real-time requirements = 3,600,000 / 102,400 = 35.156, i.e. 35 cycles per byte
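The solution can be double-checked with integer arithmetic. The sketch follows the same assumption as the worked answer, namely that the CPU processes data only during the part of each second in which the bus is not busy; the constant names are illustrative.

```python
BUS_HZ = 1_000_000
CPU_HZ = 10_000_000
CYCLES_PER_TRANSACTION = 5      # D + O: 1 data cycle plus 4 protocol cycles
BYTES_PER_TRANSACTION = 4       # 32 data lines
BLOCK_BYTES = 512
BLOCKS_PER_SECOND = 200
OUTPUT_BYTES_PER_INPUT_BYTE = 4

# Bytes moved over the bus in one second.
n_io = BLOCK_BYTES * BLOCKS_PER_SECOND
n_mem = n_io * OUTPUT_BYTES_PER_INPUT_BYTE

# Bus cycles consumed by I/O-to-CPU and CPU-to-memory traffic.
t_io = CYCLES_PER_TRANSACTION * n_io // BYTES_PER_TRANSACTION
t_mem = CYCLES_PER_TRANSACTION * n_mem // BYTES_PER_TRANSACTION
bus_busy_cycles = t_io + t_mem

# Bus cycles left in the second, converted to CPU cycles (10 CPU cycles per bus cycle).
free_bus_cycles = BUS_HZ - bus_busy_cycles
cpu_cycles_available = free_bus_cycles * (CPU_HZ // BUS_HZ)

cycles_per_byte = cpu_cycles_available // n_io   # round down to stay within budget

if __name__ == "__main__":
    print("bus busy cycles:", bus_busy_cycles)
    print("CPU cycles available:", cpu_cycles_available)
    print("CPU cycles per received byte:", cycles_per_byte)
```

Rounding down matters here: 36 cycles per byte would need 3,686,400 CPU cycles, more than the 3,600,000 available.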