ARM System-on-Chip Architecture

INTRODUCTION
• ARM is a RISC processor.
• It is used in applications that need small size and high performance.
• Simple architecture: low power consumption.

TIMELINE (1/2)
• 1985: Acorn Computer Group manufactures the first commercial RISC microprocessor.
• 1990: The participation of Acorn and Apple leads to the founding of Advanced RISC Machines (ARM).
• 1991: ARM6, the first embeddable RISC microprocessor.
• 1992 – 1994: Various companies use ARM (Sharp, Samsung); in 1993 ARM7, the first multimedia microprocessor, is introduced.

TIMELINE (2/2)
• 1995: Introduction of Thumb and ARM8.
• 1996 – 2000: Alcatel, Hyundai, Philips and Sony use ARM; in 1999 ARM cooperates with Ericsson on the development of Bluetooth.
• 2000 – 2002: ARM’s share of the 32-bit embedded RISC microprocessor market is 80%. ARM Developer Suite is introduced.

THE ARM ARCHITECTURE

GENERAL INFO (1/2)
AIM: Simple design
• Load – store architecture
• 32-bit data bus
• 3 addressing modes

GENERAL INFO (2/2)
Simple architecture + simple instruction set + code density give:
• Small size
• Low power consumption

Registers
• 31 general-purpose 32-bit registers, 16 of which are visible at any one time
• 7 modes of operation
• Each mode has a different set of visible registers and a different level of control over the CPSR.

The visible ARM registers
User mode (usable in user mode):
  r0 - r14, r15 (PC), CPSR
Banked registers (system modes only):
  fiq mode:        r8_fiq - r14_fiq, SPSR_fiq
  svc mode:        r13_svc, r14_svc, SPSR_svc
  abort mode:      r13_abt, r14_abt, SPSR_abt
  irq mode:        r13_irq, r14_irq, SPSR_irq
  undefined mode:  r13_und, r14_und, SPSR_und

CPSR
ARM CPSR format:
• N: Negative
• Z: Zero
• C: Carry
• V: Overflow
• Q: Saturation (for enhanced DSP instructions)
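The flag layout can be made concrete with a small C sketch. It assumes the standard positions of the flags in the top bits of the CPSR (N in bit 31 down to Q in bit 27); the `cpsr_flags` type and the `decode_cpsr` function are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the condition flags in the CPSR. */
enum {
    CPSR_N_BIT = 31, /* Negative */
    CPSR_Z_BIT = 30, /* Zero */
    CPSR_C_BIT = 29, /* Carry */
    CPSR_V_BIT = 28, /* Overflow */
    CPSR_Q_BIT = 27  /* Saturation (enhanced DSP instructions only) */
};

/* Hypothetical helper type: one boolean per flag. */
typedef struct {
    bool n, z, c, v, q;
} cpsr_flags;

/* Extract the five flags from a 32-bit CPSR value. */
static cpsr_flags decode_cpsr(uint32_t cpsr)
{
    cpsr_flags f;
    f.n = ((cpsr >> CPSR_N_BIT) & 1u) != 0;
    f.z = ((cpsr >> CPSR_Z_BIT) & 1u) != 0;
    f.c = ((cpsr >> CPSR_C_BIT) & 1u) != 0;
    f.v = ((cpsr >> CPSR_V_BIT) & 1u) != 0;
    f.q = ((cpsr >> CPSR_Q_BIT) & 1u) != 0;
    return f;
}
```

For example, a CPSR value of 0x60000000 decodes to Z and C set with the other flags clear.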

Memory Organization
• Address bus: 32 bits
• 1 word = 32 bits

Instruction Set
• Three instruction types:
  • Data processing
  • Data transfer
  • Control flow

Supervisor mode
• Operations outside the privileges of user mode are handled by the operating system.
• Using “supervisor calls”, user code goes to system level, where system functions can be performed.

I/O System
• ARM handles peripherals as “memory mapped devices with interrupt support”.
• Interrupts:
  • IRQ: normal interrupt
  • FIQ: fast interrupt

Exceptions
• Exceptions:
  • Interrupts
  • Supervisor calls
  • Traps
• When an exception takes place:
  • The value of the PC is copied to r14_exc.
  • The operating mode changes to the respective exception mode.
  • The PC is set to the vector address of the exception handler.
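The three entry steps can be sketched as a toy C model. The `cpu_model` structure and `enter_exception` function are hypothetical names; the model is deliberately simplified (a single banked r14/SPSR pair, no interrupt masking). It also saves the CPSR into the SPSR of the exception mode, a step the architecture performs but which is not listed in the bullets above.

```c
#include <stdint.h>

#define MODE_MASK 0x1Fu /* the mode field is the low five bits of the CPSR */

/* Hypothetical, heavily simplified processor state. */
typedef struct {
    uint32_t pc;       /* r15 */
    uint32_t cpsr;     /* current program status register */
    uint32_t r14_exc;  /* banked link register of the exception mode */
    uint32_t spsr_exc; /* banked saved status register of that mode */
} cpu_model;

/* Model of exception entry: save return state, switch mode, jump to vector. */
static void enter_exception(cpu_model *cpu, uint32_t mode_bits, uint32_t vector)
{
    cpu->r14_exc = cpu->pc;    /* step 1: PC copied to r14_exc */
    cpu->spsr_exc = cpu->cpsr; /* extra step: CPSR saved in SPSR_exc */
    cpu->cpsr = (cpu->cpsr & ~MODE_MASK) | (mode_bits & MODE_MASK); /* step 2 */
    cpu->pc = vector;          /* step 3: PC takes the vector address */
}
```

For an IRQ taken from user mode, the call would be `enter_exception(&cpu, 0x12, 0x18)`: mode bits 0x12 select irq mode and 0x18 is the IRQ vector address.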

ARM programming model
[Figure: the ARM register set. r0 - r15 (PC) and CPSR are usable in user mode; the banked registers and SPSRs of the fiq, svc, abort, irq and undefined modes are available in system modes only.]

THE ARM INSTRUCTION SET

Data Processing Instructions (1/2)
• Arithmetic operations
    ADD  r0, r1, r2    ; r0 := r1 + r2, flags not updated
    ADDS r0, r1, r2    ; r0 := r1 + r2, flags updated
• Logical operations
    AND  r0, r1, r2    ; r0 := r1 AND r2
• Register movement
    MOV  r0, r2        ; r0 := r2
• Comparison
    CMP  r1, r2        ; set the flags on r1 - r2

Data Processing Instructions (2/2)
• Operands:
  • Immediate operands
      ADD  r3, r3, #1            ; r3 := r3 + 1
  • Shifted register operands
      ADD  r3, r2, r1, LSL #3    ; r3 := r2 + 8 x r1
• Miscellaneous data processing instructions:
  • Multiplication
      MUL  r4, r3, r2            ; r4 := low 32 bits of r3 x r2
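To illustrate what "update flags" means for ADDS and how a shifted register operand is formed, here is a minimal C sketch. The names `adds_result`, `adds` and `add_lsl` are hypothetical; the flag rules are the usual ones for 32-bit two's-complement addition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of a 32-bit add together with the four condition flags. */
typedef struct {
    uint32_t result;
    bool n, z, c, v;
} adds_result;

/* Model of ADDS rd, rn, rm: add and compute N, Z, C, V. */
static adds_result adds(uint32_t a, uint32_t b)
{
    adds_result r;
    r.result = a + b;             /* wraps modulo 2^32 */
    r.n = (r.result >> 31) != 0;  /* sign bit of the result */
    r.z = (r.result == 0);
    r.c = (r.result < a);         /* unsigned carry out of bit 31 */
    /* Signed overflow: operands share a sign that the result does not. */
    r.v = ((~(a ^ b) & (a ^ r.result)) >> 31) != 0;
    return r;
}

/* Model of ADD rd, rn, rm, LSL #shift (shift in the range 0..31). */
static uint32_t add_lsl(uint32_t rn, uint32_t rm, unsigned shift)
{
    return rn + (rm << shift);
}
```

With `shift` equal to 3, `add_lsl(r2, r1, 3)` mirrors `ADD r3, r2, r1, LSL #3`, that is r2 plus eight times r1.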

Data transfer instructions
• Load and store instructions:
    LDR   r0, [r1]            ; r0 := mem32[r1]
    STR   r0, [r1]            ; mem32[r1] := r0
• Offset:
    LDR   r0, [r1, #4]        ; r0 := mem32[r1 + 4]
• Post-indexed:
    LDR   r0, [r1], #16       ; r0 := mem32[r1], then r1 := r1 + 16
• Auto-indexed:
    LDR   r0, [r1, #16]!      ; r1 := r1 + 16, then r0 := mem32[r1]
• Multiple data transfers:
    LDMIA r1, {r0, r2, r5}    ; load r0, r2, r5 from consecutive words at r1
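The difference between the addressing modes is only in when (and whether) the base register changes. The following C sketch models this over a tiny read-only memory; `mem`, `load_word` and the `ldr_*` and `ldmia` helpers are hypothetical names, and the memory contents are made up for the example.

```c
#include <stdint.h>

/* Toy memory of eight words; byte addresses wrap around it. */
static const uint32_t mem[8] = { 100, 101, 102, 103, 104, 105, 106, 107 };

static uint32_t load_word(uint32_t addr)
{
    return mem[(addr / 4u) % 8u];
}

/* LDR r0, [r1, #off]: access at base + off, base unchanged. */
static uint32_t ldr_offset(uint32_t base, int32_t off)
{
    return load_word(base + (uint32_t)off);
}

/* LDR r0, [r1], #off: access at base, then base := base + off. */
static uint32_t ldr_post(uint32_t *base, int32_t off)
{
    uint32_t value = load_word(*base);
    *base += (uint32_t)off;
    return value;
}

/* LDR r0, [r1, #off]!: base := base + off, then access at the new base. */
static uint32_t ldr_auto(uint32_t *base, int32_t off)
{
    *base += (uint32_t)off;
    return load_word(*base);
}

/* LDMIA r1, {...}: the lowest-numbered register loads from the lowest address. */
static void ldmia(uint32_t base, uint16_t reg_mask, uint32_t regs[16])
{
    for (int i = 0; i < 16; i++) {
        if (reg_mask & (1u << i)) {
            regs[i] = load_word(base);
            base += 4u; /* "increment after" each transfer */
        }
    }
}
```

Starting from a base of 8, the post-indexed and auto-indexed forms both leave the base at 24, but they read different words: the first reads address 8, the second address 24.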

Control flow instructions
• Branch instruction:   B   label
• Conditional branch:   BNE label
• Branch and link:      BL  label

        BL   Loop
        …
        …
Loop    …
        …
        MOV  PC, r14    ; return

ARM ORGANIZATION AND IMPLEMENTATION

3-stage pipeline (ARM7, 80 MHz)
• Fetch
• Decode
• Execute
Throughput: 1 instruction / cycle

5-stage pipeline (1/2)
• Program execution time: $T_{prog} = N_{inst} \times CPI / f_{clk}$
• Ways to reduce $T_{prog}$:
  • Increase $f_{clk}$: logic simplification.
  • Reduce CPI: reduce the number of multicycle instructions.
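A small numeric sketch of the standard relation T_prog = N_inst × CPI / f_clk that underlies the two levers above. The function name `program_time_s` is hypothetical, and the CPI values in the usage note are purely illustrative, not measurements.

```c
/* Execution time in seconds for n_inst instructions at a given
 * average cycles-per-instruction (cpi) and clock frequency in Hz. */
static double program_time_s(double n_inst, double cpi, double f_clk_hz)
{
    return n_inst * cpi / f_clk_hz;
}
```

For instance, one million instructions at an illustrative CPI of 1.9 and 80 MHz take longer than the same instructions at an illustrative CPI of 1.5 and 150 MHz, because both levers (higher clock rate, lower CPI) act in the same direction.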

5-stage pipeline (2/2) (ARM9, 150 MHz)
• Fetch
• Decode
• Execute
• Buffer / Data
• Write-back

ARM coprocessor interface
• ARM supports up to 16 coprocessors, which can be software emulated.
• Each coprocessor has up to 16 general-purpose registers.
• ARM is a load – store architecture.
• Coprocessors usually handle on-chip functions, such as cache and memory management.

ARCHITECTURAL SUPPORT FOR HIGH-LEVEL LANGUAGES

Floating-point accelerator (1/2)
• For floating-point operations, ARM has the FPE software emulator and the FPA10 hardware floating-point accelerator.
• FPA10 includes:
  • Coprocessor interface
  • Load / store unit
  • Register bank (eight 80-bit registers)
  • ALU (adder, multiplier, divider)

Floating-point accelerator (2/2)

APCS (1/2)
• APCS (ARM Procedure Call Standard) is a set of rules governing how C procedures are entered and exited.
• Specific use of the general-purpose registers (r0 – r3: arguments, r4 – r8: variables, r10: stack limit, etc.).
• Procedure entry and exit:
        BL   Loop
        …
        MOV  pc, lr

APCS (2/2)
C code:
    void f1(int a) { f2(a); }

Assembly code:
    f1  LDR  r0, [r13]
        STR  r13!, [r14]
        STR  r13!, [r0]
        BL   f2
        SUB  r13, #4
        LDR  r13!, r15

[Figure: stack layout at offsets 0, 4, 8 and 16, with the stack pointer marked]

THUMB PROGRAMMER’S MODEL

General information
• Thumb objective: code density.
• Thumb has a 16-bit instruction set: a subset of the ARM instruction set is encoded in a 16-bit space.
• With appropriate use, great benefits can be achieved in terms of:
  • Power efficiency
  • Enhanced performance

Going in and out of Thumb mode
• Using the BX instruction in ARM state, e.g. BX r0:
  • Code is assembled as 16-bit instructions with the appropriate assembler directive.
  • If r0[0] is 1, the T bit in the CPSR is set to 1 and the PC is set to the address given by the remaining bits of r0.
• Using the BX instruction from Thumb state, with bit 0 of the register clear, execution returns to ARM state.
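The state switch performed by BX can be sketched in a few lines of C. The `exec_state` structure and the `bx` function are hypothetical names; the T bit is modelled as a plain boolean rather than as a bit of a full CPSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimal execution state. */
typedef struct {
    uint32_t pc;
    bool thumb; /* models the T bit of the CPSR */
} exec_state;

/* Model of BX rm: bit 0 of rm selects the state, the rest is the target. */
static void bx(exec_state *s, uint32_t rm)
{
    s->thumb = (rm & 1u) != 0; /* rm[0] = 1 enters Thumb, 0 enters ARM */
    s->pc = rm & ~1u;          /* remaining bits give the new PC */
}
```

Calling `bx(&s, 0x8001)` enters Thumb state at address 0x8000; a later `bx(&s, 0x9000)` returns to ARM state at 0x9000.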

The Thumb programmer’s model
• Thumb registers

ARM vs. Thumb (1/3)
• Thumb
  • Code size about 70% of the equivalent ARM code
  • 40% more instructions
  • 45% faster code with 16-bit memory
  • Requires about 30% less external memory
• ARM
  • 40% faster code when coupled with a 32-bit memory

ARM vs. Thumb (2/3)
• If performance is critical: ARM
• If cost and power consumption are critical: Thumb

ARM and Thumb interaction
• A 32-bit ARM system can go into Thumb mode for specific routines, in order to meet power and memory constraints.
• A 16-bit system can use an on-chip 32-bit memory for ARM-state routines, and a 16-bit off-chip memory with Thumb code for the rest of the application.

ARCHITECTURAL SUPPORT FOR SYSTEM DEVELOPMENT

The ARM memory interface
• A basic ARM memory system

AMBA (1/4)
• Advanced Microcontroller Bus Architecture:
  • Advanced High-Performance Bus
  • Advanced System Bus
  • Advanced Peripheral Bus
• AMBA objectives:
  • Technology independence
  • To encourage modular system design

AMBA (2/4)
• A typical AMBA-based system

AMBA (3/4)
• AHB bus:
  • Burst transactions
  • Split transactions
  • Data bus 64 – 128 bits wide

AMBA (4/4)
• AMBA Design Kit (ADK)
  • An environment that assists designers in developing AMBA-based components and SoC designs.

Signal Processing Support (1/2)
• Piccolo DSP coprocessor.
• Various data memories for maximizing throughput.

Signal Processing Support (2/2)
• Piccolo

MEMORY HIERARCHY

Memory hierarchy
Memory type       Size                Speed
Registers         32-bit              A few nsec
On-chip cache     8 – 32 kbytes       10 nsec
Off-chip cache    100 – 200 kbytes    10 – 30 nsec
RAM               Mbytes              100 nsec
Going down the table: larger size, lower speed.

On-chip memory
• Necessary for performance.
• Some systems prefer on-chip RAM to an on-chip cache: it is simpler, cheaper and less power-hungry.

Cache types
• Cache types:
  • Unified cache.
  • Separate instruction and data caches.
• Performance: hit rate and miss rate
  • Compulsory miss: the first time an address is accessed
  • Capacity miss: when the cache is full
  • Conflict miss: two addresses compete for the same place in the cache

Replacement policy and implementation
• Replacement policy:
  • Least Recently Used (LRU)
  • Least Frequently Used (LFU)
  • Data prediction
• Implementation:
  • Fully associative
  • Direct-mapped
  • Set-associative

Direct-mapped cache (1/2)
• A line of data is stored together with its address tag, at a location selected by part of the memory address.

Direct-mapped cache (2/2)
• Each memory location has a specific place in the cache.
• Tag and data can be accessed at the same time.
• The tag RAM is smaller than the data RAM and has a shorter access time, so the tag comparison can complete within the data RAM access.
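A direct-mapped lookup can be sketched in C as follows. The geometry (16-byte lines, 256 lines) and all names (`dm_cache`, `dm_lookup`, `dm_fill`, the `addr_*` helpers) are assumptions chosen for the example, not parameters of any particular ARM cache.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DM_LINE_BYTES 16u /* assumed line size */
#define DM_NUM_LINES 256u /* assumed number of lines (4 KB cache) */

typedef struct {
    bool valid;
    uint32_t tag;
    uint8_t data[DM_LINE_BYTES];
} dm_line;

typedef struct {
    dm_line lines[DM_NUM_LINES];
} dm_cache;

/* Split an address into byte offset, line index and tag. */
static uint32_t addr_offset(uint32_t a) { return a % DM_LINE_BYTES; }
static uint32_t addr_index(uint32_t a) { return (a / DM_LINE_BYTES) % DM_NUM_LINES; }
static uint32_t addr_tag(uint32_t a) { return a / (DM_LINE_BYTES * DM_NUM_LINES); }

/* Returns true on a hit and stores the addressed byte in *out. */
static bool dm_lookup(const dm_cache *c, uint32_t addr, uint8_t *out)
{
    const dm_line *l = &c->lines[addr_index(addr)]; /* the one possible place */
    if (l->valid && l->tag == addr_tag(addr)) {
        *out = l->data[addr_offset(addr)];
        return true;
    }
    return false;
}

/* Load a whole line for addr, replacing whatever occupied that index. */
static void dm_fill(dm_cache *c, uint32_t addr, const uint8_t line[DM_LINE_BYTES])
{
    dm_line *l = &c->lines[addr_index(addr)];
    l->valid = true;
    l->tag = addr_tag(addr);
    memcpy(l->data, line, DM_LINE_BYTES);
}
```

Two addresses that differ by a multiple of the cache size (here 4096 bytes) share an index but not a tag, so they evict each other: this is the conflict miss described earlier.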

Set-associative cache (1/3)
• 2-way set-associative cache.

Set-associative cache (2/3)
• A set-associative cache is built from n direct-mapped sections working in parallel, yielding an n-way set-associative cache.
• Two addresses that would compete for the same place in a direct-mapped cache can be stored in different locations and accessed independently.

Set-associative cache (3/3)
• Set selection:
  • Random allocation
  • Least recently used (LRU)
  • Round-robin (cyclic)
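As an illustration of round-robin (cyclic) allocation, here is a minimal C sketch of a 2-way set-associative cache that tracks tags only. The geometry (128 sets, 16-byte lines) and the names `sa_cache` and `sa_access` are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SA_WAYS 2u
#define SA_SETS 128u      /* assumed number of sets */
#define SA_LINE_BYTES 16u /* assumed line size */

typedef struct {
    bool valid;
    uint32_t tag;
} sa_way;

typedef struct {
    sa_way way[SA_WAYS];
    unsigned next_victim; /* round-robin pointer for this set */
} sa_set;

typedef struct {
    sa_set sets[SA_SETS];
} sa_cache;

/* Returns true on a hit; on a miss, allocates a way in cyclic order. */
static bool sa_access(sa_cache *c, uint32_t addr)
{
    uint32_t line = addr / SA_LINE_BYTES;
    sa_set *s = &c->sets[line % SA_SETS];
    uint32_t tag = line / SA_SETS;

    for (unsigned w = 0; w < SA_WAYS; w++) {
        if (s->way[w].valid && s->way[w].tag == tag)
            return true; /* hit in one of the ways */
    }

    /* Miss: replace the way the round-robin pointer selects. */
    s->way[s->next_victim].valid = true;
    s->way[s->next_victim].tag = tag;
    s->next_victim = (s->next_victim + 1u) % SA_WAYS;
    return false;
}
```

Addresses 0 and 2048 map to the same set here, yet both stay resident; only a third conflicting address (such as 4096) forces an eviction, and the victim is chosen cyclically rather than by recency.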

Fully associative (1/2)

Write strategies
• Write-through
  All write operations are passed to main memory.
• Write-through with buffered write
  Write operations are passed to main memory through the write buffer.
• Copy-back (write-back)
  Write operations update only the cache.
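The practical difference between write-through and copy-back is how often main memory is written. The C sketch below models a single cached word; `word_model`, `cache_write` and `cache_evict` are hypothetical names, and the buffered variant is not modelled separately (it generates the same memory traffic as plain write-through, only without stalling the processor).

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { WRITE_THROUGH, COPY_BACK } write_policy;

/* One cached word, its backing memory word and a write counter. */
typedef struct {
    uint32_t cached;
    bool dirty;
    uint32_t main_memory;
    unsigned memory_writes;
} word_model;

/* Write a value under the given policy. */
static void cache_write(word_model *m, write_policy p, uint32_t value)
{
    m->cached = value;
    if (p == WRITE_THROUGH) {
        m->main_memory = value; /* every write reaches main memory */
        m->memory_writes++;
    } else {
        m->dirty = true;        /* copy-back: only the cache is updated */
    }
}

/* On eviction, a dirty copy-back line must be written to memory. */
static void cache_evict(word_model *m)
{
    if (m->dirty) {
        m->main_memory = m->cached;
        m->memory_writes++;
        m->dirty = false;
    }
}
```

Three successive writes to the same word cost three memory writes under write-through, but only one (at eviction) under copy-back.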

Cache feature summary

‘Perfect’ cache performance

MMU (1/3)
• Two memory management approaches:
  • Segmentation
  • Paging

MMU (2/3)
• Segmented memory management

MMU (3/3)
• Paging memory management

ARCHITECTURAL SUPPORT FOR OPERATING SYSTEMS

CP15
• On-chip coprocessor for control of the MMU, the cache and the protection unit.
• Control takes place through the CP15 registers, using instructions that are executed in supervisor mode.

Protection Unit
• Simpler alternative to the MMU.
• Requires simpler software and hardware.
• Does not use translation tables, but 8 protection regions instead.
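A region-based check needs no table walk, which the following C sketch tries to show. Everything beyond the count of 8 regions is an assumption made for illustration: the `region` fields, the single `user_writable` permission bit, and the rule that a higher-numbered region takes priority where regions overlap.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGIONS 8

/* Hypothetical region descriptor. */
typedef struct {
    bool enabled;
    uint32_t base;
    uint32_t size;      /* in bytes */
    bool user_writable; /* simplified permission: one bit only */
} region;

/* Return the matching region number, or -1 if no region covers addr.
 * Assumption: higher-numbered regions win when regions overlap. */
static int find_region(const region r[NUM_REGIONS], uint32_t addr)
{
    for (int i = NUM_REGIONS - 1; i >= 0; i--) {
        /* Unsigned subtraction wraps, so addr < base never matches. */
        if (r[i].enabled && addr - r[i].base < r[i].size)
            return i;
    }
    return -1;
}

/* A user-mode write is allowed only inside a user-writable region. */
static bool user_write_allowed(const region r[NUM_REGIONS], uint32_t addr)
{
    int i = find_region(r, addr);
    return i >= 0 && r[i].user_writable;
}
```

The check is a handful of comparisons against fixed descriptors, in contrast to the translation tables in memory that an MMU has to consult.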

ARM DEVELOPER SUITE

ARMULATOR (1/2)
• ARMulator: emulator of various ARM processors.
• Allows project development in C, C++ or Assembly.
• It comes with a debugger, compilers and an assembler; the entire set is called ARM Developer Suite (ADS).

ARMULATOR (2/2)
• Possible project options:
  • ARM and Thumb interworking
  • Mixing C, C++ and Assembly
  • Code for ROM
  • Exception handlers
  • MM

ARMULATOR TUTORIAL
• CODEWARRIOR ENVIRONMENT
