Appendix A: Instruction Set Principles and Examples
- Slides: 40
• Classifying instruction set architectures
• Memory addressing modes
• Operations in the instruction set
• Control flow instructions
• Instruction format
• Structure of recent compilers
• MMX technology
• MIPS instruction set
Introduction
• An instruction set architecture (ISA) is a specification of a standardized, programmer-visible interface to hardware, comprising:
  – A set of instructions (really, instruction types)
    • With associated argument fields, assembly syntax, and machine encoding.
  – A set of named storage locations
    • Registers, memory, … Programmer-accessible caches?
  – A set of addressing modes (ways to name locations)
  – Often an I/O interface (usually memory-mapped)
Classifying Architectures
• One important classification scheme is by where operands are held (the type of internal storage):
  – Stack architecture: Operands are implicitly on top of a stack. (Early machines.)
  – Accumulator architecture: One operand is implicitly the accumulator (a special register). (Early machines.)
  – General-purpose register architecture: Operands may be any of a large number (typically tens to hundreds) of registers.
    • Register-memory architectures: One operand may be in memory.
    • Load-store architectures: All operands are registers, except in special load and store instructions.
Four Architecture Classes
[Figure: assembly code for C = A + B in each of the four architecture classes]
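The four classes can be compared with a small simulation. The sketch below is only an illustration: the mnemonics (`push`, `load`, `add`, ...), the register names, and the `run_*` helpers are hypothetical and do not belong to any real ISA. Each function executes a short program that computes C = A + B and returns the number of instructions it needed.

```python
# Toy interpreters for the four ISA classes. Mnemonics and register names
# are illustrative assumptions, not taken from a real instruction set.

def run_stack(mem):
    """Stack machine: operands are implicitly on top of the stack."""
    program = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(mem[args[0]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":
            mem[args[0]] = stack.pop()
    return len(program)


def run_accumulator(mem):
    """Accumulator machine: one operand is implicitly the accumulator."""
    program = [("load", "A"), ("add", "B"), ("store", "C")]
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = mem[addr]
        elif op == "add":
            acc += mem[addr]
        elif op == "store":
            mem[addr] = acc
    return len(program)


def run_register_memory(mem):
    """Register-memory machine: one ALU operand may come from memory."""
    program = [("load", "R1", "A"), ("add", "R3", "R1", "B"), ("store", "R3", "C")]
    regs = {}
    for op, *args in program:
        if op == "load":
            regs[args[0]] = mem[args[1]]
        elif op == "add":
            dest, src_reg, src_mem = args
            regs[dest] = regs[src_reg] + mem[src_mem]
        elif op == "store":
            mem[args[1]] = regs[args[0]]
    return len(program)


def run_load_store(mem):
    """Load-store machine: ALU operands are registers only."""
    program = [("load", "R1", "A"), ("load", "R2", "B"),
               ("add", "R3", "R1", "R2"), ("store", "R3", "C")]
    regs = {}
    for op, *args in program:
        if op == "load":
            regs[args[0]] = mem[args[1]]
        elif op == "add":
            dest, src1, src2 = args
            regs[dest] = regs[src1] + regs[src2]
        elif op == "store":
            mem[args[1]] = regs[args[0]]
    return len(program)
```

Calling each function with `mem = {"A": 2, "B": 3}` leaves `mem["C"] == 5`; the instruction counts differ because the classes differ in which operands are implicit and which may live in memory.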
Number of Operands
• A further classification is by the maximum number of operands and the number of them that can be in memory, e.g.:
  – 2-operand (e.g., a += b)
    • src/dest(reg), src(reg)
    • src/dest(reg), src(mem): IBM 360, x86, 68k
    • src/dest(mem), src(mem): VAX
  – 3-operand (e.g., a = b + c)
    • dest(reg), src1(reg), src2(reg): MIPS, PPC, SPARC, etc.
    • dest(reg), src1(reg), src2(mem): IBM 370
    • dest(mem), src1(mem), src2(mem): IBM 370, VAX
Further Classification
# of Memory Operands | # of Operands | Type of Architecture | Examples
0 | 3 | Register-register | Alpha, ARM, MIPS, PowerPC, SPARC, etc.
1 | 2 | Register-memory | IBM 360/370, Intel 80x86, Motorola 68000, TI C54x
2 | 2 | Memory-memory | VAX
3 | 3 | Memory-memory | VAX
Comparison of Architecture Types
Type | Instruction Encoding | Code Generation | # of Clock Cycles/Inst. | Code Size
Register-register | Fixed-length | Simple | Similar | Large
Register-memory | Easy | Moderate | Different | Medium
Memory-memory | Variable-length | Complex | Large variation | Compact
Endians & Alignment
• Alignment (byte addresses 0 to 7, increasing):
  – Word-aligned word at byte address 4.
  – Halfword-aligned word at byte address 2.
  – Byte-aligned (non-aligned) word at byte address 1.
• Little-endian byte order: least-significant byte "first" (byte 0 = LSB, byte 3 = MSB).
• Big-endian byte order: most-significant byte "first" (byte 0 = MSB, byte 3 = LSB).
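Byte order and alignment are easy to check in a few lines. The following is a minimal sketch using Python's built-in `int.to_bytes`; the helper names `word_bytes` and `is_aligned` are made up for this illustration.

```python
def word_bytes(value, byteorder):
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    return value.to_bytes(4, byteorder)


def is_aligned(address, size):
    """An access of `size` bytes is aligned when the address is a multiple of size."""
    return address % size == 0


word = 0x03020100  # byte 3 is the MSB, byte 0 is the LSB

little = word_bytes(word, "little")  # LSB stored at the lowest address
big = word_bytes(word, "big")        # MSB stored at the lowest address
```

With this word, `little` lists the bytes as 00 01 02 03 and `big` lists them as 03 02 01 00. A 4-byte word at address 4 satisfies `is_aligned(4, 4)`, while the same word at address 2 is only halfword-aligned and at address 1 only byte-aligned.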
Addressing Modes
• In the example assembly syntax (middle column of the table), ( ) indicates a memory access. (A typical syntax.)
• In the RTL syntax (right column), [ ] denotes accessing a member of an array, Register or Memory.
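The RTL view of addressing modes maps directly onto array indexing. The sketch below assumes a handful of common textbook modes (the slide's own table did not survive extraction, so the selection is an assumption); the function `operand` and its parameter names are hypothetical. `regs` plays the role of the Register array and `mem` the Memory array.

```python
def operand(mode, regs, mem, *, reg=None, reg2=None, const=0):
    """Fetch an operand value the way each addressing mode would."""
    if mode == "register":           # Regs[reg]
        return regs[reg]
    if mode == "immediate":          # the constant itself
        return const
    if mode == "displacement":       # Mem[const + Regs[reg]]
        return mem[const + regs[reg]]
    if mode == "register_indirect":  # Mem[Regs[reg]]
        return mem[regs[reg]]
    if mode == "indexed":            # Mem[Regs[reg] + Regs[reg2]]
        return mem[regs[reg] + regs[reg2]]
    if mode == "direct":             # Mem[const]
        return mem[const]
    if mode == "memory_indirect":    # Mem[Mem[Regs[reg]]]
        return mem[mem[regs[reg]]]
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 100, "R2": 8}
mem = {100: 200, 104: 5, 108: 7, 200: 9}

value = operand("displacement", regs, mem, reg="R1", const=4)  # reads Mem[104]
```

Register indirect is displacement with a zero constant, and direct is displacement with no base register, which is why a load-store machine can get by with a single displacement mode.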
Addressing Mode Usage (3 SPEC89 programs on VAX)
Displacement Distribution (SPEC CPU2000 on Alpha; sign bit is not counted)
Use of Immediate Operand
Distribution of Immediate (SPEC CPU2000 on Alpha; sign bit is not counted)
Instruction Type
Instruction Distribution (5 SPECint92 programs)
Control Flow Instructions
• Four basic types:
  – (Conditional) branches
  – (Unconditional) jumps
  – Procedure calls
  – Procedure returns
• Control flow addressing modes:
  – Often PC-relative (PC + displacement). Relocatable.
  – Also useful: register indirect jumps (a register holds the target address). Uses:
    • Procedure returns
    • Case / switch statements
    • Virtual functions / methods (abstract class method calls)
    • Higher-order functions / function pointers
    • Dynamically shared libraries
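The two control-flow addressing modes can be sketched as plain arithmetic and a table lookup. This is an illustration under assumptions: the function names, the register name `ra`, and the example addresses are all hypothetical.

```python
def pc_relative_target(pc, displacement):
    """PC-relative branch: the target is encoded as an offset from the PC."""
    return pc + displacement


def register_indirect_target(regs, reg):
    """Register indirect jump: the target address is read from a register."""
    return regs[reg]


def switch_target(selector, jump_table, default_target):
    """Case/switch via a jump table: load an address, then jump through it."""
    if 0 <= selector < len(jump_table):
        return jump_table[selector]
    return default_target


# Relocation: moving the code by `delta` bytes leaves the displacement valid.
displacement = -16
delta = 0x4000
moved = pc_relative_target(0x1000 + delta, displacement)
original = pc_relative_target(0x1000, displacement)
```

Here `moved - original == delta`, which is the sense in which PC-relative code is relocatable: the encoded displacement does not change when the code is loaded at a different address. A target known only at run time (a return address, a switch case, a virtual method) cannot be encoded this way, so it goes through a register instead.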
Conditional Branch Options
• Condition code (CC) register
  – E.g.: x86, ARM, PPC, SPARC, …
  – ALU ops set condition code flags in the CCR
  – Branch just checks the flag
• Condition register
  – E.g.: Alpha, MIPS
  – Comparison instruction puts its result in a GPR
  – Branch instruction checks the register
• Compare & branch
  – E.g.: PA-RISC, VAX
  – Compare and branch in 1 instruction.
Procedure Calling Conventions
• Two major calling conventions:
  – Caller saves:
    • Before the call, the calling procedure saves the registers it will need later, even if the callee does not use them
  – Callee saves:
    • Inside the call, the called procedure saves the registers that it will overwrite
    • Can be more efficient if there are many small procedures
• Many architectures use a combination of schemes:
  – E.g., MIPS: some registers are caller-saved, some callee-saved
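Which registers get saved under each convention can be expressed as set intersections. The sketch below is illustrative only: the helper names are hypothetical, and the register names follow the usual MIPS software convention in which `t` registers are caller-saved temporaries and `s` registers are callee-saved.

```python
def caller_must_save(live_across_call, caller_saved_regs):
    """Caller saves every caller-saved register whose value it needs after the call."""
    return live_across_call & caller_saved_regs


def callee_must_save(written_by_callee, callee_saved_regs):
    """Callee saves every callee-saved register that it is going to overwrite."""
    return written_by_callee & callee_saved_regs


CALLER_SAVED = {"t0", "t1", "t2"}
CALLEE_SAVED = {"s0", "s1", "s2"}

live_across_call = {"t0", "s0"}         # values the caller still needs afterwards
written_by_callee = {"t0", "t1", "s1"}  # registers the callee overwrites

caller_saves = caller_must_save(live_across_call, CALLER_SAVED)
callee_saves = callee_must_save(written_by_callee, CALLEE_SAVED)
```

In this example the caller saves only `t0` and the callee saves only `s1`. The caller's value in `s0` needs no save at all because the callee never touches it, which is the benefit of mixing the two schemes.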
Three Classes of Control Instructions (SPEC CPU2000 on Alpha)
Branch Distance Distribution (SPEC CPU2000 on Alpha)
Branch Comparison Types (SPEC CPU2000 on Alpha)
Encoding An Instruction Set
Compiler Structure
Compiler Optimizations
Compiler Optimizations (cont.)
Effect of Optimization
Architectural Support for Compiler
• Provide regularity
  – Orthogonality (independence) of:
    • Registers used
    • Addressing modes
    • Operations used
• Provide primitives, not solutions
  – Don't directly support specific kernels or languages
• Simplify trade-offs among alternatives
  – Make it easy to tell which code sequence is fastest at compile time
• Don't interpret values known at compile time
  – Allow compile-time constants to be provided in immediates
MIPS Architecture
• RISC, load-store architecture, simple addressing
• 32-bit instructions, fixed format
• 32 64-bit GPRs, R0-R31
  – Really only 31: R0 is just a constant 0.
• 32 64-bit FPRs, F0-F31
  – Can also hold 32-bit floats (with the other half unused).
  – "SIMD" extensions operate on more than one float in 1 FPR
• A few special registers
  – Floating-point status register
• Load/store 8-, 16-, 32-, 64-bit integers
  – Sign-extended to fill the 64-bit GPR (the unsigned load variants zero-extend instead)
  – Also 32-bit floats and 64-bit doubles
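Sign extension of a narrow load into a 64-bit register can be sketched as follows. The helper names are hypothetical; zero extension, which MIPS unsigned loads such as LBU use, is included for contrast.

```python
def sign_extend(value, bits, width=64):
    """Replicate the sign bit of a `bits`-wide value across a `width`-bit register."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):   # sign bit set: the value is negative
        value -= 1 << bits
    return value & ((1 << width) - 1)


def zero_extend(value, bits):
    """Fill the upper register bits with zeros."""
    return value & ((1 << bits) - 1)


loaded_byte = 0x80  # -128 as a signed byte, 128 as an unsigned byte
signed_reg = sign_extend(loaded_byte, 8)
unsigned_reg = zero_extend(loaded_byte, 8)
```

For the byte 0x80, `signed_reg` is 0xFFFFFFFFFFFFFF80 and `unsigned_reg` is 0x0000000000000080, so the same memory byte yields different register contents depending on the load instruction.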
MIPS Addressing Modes
• Register (arithmetic/logical ops only)
• Immediate (arithmetic/logical ops only) and displacement (loads/stores only)
  – 16-bit immediate / offset field
  – Register indirect: use 0 as the displacement offset
  – Direct (absolute): use R0 as the displacement base
• Byte-addressed memory, 64-bit address
• Software-settable big-endian/little-endian flag
• Alignment required
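A single displacement mode covers register indirect and direct addressing, as the sketch below shows. It is an illustration under assumptions: `effective_address` and `is_aligned` are hypothetical helper names, the register file is modelled as a 32-entry list, and the 16-bit offset field is treated as signed.

```python
MASK64 = (1 << 64) - 1


def effective_address(regs, base, offset16):
    """Displacement addressing: Regs[base] + sign-extended 16-bit offset."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                 # negative offset
        offset -= 0x10000
    base_value = 0 if base == 0 else regs[base]   # R0 always reads as 0
    return (base_value + offset) & MASK64


def is_aligned(address, size):
    """MIPS requires an access of `size` bytes to start at a multiple of `size`."""
    return address % size == 0


regs = [0] * 32
regs[4] = 0x1000

displacement = effective_address(regs, 4, 0xFFFC)   # R4 - 4
register_indirect = effective_address(regs, 4, 0)   # offset 0
direct = effective_address(regs, 0, 0x2000)         # base R0
```

The three calls give 0xFFC, 0x1000, and 0x2000. Note that direct addressing through R0 can only reach addresses expressible in the signed 16-bit offset field.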
Inst. Format: I-type Instructions
Inst. Format: R-type Instructions
Inst. Format: J-type Instructions
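The slide figures for the three formats did not survive, so the sketch below assumes the standard MIPS 32-bit layouts: I-type is opcode(6) rs(5) rt(5) immediate(16), R-type is opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6), and J-type is opcode(6) target(26). The function names are hypothetical.

```python
def encode_i(opcode, rs, rt, imm):
    """I-type: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r(opcode, rs, rt, rd, shamt, funct):
    """R-type: opcode | rs | rt | rd | shamt | funct."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_j(opcode, target):
    """J-type: opcode | 26-bit target."""
    return (opcode << 26) | (target & 0x3FFFFFF)


def decode(word):
    """Split a 32-bit instruction word into every field any format might use."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm": word & 0xFFFF,
        "target": word & 0x3FFFFFF,
    }


# Classic MIPS32 example: add rd=8, rs=9, rt=10 uses opcode 0 and funct 0x20.
add_word = encode_r(0, 9, 10, 8, 0, 0x20)
```

Because the opcode always sits in the top 6 bits and the register fields sit at fixed positions, a decoder can extract all fields in parallel before it knows which format applies. That is the main payoff of a fixed-length, fixed-format encoding.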
MIPS Instruction Set
• Go through Figures A.23-A.25 in the textbook:
  – Loads and stores in MIPS, Figure A.23
  – Arithmetic and logical instructions, Figure A.24
  – Control flow instructions, Figure A.25
• More on Appendix A: Figures A.26-A.30.
MIPS Dynamic Instr. Frequencies (integer benchmarks and FP benchmarks)
Multimedia Extensions
• Graphics displays work on pixels: 8, 16, or 32 bits per pixel define the pixel color
• Audio samples are 16 or 24 bits
• Exploit subword parallelism using existing 64/128-bit registers and ALUs
• Intel i860 was the first (1989) to operate on eight 8-bit, four 16-bit, or two 32-bit operands in 64-bit ALUs
• Almost all microprocessors have media extensions
• Intel uses the term SIMD to describe the MMX extensions; the parallelism is limited only by the register width, e.g., 64 bits
Intel MMX Technology
• MMX registers: 64-bit MM0 to MM7, shared with FP registers R0 to R7; using them has side effects on the FPU state, and they are used only for operands
• Four MMX data types held in a 64-bit MMX register (bits 63 to 0):
  – Packed byte: eight 8-bit elements
  – Packed word: four 16-bit elements
  – Packed doubleword: two 32-bit elements
  – Quadword: one 64-bit element
• 64-bit / 32-bit access mode from memory to MMX registers
• SIMD techniques for arithmetic/logical operations on bytes, words, and doublewords from/to 64-bit registers
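The packed data types are just different ways of slicing the same 64 bits. A minimal sketch, with hypothetical helper names `unpack_lanes` and `pack_lanes`, where lane 0 is the least-significant element:

```python
def unpack_lanes(reg64, lane_bits):
    """Split a 64-bit value into lanes of `lane_bits` bits, lane 0 first."""
    mask = (1 << lane_bits) - 1
    return [(reg64 >> (i * lane_bits)) & mask for i in range(64 // lane_bits)]


def pack_lanes(lanes, lane_bits):
    """Inverse of unpack_lanes: combine the lanes into one 64-bit value."""
    mask = (1 << lane_bits) - 1
    reg64 = 0
    for i, lane in enumerate(lanes):
        reg64 |= (lane & mask) << (i * lane_bits)
    return reg64


mm0 = 0x0807060504030201
as_bytes = unpack_lanes(mm0, 8)         # packed byte view: 8 lanes
as_words = unpack_lanes(mm0, 16)        # packed word view: 4 lanes
as_doublewords = unpack_lanes(mm0, 32)  # packed doubleword view: 2 lanes
```

The register itself carries no type information; the instruction (for example a byte add versus a word add) decides how the 64 bits are divided into lanes.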
MMX Instruction Set
• The MMX instruction set consists of 57 instructions, grouped into 7 categories:
  – Arithmetic instructions
  – Data transfer instructions
  – Comparison instructions
  – Conversion instructions
  – Logical instructions
  – Shift instructions
  – Empty MMX state instruction (EMMS)
• See Intel Architecture Software Developer's Manual, Vol. 1 Basic Architecture (order#: 143190); Vol. 2 Instruction Set Ref. (order#: 243191); Vol. 3 System Programming Guide (order#: 243192) at: http://developer.intel.com/design/archives/processors/mmx/index.htm
SIMD – Parallel Operations
• Conventional scalar operations vs. SIMD (PADDW):
  – Scalar: four separate additions, A1+B1, A2+B2, A3+B3, A4+B4, one per instruction.
  – SIMD: one PADDW adds the packed words A4 A3 A2 A1 and B4 B3 B2 B1, producing A4+B4, A3+B3, A2+B2, A1+B1 in a single instruction.
• Up to 4 times faster, but data must be moved into and out of the MMX registers
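The lane-wise behavior of PADDW can be modelled in software. The sketch below is an emulation for illustration, not how the hardware is implemented; the function name `paddw` simply mirrors the mnemonic. Each 16-bit lane is added independently and wraps around on overflow, with no carry into the neighboring lane.

```python
def paddw(a, b):
    """Add four packed 16-bit words lane by lane, wrapping within each lane."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        x = (a >> shift) & 0xFFFF
        y = (b >> shift) & 0xFFFF
        result |= ((x + y) & 0xFFFF) << shift   # discard the carry out of the lane
    return result


a = 0x0004000300020001   # A4=4, A3=3, A2=2, A1=1
b = 0x0040003000200010   # B4=0x40, B3=0x30, B2=0x20, B1=0x10
packed_sum = paddw(a, b)
```

Here `packed_sum` is 0x0044003300220011. A plain 64-bit integer add would give the same answer for these inputs, but not when a lane overflows: `paddw(0xFFFF, 0x0001)` is 0, whereas an ordinary add would carry into the next word.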
Packed Multiply Add
• 4 multiplications and 2 adds in one PMADDWD instruction:
  – Source 1: A3 A2 A1 A0; Source 2: B3 B2 B1 B0 (packed words)
  – Intermediate: A3×B3, A2×B2, A1×B1, A0×B0
  – Destination (result DW): A3×B3 + A2×B2 and A1×B1 + A0×B0
• PMADDWD produces 2 DW (32-bit) results
  – Useful instruction for many media and signal-processing applications
  – Inputs and outputs must be arranged and packed to/from MMX registers, which adds programming complexity and performance overhead
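PMADDWD can be emulated the same way. This is a software sketch for illustration, with a hypothetical helper `_signed16`; it treats each 16-bit word as signed, multiplies corresponding words, and sums adjacent products into two 32-bit results.

```python
def _signed16(x):
    """Interpret the low 16 bits of x as a signed two's-complement word."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pmaddwd(a, b):
    """Multiply four signed word pairs and add adjacent products into two doublewords."""
    lanes_a = [_signed16(a >> (16 * i)) for i in range(4)]
    lanes_b = [_signed16(b >> (16 * i)) for i in range(4)]
    low = (lanes_a[0] * lanes_b[0] + lanes_a[1] * lanes_b[1]) & 0xFFFFFFFF
    high = (lanes_a[2] * lanes_b[2] + lanes_a[3] * lanes_b[3]) & 0xFFFFFFFF
    return (high << 32) | low


a = 0x0004000300020001   # A3=4, A2=3, A1=2, A0=1
b = 0x0008000700060005   # B3=8, B2=7, B1=6, B0=5
result = pmaddwd(a, b)
```

For these inputs the low doubleword is 1×5 + 2×6 = 17 and the high doubleword is 3×7 + 4×8 = 53, so `result` is 0x0000003500000011. This pairwise multiply-accumulate is the inner step of a dot product, which is why the instruction suits filters and transforms.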
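Data Move Instructions
• MOVD m32, mm: stores the low 32 bits of the MMX register (bits 31-0, bytes A3 A2 A1 A0) to the 32-bit memory location m32; the upper 32 bits (shown as xx) are not transferred.
• MOVD mm, r32: copies the 32-bit register (A3 A2 A1 A0) into bits 31-0 of the MMX register and fills bits 63-32 with zeros.
• These instructions move data between MMX registers and memory or regular registers for SIMD instructions.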