Computer Systems
CPU
Jakub Yaghob
Von Neumann architecture
- Simple, slower
[Diagram: CPU, Memory, I/O]

Harvard architecture
- Microcontrollers
- Multiple address spaces
[Diagram: Instruction Memory, Data Memory, CPU, I/O]
Real PC architecture (Sandy Bridge)
[Diagram. CPU: cores, GFX core, cache, System Agent, memory controller with two DDR III channels (1066-2133 MHz) on the memory bus, PCI Express ×16 to an external graphics card, display link. South Bridge (PCH), attached over 4× DMI: PCI Express ×1 expansion slots, USB, SATA (hard disk, DVD drive), LAN adapter, audio codec (line in, line out, S/PDIF in, S/PDIF out), display outputs (D-sub, HDMI, DVI, DisplayPort), BIOS, and LPC to the Super I/O (serial port, parallel port, floppy drive, PS/2 keyboard/mouse).]
CPU
- Architecture
  - HW
  - ISA
- "Simple" machine
  - Executes instructions
  - Instruction – simple command
Instructions - motivation
- How can we execute the following code?

    if(a<3) b = 4; else c = a << 2;

    for(int i=0; i<5; ++i) a[i] = i;

    int f(int p) { return p+1; }
    void g() { auto r = f(42); }
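One way to picture the answer is to rewrite the structured code so that every control transfer is an explicit jump, which is roughly what the CPU sees. The sketch below does this in C++ with goto. The function names `branch_lowered` and `loop_lowered` are hypothetical, and a real compiler may emit different jumps.

```cpp
#include <cassert>

// Hypothetical helper: the if/else from the slide, with every control
// transfer written as an explicit conditional or unconditional jump.
void branch_lowered(int a, int& b, int& c) {
    if (!(a < 3)) goto else_part;  // conditional jump over the "then" part
    b = 4;
    goto done;                     // unconditional jump over the "else" part
else_part:
    c = a << 2;
done:
    return;
}

// Hypothetical helper: the for loop from the slide, lowered the same way.
void loop_lowered(int (&a)[5]) {
    int i = 0;
loop_test:
    if (!(i < 5)) goto loop_end;   // leave the loop once the condition fails
    a[i] = i;
    ++i;
    goto loop_test;                // jump back to the test
loop_end:
    return;
}
```

Each goto corresponds to one jump instruction; the function call in the third snippet additionally needs a way to remember where to return.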
Instruction classes
- Load instructions
- Store instructions
- Move instructions
- Arithmetic and logic instructions
- Jumps
  - Unconditional vs. conditional
  - Direct vs. indirect vs. relative
- Call, return
- …

Registers
- Types
  - General, integer, floating point, address, branch, flags, predicate, application, system, vector, …
- Naming
  - Direct vs. stack
- Aliasing
Registers – example 32-bit x86
32-bit   16-bit   8-bit high   8-bit low
EAX      AX       AH           AL
EBX      BX       BH           BL
ECX      CX       CH           CL
EDX      DX       DH           DL
ESI      SI
EDI      DI
EBP      BP
ESP      SP
EIP      IP
EFLAGS   FLAGS
Segment registers: CS, DS, ES, SS, FS, GS
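The aliasing of EAX, AX, AH, and AL can be pictured as bit masking on a single 32-bit value. This is a minimal sketch; the `Reg32` type and its member names are hypothetical and only model the general-purpose registers that have 8-bit aliases.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one 32-bit x86 general-purpose register and its aliases.
struct Reg32 {
    std::uint32_t e = 0;  // full register, e.g. EAX

    // AX is bits 0-15, AH is bits 8-15, AL is bits 0-7 of the same storage.
    std::uint16_t x() const { return static_cast<std::uint16_t>(e & 0xFFFFu); }
    std::uint8_t  h() const { return static_cast<std::uint8_t>((e >> 8) & 0xFFu); }
    std::uint8_t  l() const { return static_cast<std::uint8_t>(e & 0xFFu); }

    // Writing through an alias changes only that alias's bits.
    void set_x(std::uint16_t v) { e = (e & 0xFFFF0000u) | v; }
    void set_h(std::uint8_t v)  { e = (e & 0xFFFF00FFu) | (static_cast<std::uint32_t>(v) << 8); }
    void set_l(std::uint8_t v)  { e = (e & 0xFFFFFF00u) | v; }
};
```

With `e` set to 0x12345678, `x()` yields 0x5678, `h()` yields 0x56, and `l()` yields 0x78.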
Registers – example IA-64
MIPS – simple assembler
- Execution environment
  - 32-bit registers r0-r31
    - r0 is always 0, writes are ignored
    - r31 is a link register for the jal instruction
  - No stack
  - No flags
  - PC register
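The rule that r0 always reads as 0 and ignores writes can be sketched as a small register file. `RegisterFile` is a hypothetical name used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical register file for the simplified MIPS machine on the slide.
class RegisterFile {
public:
    std::uint32_t read(unsigned index) const { return regs_.at(index); }

    void write(unsigned index, std::uint32_t value) {
        if (index == 0) return;   // r0 is always 0, writes are ignored
        regs_.at(index) = value;  // at() throws for indices above 31
    }

private:
    std::array<std::uint32_t, 32> regs_{};  // r0-r31, all initially zero
};
```

Hardwiring r0 gives every instruction a free constant zero, which is what lets comparisons against `$zero` stand in for flag tests.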
MIPS – register aliases
Register    Name       Purpose                          Preserve
$r0         $zero      0                                N/A
$r1         $at        Assembler temporary              No
$r2-$r3     $v0-$v1    Return value                     No
$r4-$r7     $a0-$a3    Function arguments               No
$r8-$r15    $t0-$t7    Temporaries                      No
$r16-$r23   $s0-$s7    Saved temporaries                Yes
$r24-$r25   $t8-$t9    Temporaries                      No
$r26-$r27   $k0-$k1    Kernel registers – DO NOT USE    N/A
$r28        $gp        Global pointer                   Yes
$r29        $sp        Stack pointer                    Yes
$r30        $fp        Frame pointer                    Yes
$r31        $ra        Return address                   Yes
MIPS – instructions
- Arithmetic
  - add $rd, $rs, $rt
    - R[rd] = R[rs]+R[rt]
  - addi $rd, $rs, imm16
    - R[rd] = R[rs]+signext(imm16)
  - sub $rd, $rs, $rt
  - subi $rd, $rs, imm16
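The role of signext(imm16) in addi is easiest to see with a negative immediate. This is a minimal sketch; the function names are hypothetical, and the addition simply wraps modulo 2^32 (overflow handling is not modeled).

```cpp
#include <cassert>
#include <cstdint>

// signext(imm16): copy bit 15 of the immediate into bits 16-31.
constexpr std::uint32_t signext16(std::uint16_t imm) {
    return (imm & 0x8000u) ? (0xFFFF0000u | imm) : imm;
}

// add: R[rd] = R[rs] + R[rt]
constexpr std::uint32_t mips_add(std::uint32_t rs, std::uint32_t rt) {
    return rs + rt;
}

// addi: R[rd] = R[rs] + signext(imm16)
constexpr std::uint32_t mips_addi(std::uint32_t rs, std::uint16_t imm) {
    return rs + signext16(imm);
}

// The 16-bit pattern 0xFFFF encodes -1, so adding it decrements.
static_assert(signext16(0xFFFF) == 0xFFFFFFFFu, "0xFFFF sign-extends to all ones");
static_assert(mips_addi(10, 0xFFFF) == 9, "10 + (-1) == 9");
static_assert(mips_add(2, 3) == 5, "register-register add");
```

Because the immediate is sign-extended, addi with a negative immediate can cover subtraction of a constant as well.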
ISA comparison
MIPS                    x86
ADD  $t1, $t1, $t0      ADD eax, ebx
ADDI $t1, $t1, 1        ADD eax, 1
ADD  $t2, $t0, $t1      MOV eax, ebx
                        ADD eax, ecx
MIPS – instructions
- Logic operations
  - and/or/xor/nor $rd, $rs, $rt
  - andi/ori/xori $rd, $rs, imm16
    - R[rd] = R[rs] and/or/xor zeroext(imm16)
  - No not instruction, use nor $rd, $rs, $zero
- Shifts
  - sll/srl $rd, $rs, shamt
    - R[rd] = R[rs] << / >> shamt
  - sra $rd, $rs, shamt
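Two points on this slide benefit from a concrete check: nor with a zero operand behaves as not, and sra differs from srl only in what is shifted into the top bits. This is a minimal sketch; the function names are hypothetical, and shift amounts are masked to 0-31 as an assumption.

```cpp
#include <cassert>
#include <cstdint>

// nor: R[rd] = ~(R[rs] | R[rt])
constexpr std::uint32_t mips_nor(std::uint32_t rs, std::uint32_t rt) {
    return ~(rs | rt);
}

// sll / srl: logical shifts, vacated bits are filled with zeros.
constexpr std::uint32_t mips_sll(std::uint32_t rs, unsigned shamt) {
    return rs << (shamt & 31u);
}
constexpr std::uint32_t mips_srl(std::uint32_t rs, unsigned shamt) {
    return rs >> (shamt & 31u);
}

// sra: arithmetic right shift, vacated bits are copies of the sign bit.
constexpr std::uint32_t mips_sra(std::uint32_t rs, unsigned shamt) {
    shamt &= 31u;
    std::uint32_t shifted = rs >> shamt;
    if ((rs & 0x80000000u) != 0 && shamt != 0) {
        shifted |= ~(0xFFFFFFFFu >> shamt);  // set the top shamt bits
    }
    return shifted;
}

static_assert(mips_nor(0x0000FFFFu, 0u) == 0xFFFF0000u, "nor with zero acts as not");
static_assert(mips_srl(0x80000000u, 4) == 0x08000000u, "logical shift fills with zeros");
static_assert(mips_sra(0x80000000u, 4) == 0xF8000000u, "arithmetic shift keeps the sign");
```

For non-negative values srl and sra give the same result; they diverge only when bit 31 is set.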
ISA comparison
MIPS                    x86
NOR $t1, $t2, $zero     MOV eax, ebx
                        NOT eax
SLL $t1, $t1, 3         SHL eax, 3
MIPS – instructions
- Memory access
  - lw $rd, imm16($rs)
    - R[rd] = M[R[rs] + signext32(imm16)]
  - sw $rt, imm16($rs)
    - M[R[rs] + signext32(imm16)] = R[rt]
  - lb $rd, imm16($rs)
    - R[rd] = signext32(M[R[rs] + signext32(imm16)])
  - lbu $rd, imm16($rs)
    - R[rd] = zeroext32(M[R[rs] + signext32(imm16)])
  - sb $rt, imm16($rs)
    - M[R[rs] + signext32(imm16)] = R[rt] (lowest byte only)
- Moves
  - li $rd, imm32
    - R[rd] = imm32
  - move $rd, $rs
    - R[rd] = R[rs]
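The difference between lb and lbu, and the signed offset in the address computation, can be sketched with a tiny byte-addressable memory. The `Memory` type and `effective_address` helper are hypothetical, and little-endian byte order is an assumption (the slides do not state one).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Address computation shared by all loads and stores: R[rs] + signext32(imm16).
inline std::uint32_t effective_address(std::uint32_t base, std::int16_t imm16) {
    return base + static_cast<std::uint32_t>(static_cast<std::int32_t>(imm16));
}

// Hypothetical byte-addressable memory (little-endian is an assumption).
struct Memory {
    std::vector<std::uint8_t> bytes;
    explicit Memory(std::size_t size) : bytes(size, 0) {}

    // lbu: zero-extend the loaded byte to 32 bits.
    std::uint32_t lbu(std::uint32_t addr) const { return bytes.at(addr); }

    // lb: sign-extend the loaded byte to 32 bits.
    std::uint32_t lb(std::uint32_t addr) const {
        std::uint32_t v = bytes.at(addr);
        return (v & 0x80u) ? (v | 0xFFFFFF00u) : v;
    }

    // sb: store only the lowest byte of the register value.
    void sb(std::uint32_t addr, std::uint32_t value) {
        bytes.at(addr) = static_cast<std::uint8_t>(value & 0xFFu);
    }

    // lw / sw: move four consecutive bytes, lowest byte at the lowest address.
    std::uint32_t lw(std::uint32_t addr) const {
        std::uint32_t value = 0;
        for (int i = 3; i >= 0; --i) value = (value << 8) | bytes.at(addr + i);
        return value;
    }
    void sw(std::uint32_t addr, std::uint32_t value) {
        for (int i = 0; i < 4; ++i)
            bytes.at(addr + i) = static_cast<std::uint8_t>(value >> (8 * i));
    }
};
```

Storing the byte 0xF0 and reading it back gives 0x000000F0 through lbu but 0xFFFFFFF0 through lb, which is the whole difference between the two loads.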
ISA comparison
MIPS                    x86
LW   $t1, 1234($t0)     MOV eax, [ebx+1234]
SW   $t1, 1234($t0)     MOV [ebx+1234], eax
LB   $t1, 1234($t0)     MOV al, [ebx+1234]
LI   $t1, 5678          MOV eax, 5678
MOVE $t1, $t0           MOV eax, ebx
MIPS – instructions
- Jumps
  - j addr
    - PC = addr
  - jr $rs
    - PC = R[rs]
  - jal addr
    - R[31] = PC+4; PC = addr
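Taken together, jal and jr are enough for a call and return without a stack: jal saves the return address in r31, and jr jumps back through it. This is a minimal sketch; the `Cpu` struct and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU state: just the PC and the 32 registers.
struct Cpu {
    std::uint32_t pc = 0;
    std::uint32_t r[32] = {};
};

// j addr: PC = addr
inline void j(Cpu& cpu, std::uint32_t addr) { cpu.pc = addr; }

// jr $rs: PC = R[rs]
inline void jr(Cpu& cpu, unsigned rs) { cpu.pc = cpu.r[rs]; }

// jal addr: R[31] = PC+4; PC = addr
inline void jal(Cpu& cpu, std::uint32_t addr) {
    cpu.r[31] = cpu.pc + 4;  // remember the instruction after the call
    cpu.pc = addr;
}
```

Starting at PC 0x1000, `jal(cpu, 0x2000)` followed by `jr(cpu, 31)` ends at 0x1004, the instruction after the call. A nested call would overwrite r31, which is why the callee must save it somewhere first.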
ISA comparison
MIPS                    x86
J   label               JMP label1
JR  $ra                 JMP [ebx]
JAL fnc                 CALL fnc
MIPS – instructions
- Conditional jumps
  - beq $rs, $rt, addr
    - If R[rs]=R[rt] then PC=addr else PC=PC+4
  - bne $rs, $rt, addr
- Testing
  - slt $rd, $rs, $rt
    - If R[rs]<R[rt] then R[rd] = 1 else R[rd] = 0
  - sltu $rd, $rs, $rt
    - Unsigned version
  - slti $rd, $rs, imm16
    - If R[rs]<signext(imm16) then R[rd] = 1 else R[rd] = 0
  - sltiu $rd, $rs, imm16
    - If R[rs]<zeroext(imm16) then R[rd] = 1 else R[rd] = 0
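The signed and unsigned tests give different answers for the same bit pattern, and their 0/1 result is what a following beq or bne consumes. This is a minimal sketch; the function names are hypothetical, and the cast to a signed type assumes two's complement (guaranteed since C++20, and the behavior of common compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// slt: signed comparison, result is 1 or 0.
constexpr std::uint32_t slt(std::uint32_t rs, std::uint32_t rt) {
    return static_cast<std::int32_t>(rs) < static_cast<std::int32_t>(rt) ? 1u : 0u;
}

// sltu: the same test on the raw unsigned values.
constexpr std::uint32_t sltu(std::uint32_t rs, std::uint32_t rt) {
    return rs < rt ? 1u : 0u;
}

// beq: next PC is addr when the registers are equal, otherwise PC+4.
constexpr std::uint32_t beq_next_pc(std::uint32_t pc, std::uint32_t rs,
                                    std::uint32_t rt, std::uint32_t addr) {
    return rs == rt ? addr : pc + 4;
}

// bne: the opposite condition.
constexpr std::uint32_t bne_next_pc(std::uint32_t pc, std::uint32_t rs,
                                    std::uint32_t rt, std::uint32_t addr) {
    return rs != rt ? addr : pc + 4;
}

// 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
static_assert(slt(0xFFFFFFFFu, 1u) == 1u, "-1 < 1 as signed values");
static_assert(sltu(0xFFFFFFFFu, 1u) == 0u, "4294967295 is not below 1");
```

A "branch if less than" is then `slt` into a temporary followed by `bne` against zero, for example `bne_next_pc(pc, slt(a, b), 0, target)`.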
ISA comparison
MIPS                        x86
BEQ  $t0, $t1, label        CMP eax, ebx
                            JZ  label
SLT  $t2, $t1, $t0          CMP eax, ebx
BNE  $t2, $zero, label      JL  label
SLTI $t2, $t1, 5            CMP eax, 5
BNE  $t2, $zero, label      JL  label
Flags
- Only used by some ISAs
- Control execution
- Check status of the last instruction
- Usual flags
  - Z – zero flag
  - S – sign flag
  - C – carry flag
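On a flag-based ISA, an arithmetic instruction records these flags as a side effect and a later conditional jump reads them. Below is a minimal sketch of how Z, S, and C could be derived from a 32-bit addition. `Flags`, `AddResult`, `add_with_flags`, and `jz_taken` are hypothetical names, and real ISAs define more flags and per-instruction rules.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag set: zero, sign, carry.
struct Flags {
    bool z;
    bool s;
    bool c;
};

struct AddResult {
    std::uint32_t value;
    Flags flags;
};

// 32-bit add that also reports the status of the result.
inline AddResult add_with_flags(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t wide = static_cast<std::uint64_t>(a) + b;  // keeps bit 32
    const std::uint32_t value = static_cast<std::uint32_t>(wide);
    const Flags flags{
        value == 0,         // Z: result is zero
        (value >> 31) != 0, // S: top bit of the result is set
        (wide >> 32) != 0   // C: carry out of bit 31
    };
    return AddResult{value, flags};
}

// A "jump if zero" only inspects the flag left by the previous instruction.
inline bool jz_taken(const Flags& flags) { return flags.z; }
```

Adding 1 to 0xFFFFFFFF wraps to 0 and sets both Z and C, while adding 1 to 0x7FFFFFFF sets only S.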
CPU
- Architecture
  - Memory controller
  - Cache hierarchy
  - Core
- Registers
  - Types
- Logical processor
  - Hyper threading
- Instructions
Instruction
- Simple command to the CPU
- Encoding
- Assembler
- Operands
- Instruction flow
  - PC
  - Stack?
    - SP
ISA
- Instruction set architecture
  - Abstract model of CPU
- Classification
  - CISC – Complex Instruction Set Computer
  - RISC – Reduced Instruction Set Computer
  - VLIW – Very Long Instruction Word
  - EPIC – Explicitly Parallel Instruction Computing
- Orthogonality
- Accumulator
- Load-Execute-Store
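CPU – simplified scheme
[Diagram: one package with four cores (CORE 0-CORE 3) sharing the L3/LLC cache. Each core runs two hardware threads (T0/T4, T1/T5, T2/T6, T3/T7) and contains execution units (EU), L1I and L1D caches, and an L2 cache.]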
Real CPU scheme – package
- Intel Coffee Lake
Real CPU scheme – core
Real CPU die
CPU architecture – pipeline
- Current CPU
  - 14-19 stages

CPU architecture – superscalar processor
- Current CPU
  - 5-way, asymmetric
CPU architecture – out-of-order execution
[Diagram: Decoder, µOPs, Reservation station (pool), execution ports 0-7, Reorder buffer. Unit labels shown on the ports: I/V ALU, I ALU, AGU, I/V MUL, Vec Shuff, I Logic, Load, Store, I/V Logic, F ADD, Branch, String, Bit scan, Comp Int, F FMA, AES, SQRT, I/F DIV.]