16.482 / 16.561 Computer Architecture and Design

16.482 / 16.561 Computer Architecture and Design
Instructor: Dr. Michael Geiger
Fall 2013
Lecture 1: Course overview; Introduction to computer architecture

Lecture outline
  • Course overview
      - Instructor information
      - Course materials
      - Course policies
      - Resources
      - Course outline
  • Introduction to computer architecture
9/24/2020 Computer Architecture Lecture 1

Course staff & meeting times
  • Lectures: Th 6:30-9:20, Kitson 310
  • Instructor: Dr. Michael Geiger
      - E-mail: Michael_Geiger@uml.edu
      - Phone: 978-934-3618 (x43618 on campus)
      - Office: 118A Perry Hall
      - Office hours: M 1-2:30, W 1-2:30, Th 3-4:30

Course materials
  • Textbook: David A. Patterson and John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 5th edition, 2013.
      - ISBN: 9780124077263
  • Course tools: TBD, but will likely work with the QtSpim simulator (link on web page)

Additional course materials
  • Course websites:
      - http://mgeiger.eng.uml.edu/compArch/sp14/index.htm
      - http://mgeiger.eng.uml.edu/compArch/sp14/schedule.htm
      - Will contain lecture slides, handouts, assignments
  • Discussion group through piazza.com:
      - Allows common questions to be answered for everyone
      - All course announcements will be posted here
      - Will use as class mailing list; please enroll ASAP

Course policies
  • Prerequisites: 16.265 (Logic Design) and 16.317 (Microprocessors I)
  • Academic honesty
      - All assignments are to be done individually unless explicitly specified otherwise by the instructor
      - Any copied solutions, whether from another student or an outside source, are subject to penalty
      - You may discuss general topics or help one another with specific errors, but do not share assignment solutions
          * Must acknowledge assistance from a classmate in your submission

Grading and exam dates
  • Grading breakdown
      - Homework assignments: 55%
      - Midterm exam: 20%
      - Final exam: 25%
  • Exam dates
      - Midterm exam: Thursday, March 13 in class
      - Final exam: Thursday, May 1

Tentative course outline
  • General computer architecture introduction
  • Instruction set architecture
  • Digital arithmetic
  • Datapath/control design
      - Basic datapath
      - Pipelining
      - Multiple issue and instruction scheduling
  • Memory hierarchy design
      - Caching
      - Virtual memory
  • Storage and I/O
  • Multiprocessor systems

What is computer architecture?
  • High-level description of
      - Computer hardware
          * Less detail than logic design, more detail than black box
      - Interaction between software and hardware
          * Look at how performance can be affected by different algorithms, code translations, and hardware designs
  • Can use to explain
      - General computation
      - A class of computers
      - A specific system

What is computer architecture?
  [Figure: software, instruction set, hardware]
  • Classical view: instruction set architecture (ISA)
      - Boundary between hardware and software
      - Provides abstraction at both high level and low level
  • More modern view: ISA + hardware design
      - Can talk about processor architecture, system architecture

Role of the ISA
  • User writes high-level language (HLL) program
  • Compiler converts HLL program into assembly for the particular instruction set architecture (ISA)
  • Assembler converts assembly into machine language (bits) for that ISA
  • Resulting machine language program is loaded into memory and run

ISA design goals
  • The ultimate goals of the ISA designer are
      1. To create an ISA that allows for fast hardware implementations
      2. To simplify choices for the compiler
      3. To ensure the longevity of the ISA by anticipating future technology trends
  • Often tradeoffs (particularly between 1 & 2)
  • Example ISAs: x86, PowerPC, SPARC, ARM, MIPS, IA-64
      - May have multiple hardware implementations of the same ISA
          * Example: i386, i486, Pentium Pro, Pentium III, Pentium IV

ISA design
  • Think about a HLL statement like
        X[i] = i * 2;
  • ISA defines how such statements are translated to machine code
      - What information is needed?

ISA design (continued)
  • Questions every ISA designer must answer
      - How will the processor implement this statement?
          * What operations are available?
          * How many operands does each instruction use?
          * How do we reference the operands?
      - Where are X[i] and i located?
          * What types of operands are supported?
          * How big are those operands?
      - Instruction format issues
          * How many bits per instruction?
          * What does each bit or set of bits represent?
          * Are all instructions the same length?

Design goal: fast hardware
  • From ISA perspective, must understand how processor executes instruction
      1. Fetch the instruction from memory
      2. Decode the instruction
      3. Determine addresses for operands
      4. Fetch operands
      5. Execute instruction
      6. Store result (and go back to step 1 …)
  • Steps 1, 2, and 5 involve operation issues
      - What types of operations are supported?
  • Steps 2-6 involve operand issues
      - Operand size, number, location
  • Steps 1-3 involve instruction format issues
      - How many bits in instruction, what does each field mean?

Designing fast hardware
  • To build a fast computer, we need
      1. Fast fetching and decoding of instructions
      2. Fast operand access
      3. Fast operation execution
  • Two broad areas of hardware that we must optimize to ensure good performance
      - Datapaths pass data to different units for computation
      - Control determines flow of data through datapath and operation of each functional unit

Designing fast hardware
  • Fast instruction fetch and decode
      - We'll address this (from an ISA perspective) today
  • Fast operand access
      - ISA classes: where do we store operands?
      - Addressing modes: how do we specify operand locations?
          * We'll also discuss this today
      - We know registers can be used for fast accesses
      - We'll talk about increasing memory speeds later
  • Fast execution of simple operations
      - Optimize common case
      - Dealing with multi-cycle operations: one of our first challenges

Making operations fast
  • More complex operations take longer, so keep things simple!
  • Many programs contain mostly simple operations
      - add, and, load, branch …
  • Optimize the common case
      - Make these simple operations work well!
      - Can execute them in a single cycle (with a fast clock, too)
          * If you're wondering how, wait until we talk about pipelining ...

Specifying operands
  • Most common arithmetic instructions have three operands
      - 2 source operands, 1 destination operand
      - e.g. A = B + C becomes add A, B, C
  • ISA classes for specifying operands:
      - Accumulator
          * Uses a single register (fast memory location close to processor)
          * Requires only one address per instruction (ADD addr)
      - Stack
          * Requires no explicit memory addresses (ADD)
      - Memory-Memory
          * All 3 operands may be in memory (ADD addr1, addr2, addr3)
      - Load-Store
          * All arithmetic operations use only registers (ADD r1, r2, r3)
          * Only load and store instructions reference memory

Making operand access fast
  • Operands (generally) in one of two places
      - Memory
      - Registers
  • Which would we prefer to use? Why?
  • Advantages of registers as operands
      - Instructions are shorter
          * Fewer possible locations to specify, so fewer bits
      - Fast implementation
          * Fast to access and easy to reuse values
          * We'll talk about fast memory later …
  • Where else can operands be encoded?
      - Directly in the instruction: ADDI R1, R2, 3
      - Called immediate operands

RISC approach
  • Fixed-length instructions that have only a few formats
      - Simplify instruction fetch and decode
      - Sacrifice code density
          * Some bits are wasted for some instruction types
          * Requires more memory
  • Load-store architecture
      - Allows fast implementation of simple instructions
      - Easier to pipeline
      - Sacrifice code density
          * More instructions than register-memory and memory-memory ISAs
  • Limited number of addressing modes
      - Simplify effective address (EA) calculation to speed up memory access
  • Few (if any) complex arithmetic functions
      - Build these from simpler instructions

MIPS: A "Typical" RISC ISA
  • 32-bit fixed format instruction (3 formats)
  • Registers
      - 32 32-bit integer GPRs (R1-R31, R0 always = 0)
      - 32 32-bit floating-point GPRs (F0-F31)
          * For double-precision FP, registers paired
  • 3-address, reg-reg arithmetic instruction
  • Single address mode for load/store: base + displacement
  • Simple branch conditions
  • Delayed branch

Operands
  • Example: load-store instructions of form:
        LOAD  R2, C
        ADD   R3, R1, R2
        STORE R3, A
  • Three general classes of operands
      - Immediate operands: encoded directly in instruction
          * Example: ADDI R3, R1, 25
      - Register operands: register number encoded in instruction, register holds data
          * Example: R1, R2, and R3 in above code sequence
      - Memory operands: address encoded in instruction, data stored in memory
          * Example: A and C in above code sequence
          * In load-store processor, can't directly operate on these data
9/24/2020 Computer Architecture Lecture 2
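The LOAD / ADD / STORE sequence on this slide can be traced with a toy interpreter. This is only a sketch: the `run` helper, the tuple encoding of instructions, and the starting values of R1 and C are assumptions made for illustration, not part of the course tools.

```python
# Toy load-store machine: registers and memory are plain dicts.
# The tuple encoding and run() are hypothetical, for illustration only.

def run(program, regs, mem):
    """Execute a list of (opcode, operands...) tuples in order."""
    for op, *args in program:
        if op == "LOAD":        # LOAD Rd, addr: register <- memory operand
            rd, addr = args
            regs[rd] = mem[addr]
        elif op == "STORE":     # STORE Rs, addr: memory operand <- register
            rs, addr = args
            mem[addr] = regs[rs]
        elif op == "ADD":       # ADD Rd, Rs1, Rs2: register operands only
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "ADDI":      # ADDI Rd, Rs, imm: immediate operand
            rd, rs, imm = args
            regs[rd] = regs[rs] + imm
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs, mem

regs = {"R1": 5, "R2": 0, "R3": 0}   # assumed starting values
mem = {"A": 0, "C": 7}
program = [
    ("LOAD", "R2", "C"),
    ("ADD", "R3", "R1", "R2"),
    ("STORE", "R3", "A"),
]
run(program, regs, mem)
```

Note that ADD never touches `mem`: in a load-store machine the memory operands A and C are reachable only through LOAD and STORE.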

Memory operands
  • Specifying addresses is not straightforward
      - Length of address usually = length of instruction
      - Obviously don't want to dedicate that much space to an entire address
      - What are some ways we might get around this?
  • Addressing modes: different ways of specifying operand location
      - Already discussed two
          * Immediate: operand encoded directly in instruction
          * Register direct: operand in register, register # in instruction
      - For values in memory, have to calculate effective address

Immediate, register direct modes
  • Immediate: instruction contains the operand
      - Example: addi R3, R1, 100
      - [Figure: instruction = op | … | operand]
  • Register direct: register contains the operand
      - Example: add R3, R1, R2
      - [Figure: instruction = op | R1 | …; register R1 holds the operand]

Register indirect mode
  • Register indirect: register contains address of operand
      - Example: lw R2, (R1)
      - [Figure: instruction = op | R1 | …; register R1 holds the operand address; memory at that address holds the operand]

Memory indirect mode
  • Memory indirect: memory contains address of operand
      - Allows for more efficient pointer implementations
      - [Figure: instruction = op | R1 | …; register R1 holds the address of a memory word that holds the operand address; memory at that operand address holds the operand]

Base + displacement mode
  • Base + displacement: Operand address = register + constant
      - Example: lw R3, 8(R1)
      - [Figure: instruction = op | R1 | 8 | …; the base address in R1 is added to the constant 8 to give the address of the operand in memory]
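The three memory addressing modes shown so far differ only in how the effective address is computed. A minimal sketch, assuming a dictionary-based register file and memory; the function name, mode strings, and sample values are illustrative only.

```python
def effective_address(mode, regs, mem, reg, disp=0):
    """Return the address of the operand for a memory addressing mode."""
    if mode == "register_indirect":    # lw R2, (R1): address is in the register
        return regs[reg]
    if mode == "base_displacement":    # lw R3, 8(R1): register + constant
        return regs[reg] + disp
    if mode == "memory_indirect":      # register points at a memory word
        return mem[regs[reg]]          # ... which holds the operand address
    raise ValueError(f"unknown mode {mode}")

regs = {"R1": 100}                     # assumed contents
mem = {100: 200, 108: 42, 200: 99}     # assumed contents

ea_indirect = effective_address("register_indirect", regs, mem, "R1")
ea_base = effective_address("base_displacement", regs, mem, "R1", disp=8)
ea_memind = effective_address("memory_indirect", regs, mem, "R1")
```

Memory indirect needs one extra memory read just to find the address, which is one reason a RISC ISA such as MIPS leaves it out.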

PC-relative mode
  • Need some way to specify instruction addresses, too
  • PC-relative: Address of next instruction = PC + const
      - PC = program counter
          * The memory address of the current instruction
      - Used in conditional branches
      - [Figure: instruction = op | 4 | …; the PC is added to the constant to give the address of the next instruction in memory]

Instruction formats
  • Instruction formats define what each bit or set of bits means
  • Different instructions need to specify different information
      - Different numbers of operands, types of operands, etc.
  • What are the pros and cons if we have …
      - Many instruction formats
          + Can tailor format to specific instructions
          - Complicate decoding
          - May use more bits
      - Few fixed, regular formats
          + Simpler decoding
          - Less flexibility in encoding
  • One of our hardware design goals: fast decoding

Instruction length
  • What are the pros and cons if we use …
      - Variable-length instructions (CISC)
          + No bits wasted on unused fields/operands
          - Complex fetch and decoding
      - Fixed-length instructions (RISC)
          + Simple fetch and decoding
          - Instruction encoding is less compact

MIPS instruction formats
  • All fixed length (32-bit) instructions
  • Register instructions: R-type
        bits:   31-26   25-21   20-16   15-11   10-6    5-0
        field:  op      rs      rt      rd      shamt   funct
  • Immediate instructions: I-type
        bits:   31-26   25-21   20-16   15-0
        field:  op      rs      rt      immediate/address
  • Jump instructions: J-type
        bits:   31-26   25-0
        field:  op      target (address)

MIPS instruction formats (cont.)
  • Notation from the previous slide
      - op is a 6-bit operation code (opcode)
      - rs is a 5-bit source register specifier
      - rt is a 5-bit (source or destination) register specifier or branch condition
      - rd is a 5-bit destination register specifier
      - shamt is a 5-bit shift amount
      - funct is a 6-bit function field
      - immediate is a 16-bit immediate, branch displacement, or memory address displacement
      - target is a 26-bit jump target address
  • Simplifications
      - Fixed length (32 bits)
      - Limited number of field types
      - Many fields located in same location in different formats

MIPS instruction fields
  • Assume a MIPS instruction is represented by the hexadecimal value 0xDEADBEEF
  • List the values for each instruction field, assuming that the instruction is
      - An R-type instruction
      - An I-type instruction
      - A J-type instruction

MIPS instruction fields
  • List the values for each instruction field, assuming that 0xDEADBEEF is
      - An R-type instruction
      - An I-type instruction
      - A J-type instruction
  • 0xDEADBEEF = 1101 1110 1010 1101 1011 1110 1110 1111
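One way to check an answer to this exercise is to extract the fields with shifts and masks, using the bit positions from the format slides. The helper name `decode_fields` is an assumption; the sketch simply returns every field of all three formats at once, since they overlap in the same 32-bit word.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into R-, I-, and J-type fields."""
    word &= 0xFFFFFFFF
    return {
        "op":        (word >> 26) & 0x3F,   # bits 31-26 (all formats)
        "rs":        (word >> 21) & 0x1F,   # bits 25-21 (R, I)
        "rt":        (word >> 16) & 0x1F,   # bits 20-16 (R, I)
        "rd":        (word >> 11) & 0x1F,   # bits 15-11 (R)
        "shamt":     (word >> 6) & 0x1F,    # bits 10-6  (R)
        "funct":     word & 0x3F,           # bits 5-0   (R)
        "immediate": word & 0xFFFF,         # bits 15-0  (I)
        "target":    word & 0x03FFFFFF,     # bits 25-0  (J)
    }

fields = decode_fields(0xDEADBEEF)
for name, value in fields.items():
    print(f"{name:9s} = {value:#x} ({value})")
```

Because op, rs, and rt sit in the same place in every format that uses them, the decoder needs only one extraction per field, which is the "fields located in same location" simplification from the previous slide.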

MIPS addressing modes
  • MIPS implements several of the addressing modes discussed earlier
  • To address operands
      - Immediate addressing
          * Example: addi $t0, $t1, 150
      - Register addressing
          * Example: sub $t0, $t1, $t2
      - Base addressing (base + displacement)
          * Example: lw $t0, 16($t1)
  • To transfer control to a different instruction
      - PC-relative addressing
          * Used in conditional branches
      - Pseudo-direct addressing
          * Concatenates 26-bit address (from J-type instruction) shifted left by 2 bits with the 4 upper bits of the PC

MIPS integer registers
  Name        Register number   Usage
  $zero       0                 Constant value 0
  $v0-$v1     2-3               Values for results and expression evaluation
  $a0-$a3     4-7               Function arguments
  $t0-$t7     8-15              Temporary registers
  $s0-$s7     16-23             Callee save registers
  $t8-$t9     24-25             Temporary registers
  $gp         28                Global pointer
  $sp         29                Stack pointer
  $fp         30                Frame pointer
  $ra         31                Return address
  • List gives mnemonics used in assembly code
      - Can also directly reference by number ($0, $1, etc.)
  • Conventions
      - $s0-$s7 are preserved on a function call (callee save)
      - Register 1 ($at) reserved for assembler
      - Registers 26-27 ($k0-$k1) reserved for operating system

Computations in MIPS
  • All computations occur on full 32-bit words
  • Computations use signed (2's complement) or unsigned operands (non-negative numbers)
      - Example: 1111 1111 1111 1111 1111 1111 1111 1111 is -1 as a signed number and 4,294,967,295 as an unsigned number
  • Operands are in registers or are immediates
  • Immediate (constant) values are only 16 bits
      - 32-bit instruction must also hold opcode, source and destination register numbers
      - Value is sign extended to 32 bits before usage
      - Example: 1000 0000 0000 0000 becomes 1111 1111 1111 1111 1000 0000 0000 0000
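Sign extension and the signed/unsigned reading of a 32-bit word can be sketched in a few lines. The helper names `sign_extend16` and `as_signed32` are assumptions for illustration; Python integers are unbounded, so the masks stand in for the fixed 32-bit register width.

```python
def sign_extend16(imm):
    """Extend a 16-bit immediate to 32 bits by copying bit 15 upward."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if imm & 0x8000 else imm

def as_signed32(word):
    """Interpret a 32-bit pattern as a 2's complement signed number."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word

all_ones = 0xFFFFFFFF
extended = sign_extend16(0x8000)

print(as_signed32(all_ones), all_ones)   # signed and unsigned readings
print(f"{extended:#010x}")
```

The same bit pattern gives two different numbers depending on the interpretation; the hardware stores only the bits, and the instruction chooses how to read them.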

MIPS instruction categories
  • Data transfer instructions
      - Example: lw, sb
      - Always I-type
  • Computational instructions (arithmetic/logical)
      - Examples: add, and, sll
      - Can be R-type or I-type
  • Control instructions
      - Example: beq, jr
      - Any of the three formats (R-type, I-type, J-type)

MIPS data transfer instructions
  • Memory operands can be bytes, half-words (2 bytes), or words (4 bytes [32 bits])
      - opcode determines the operand size
  • Half-word and word addresses must be aligned
      - Divisible by number of bytes being accessed
      - Bit 0 must be zero for half-word accesses
      - Bits 0 and 1 must be zero for word accesses
  [Figure: memory drawn in rows starting at byte addresses 0, 4, 8, 12, 16, 20, marking an aligned word, an aligned half-word, an unaligned half-word, and an unaligned word]
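The alignment rule reduces to a divisibility test, or equivalently a test on the low address bits. A minimal sketch; the function names are assumptions.

```python
def is_aligned(addr, size):
    """True if addr is divisible by the access size in bytes (1, 2, or 4)."""
    return addr % size == 0

def is_aligned_bits(addr, size):
    """Same test using the low bits: size is a power of two."""
    return addr & (size - 1) == 0

for addr in (8, 6, 7):
    print(addr, is_aligned(addr, 2), is_aligned(addr, 4))
```

For a half-word (size 2) the mask is 0b1, so only bit 0 is checked; for a word (size 4) the mask is 0b11, so bits 0 and 1 are checked, matching the two bullets above.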

Byte order ("endianness")
  • In a multi-byte operand, how are the bytes ordered in memory?
  • Assume the value 1,000,000 (0xF4240) is stored at address 80
      - In a big-endian machine, the most significant byte (the "big" end) is at address 80
            address:  … 79   80   81   82   83   84 …
            data:            00   0F   42   40
      - In a little-endian machine, it's the other way around
            address:  … 79   80   81   82   83   84 …
            data:            40   42   0F   00

Big-endian vs. little-endian
  • Big-endian systems
      - MIPS, Sparc, Motorola 68000
  • Little-endian systems
      - Most Intel processors, Alpha, VAX
  • Neither one is "better"; it's simply a matter of preference …
  • … but there are compatibility issues that arise when transferring data from one to the other
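The two byte orders, and the compatibility problem when data moves between them, can be seen with Python's built-in `int.to_bytes` and `int.from_bytes`. The value 0xF4240 is the one used on the byte-order slide; the variable names are illustrative.

```python
value = 0xF4240                          # 1,000,000

big = value.to_bytes(4, "big")           # most significant byte first
little = value.to_bytes(4, "little")     # least significant byte first

print(big.hex(" "))                      # byte at the lowest address printed first
print(little.hex(" "))

# Reading big-endian bytes on a little-endian machine without swapping
# yields a different number: the compatibility issue mentioned above.
misread = int.from_bytes(big, "little")
print(hex(misread))
```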

MIPS data transfer instructions (cont.)
  • For all cases, calculate effective address first
      - MIPS doesn't use segmented memory model like x86
      - Flat memory model: EA = address being accessed
  • lb, lh, lw
      - Get data from addressed memory location
      - Sign extend if lb or lh, load into rt
  • lbu, lhu, lwu
      - Get data from addressed memory location
      - Zero extend if lbu or lhu, load into rt
  • sb, sh, sw
      - Store data from rt (partial if sb or sh) into addressed location

Data transfer examples
  • Say memory holds the word 0xABCD1234, starting at address 0x1000, $t0 holds the value 0x1000, and $s0 holds 0xDEADBEEF
  • What are the results of the following instructions?
      - lh $t1, 2($t0)
      - lb $t2, 1($t0)
      - lbu $t3, 0($t0)
      - sh $s0, 0($t0)
      - sb $s0, 3($t0)

Solutions to examples
  • If mem[0x1000] = 0xABCD1234, $t0 holds the value 0x1000, and $s0 holds 0xDEADBEEF
      - lh $t1, 2($t0)
          * $t1 = mem[0x1002] = 0x00001234
      - lb $t2, 1($t0)
          * $t2 = mem[0x1001] = 0xFFFFFFCD
      - lbu $t3, 0($t0)
          * $t3 = mem[0x1000] = 0x000000AB
      - sh $s0, 0($t0)
          * Change 16 bits at address 0x1000
          * mem[0x1000] = 0xBEEF1234
      - sb $s0, 3($t0)
          * Change 8 bits at address 0x1003
          * mem[0x1000] = 0xABCD12EF
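These results can be reproduced with a small byte-addressed memory model. The sketch assumes big-endian byte order (as listed for MIPS in these slides) and applies each store to a fresh copy of memory, which is how the two store results above are stated; `make_mem`, `load`, `store`, and `word` are hypothetical helpers.

```python
BASE = 0x1000                                   # address of the first byte

def make_mem():
    """Four bytes of memory holding 0xABCD1234 in big-endian order."""
    return bytearray((0xABCD1234).to_bytes(4, "big"))

def load(mem, addr, size, signed):
    """lb/lh/lw (signed=True) or lbu/lhu (signed=False); returns 32 bits."""
    off = addr - BASE
    value = int.from_bytes(mem[off:off + size], "big", signed=signed)
    return value & 0xFFFFFFFF                   # keep the 32-bit pattern

def store(mem, addr, size, value):
    """sb/sh/sw: write the low `size` bytes of value."""
    off = addr - BASE
    low = value & ((1 << (8 * size)) - 1)
    mem[off:off + size] = low.to_bytes(size, "big")

def word(mem):
    return int.from_bytes(mem[0:4], "big")

t0, s0 = 0x1000, 0xDEADBEEF
mem = make_mem()
t1 = load(mem, t0 + 2, 2, signed=True)          # lh  $t1, 2($t0)
t2 = load(mem, t0 + 1, 1, signed=True)          # lb  $t2, 1($t0)
t3 = load(mem, t0 + 0, 1, signed=False)         # lbu $t3, 0($t0)

mem_sh = make_mem()
store(mem_sh, t0 + 0, 2, s0)                    # sh  $s0, 0($t0)
mem_sb = make_mem()
store(mem_sb, t0 + 3, 1, s0)                    # sb  $s0, 3($t0)
```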

More data transfer examples
  • If A is an array of words and the starting address of A (memory address of A[0]) is in $s4, perform the operation: A[3] = A[0] + A[1] - A[2]
  • Solution:
        lw  $s1, 0($s4)       # A[0] into $s1
        lw  $s2, 4($s4)       # A[1] into $s2
        add $s1, $s1, $s2     # A[0]+A[1] into $s1
        lw  $s2, 8($s4)       # A[2] into $s2
        sub $s1, $s1, $s2     # A[0]+A[1]-A[2] into $s1
        sw  $s1, 12($s4)      # store result in A[3]

MIPS computational instructions
  • Arithmetic
      - Signed: add, sub, mult, div
      - Unsigned: addu, subu, multu, divu
      - Immediate: addi, addiu
          * Immediates are sign-extended (why?)
  • Logical
      - and, or, nor, xor
      - andi, ori, xori
          * Immediates are zero-extended (why?)
  • Shift (logical and arithmetic)
      - srl, sll: shift right (left) logical
          * Shift the value in rt by shamt bits to right or left
          * Fill empty positions with 0s
          * Store the result in rd
      - sra: shift right arithmetic
          * Same as above, but sign-extend the high-order bits

Signed vs. unsigned computation
  • What's the difference between add and addu?
      - Result looks exactly the same
      - addu ignores overflow
          * add used for most normal computation
          * addu used for memory addresses
          * C language ignores overflow, so compiler generates unsigned computation
      - 4-bit example: (-8) + (-8) = (-16), but can't represent -16 with 4 bits
            1000 + 1000 = 1 0000
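The 4-bit example can be worked through in code. This sketch models an n-bit adder with masks; the function name `add_nbit` and its return tuple are assumptions. It shows that the stored bits are the same whether or not overflow is checked; the only difference is whether the overflow condition is reported (add) or ignored (addu).

```python
def add_nbit(a, b, bits=4):
    """Add two signed values in an n-bit adder.

    Returns (result_bits, signed_value, overflow).
    """
    mask = (1 << bits) - 1
    raw = (a & mask) + (b & mask)          # may carry out of the top bit
    result = raw & mask                    # what actually fits in n bits
    sign_bit = result >> (bits - 1)
    signed_value = result - (1 << bits) if sign_bit else result
    overflow = signed_value != a + b       # true sum not representable
    return result, signed_value, overflow

bits_out, value_out, overflowed = add_nbit(-8, -8)
print(f"{bits_out:04b}", value_out, overflowed)
```

Here 1000 + 1000 produces the 5-bit sum 1 0000; the low 4 bits are 0000, so the stored result is 0 rather than -16.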

MIPS computational instructions (cont.)
  • Set less than
      - Used to evaluate conditions
          * Set rd (rt for slti, sltiu) to 1 if condition is met, set to 0 otherwise
      - slt, sltu
          * Condition is rs < rt
      - slti, sltiu
          * Condition is rs < immediate
          * Immediate is sign-extended
  • Load upper immediate (lui)
      - Shift immediate 16 bits left (appending 16 zeros to right), put 32-bit result into rt

Examples of arithmetic instructions
  Instruction              Meaning             Notes
  add   $s1, $s2, $s3      $s1 = $s2 + $s3     3 registers; signed addition
  sub   $s1, $s2, $s3      $s1 = $s2 - $s3     3 registers; signed subtraction
  addu  $s1, $s2, $s3      $s1 = $s2 + $s3     3 registers; unsigned addition
  addi  $s1, $s2, 50       $s1 = $s2 + 50      2 registers and immediate; signed
  addiu $s1, $s2, 50       $s1 = $s2 + 50      2 registers and immediate; unsigned

Examples of logical instructions
  Instruction              Meaning                 Notes
  and $s1, $s2, $s3        $s1 = $s2 & $s3         3 registers; logical AND
  or  $s1, $s2, $s3        $s1 = $s2 | $s3         3 registers; logical OR
  xor $s1, $s2, $s3        $s1 = $s2 ^ $s3         3 registers; logical XOR
  nor $s1, $s2, $s3        $s1 = ~($s2 | $s3)      3 registers; logical NOR

Computational instruction examples
  • Say $t0 = 0x00000001, $t1 = 0x00000004, $t2 = 0xFFFFFFFF
  • What are the results of the following instructions?
      - sub $t3, $t1, $t0
      - addi $t4, $t1, 0xFFFF
      - andi $t5, $t2, 0xFFFF
      - sll $t6, $t0, 5
      - slt $t7, $t0, $t1
      - lui $t8, 0x1234

Solutions to examples
  • If $t0 = 0x00000001, $t1 = 0x00000004, $t2 = 0xFFFFFFFF
      - sub $t3, $t1, $t0
          * $t3 = 0x00000004 - 0x00000001 = 0x00000003
      - addi $t4, $t1, 0xFFFF
          * Immediate 0xFFFF is sign-extended to 0xFFFFFFFF (-1)
          * $t4 = 0x00000004 + 0xFFFFFFFF = 0x00000003
      - andi $t5, $t2, 0xFFFF
          * Immediate 0xFFFF is zero-extended to 0x0000FFFF
          * $t5 = 0xFFFFFFFF AND 0x0000FFFF = 0x0000FFFF
      - sll $t6, $t0, 5
          * $t6 = 0x00000001 << 5 = 0x00000020
      - slt $t7, $t0, $t1
          * $t7 = 1 if ($t0 < $t1)
          * $t0 = 0x00000001, $t1 = 0x00000004, so $t7 = 1
      - lui $t8, 0x1234
          * $t8 = 0x1234 << 16 = 0x12340000
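The six results can be checked by modelling each instruction with 32-bit masks. This is a sketch under stated assumptions: the helpers `sext16` and `signed32` are illustrative, and $t2 is taken to be 0xFFFFFFFF (the andi result is 0x0000FFFF whether $t2 is 0x0000FFFF or 0xFFFFFFFF, so that assumption does not change the answer).

```python
MASK = 0xFFFFFFFF

def sext16(imm):
    """Sign-extend a 16-bit immediate to a Python int (may be negative)."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def signed32(x):
    """Read a 32-bit pattern as a 2's complement value."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

t0, t1, t2 = 0x00000001, 0x00000004, 0xFFFFFFFF   # $t2 value is assumed

t3 = (t1 - t0) & MASK                  # sub  $t3, $t1, $t0
t4 = (t1 + sext16(0xFFFF)) & MASK      # addi $t4, $t1, 0xFFFF (sign-extended)
t5 = t2 & 0xFFFF                       # andi $t5, $t2, 0xFFFF (zero-extended)
t6 = (t0 << 5) & MASK                  # sll  $t6, $t0, 5
t7 = 1 if signed32(t0) < signed32(t1) else 0   # slt $t7, $t0, $t1
t8 = (0x1234 << 16) & MASK             # lui  $t8, 0x1234
```

The addi and andi lines use the same 16-bit immediate but extend it differently, which is the point of the "(why?)" questions on the computational instructions slide: arithmetic immediates are usually small signed constants, while logical immediates are bit masks.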

MIPS control instructions
  • Branch instructions test a condition
      - Equality or inequality of rs and rt
          * beq, bne
          * Often coupled with slt, sltu, sltiu
      - Value of rs relative to rt
          * Pseudoinstructions: blt, bgt, ble, bge
  • Target address: add sign-extended immediate to the PC
      - Since all instructions are words, immediate is shifted left two bits before being sign extended
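A sketch of the branch target calculation follows. One detail is an addition to the slide rather than a quote from it: in MIPS the offset is applied to the address of the instruction after the branch (PC + 4), not to the branch's own address. The function name and the sample addresses are assumptions.

```python
def branch_target(pc, imm16):
    """Target of a taken MIPS conditional branch located at address pc."""
    imm16 &= 0xFFFF
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16   # sign-extend
    # Instructions are 4 bytes, so the word offset is scaled by 4 (<< 2)
    # and applied relative to the next instruction (pc + 4).
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF

forward = branch_target(0x00400000, 0x0003)    # skip ahead 3 instructions
backward = branch_target(0x00400010, 0xFFFC)   # offset of -4 instructions
print(hex(forward), hex(backward))
```

Storing a word offset instead of a byte offset gives the 16-bit field four times the reach, roughly ±128 KB around the branch.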

Pseudoinstructions
  • Assembler recognizes certain "instructions" that aren't actually part of MIPS ISA
      - Common operations that can be implemented using other relatively simple operations
      - Easier to read and write assembly in terms of these pseudoinstructions
      - Example: MIPS only has beq, bne instructions, but assembler recognizes bgt, bge, blt, ble
  • Assembler converts pseudoinstruction into actual instruction(s)
      - If extra register needed, use $at
      - Example: bgt $t0, $t1, label becomes
            slt $at, $t1, $t0
            bne $at, $zero, label
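The expansion an assembler performs for the four comparison branches can be sketched as a table lookup. This is an illustration of the idea, not the code of any real assembler; `expand_branch` is a hypothetical helper, and real assemblers may pick different but equivalent sequences.

```python
def expand_branch(op, a, b, label):
    """Expand blt/bgt/bge/ble into slt + beq/bne using $at as scratch."""
    table = {
        "blt": (a, b, "bne"),   # a <  b : slt a,b is 1 -> branch if != 0
        "bgt": (b, a, "bne"),   # a >  b : same test with operands swapped
        "bge": (a, b, "beq"),   # a >= b : not (a < b) -> branch if == 0
        "ble": (b, a, "beq"),   # a <= b : not (b < a) -> branch if == 0
    }
    if op not in table:
        raise ValueError(f"not a comparison pseudoinstruction: {op}")
    x, y, branch = table[op]
    return [f"slt $at, {x}, {y}", f"{branch} $at, $zero, {label}"]

for line in expand_branch("bgt", "$t0", "$t1", "label"):
    print(line)
```

Every case needs only slt plus one of the two real branches, which is why the hardware can get away with implementing just beq and bne.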

MIPS control instructions (cont.)
  • Jump instructions unconditionally branch to the address formed by either
      - Shifting left the 26-bit target two bits and combining it with the 4 high-order PC bits
          * j
      - The contents of register $rs
          * jr
  • Branch-and-link and jump-and-link instructions also save the address of the next instruction into $ra
      - jal
      - Used for subroutine calls
      - jr $ra used to return from a subroutine
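The pseudo-direct address formation used by j and jal can be sketched with two masks. As an assumption beyond the slide text, the 4 high-order bits are taken from PC + 4 (the address of the following instruction), which is the MIPS convention; the function name and sample values are illustrative.

```python
def jump_target(pc, target26):
    """Address reached by a j/jal at address pc with a 26-bit target field."""
    upper = (pc + 4) & 0xF0000000               # 4 high-order PC bits
    lower = (target26 & 0x03FFFFFF) << 2        # 26-bit field, word-aligned
    return upper | lower

near = jump_target(0x10000000, 0x0000004)
far = jump_target(0x00400000, 0x2ADBEEF)        # target field of 0xDEADBEEF
print(hex(near), hex(far))
```

The shifted field supplies 28 address bits, so a jump can reach any instruction inside the current 256 MB region; going further requires jr with a full 32-bit address in a register.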

Compiling If Statements
  • C code:
        if (i==j) f = g+h;
        else f = g-h;
      - f, g, … in $s0, $s1, …
  • Compiled MIPS code:
              bne $s3, $s4, Else
              add $s0, $s1, $s2
              j   Exit
        Else: sub $s0, $s1, $s2
        Exit: …
      - Assembler calculates addresses

Compiling Loop Statements
  • C code:
        while (save[i] == k) i += 1;
      - i in $s3, k in $s5, address of save in $s6
  • Compiled MIPS code:
        Loop: sll  $t1, $s3, 2
              add  $t1, $t1, $s6
              lw   $t0, 0($t1)
              bne  $t0, $s5, Exit
              addi $s3, $s3, 1
              j    Loop
        Exit: …

Procedure Calling
  • Steps required
      1. Place parameters in registers
      2. Transfer control to procedure
      3. Acquire storage for procedure
      4. Perform procedure's operations
      5. Place result in register for caller
      6. Return to place of call
(Chapter 2, Instructions: Language of the Computer; § 2.8 Supporting Procedures in Computer Hardware)

Register Usage
  • $a0 - $a3: arguments (reg's 4 - 7)
  • $v0, $v1: result values (reg's 2 and 3)
  • $t0 - $t9: temporaries
      - Can be overwritten by callee
  • $s0 - $s7: saved
      - Must be saved/restored by callee
  • $gp: global pointer for static data (reg 28)
  • $sp: stack pointer (reg 29)
  • $fp: frame pointer (reg 30)
  • $ra: return address (reg 31)

Procedure Call Instructions
  • Procedure call: jump and link
        jal ProcedureLabel
      - Address of following instruction put in $ra
      - Jumps to target address
  • Procedure return: jump register
        jr $ra
      - Copies $ra to program counter
      - Can also be used for computed jumps
          * e.g., for case/switch statements

Leaf Procedure Example
  • C code:
        int leaf_example (int g, int h, int i, int j)
        {
          int f;
          f = (g + h) - (i + j);
          return f;
        }
      - Arguments g, …, j in $a0, …, $a3
      - f in $s0 (hence, need to save $s0 on stack)
      - Result in $v0

Leaf Procedure Example
  • MIPS code:
        leaf_example:
          addi $sp, $sp, -4      # save $s0 on stack
          sw   $s0, 0($sp)
          add  $t0, $a0, $a1     # procedure body
          add  $t1, $a2, $a3
          sub  $s0, $t0, $t1
          add  $v0, $s0, $zero   # result
          lw   $s0, 0($sp)       # restore $s0
          addi $sp, $sp, 4
          jr   $ra               # return

Non-Leaf Procedures
  • Procedures that call other procedures
  • For nested call, caller needs to save on the stack:
      - Its return address
      - Any arguments and temporaries needed after the call
  • Restore from the stack after the call

Non-Leaf Procedure Example
  • C code:
        int fact (int n)
        {
          if (n < 1) return 1;
          else return n * fact(n - 1);
        }
      - Argument n in $a0
      - Result in $v0

Non-Leaf Procedure Example
- MIPS code:
      fact:
          addi $sp, $sp, -8     # adjust stack for 2 items
          sw   $ra, 4($sp)      # save return address
          sw   $a0, 0($sp)      # save argument
          slti $t0, $a0, 1      # test for n < 1
          beq  $t0, $zero, L1
          addi $v0, $zero, 1    # if so, result is 1
          addi $sp, $sp, 8      #   pop 2 items from stack
          jr   $ra              #   and return
      L1: addi $a0, $a0, -1     # else decrement n
          jal  fact             # recursive call
          lw   $a0, 0($sp)      # restore original n
          lw   $ra, 4($sp)      #   and return address
          addi $sp, $sp, 8      # pop 2 items from stack
          mul  $v0, $a0, $v0    # multiply to get result
          jr   $ra              # and return
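To make the stack traffic of the recursive version visible, the same computation can be sketched in C with an explicit array standing in for the MIPS stack. This is an illustration rather than a translation: the name `fact_explicit_stack`, the `STACK_WORDS` size, and the limit of n <= 12 (the largest factorial that fits in a signed 32-bit integer) are assumptions made for the sketch, and only the saved argument is pushed because C has no visible return address.

```c
#include <stdint.h>

#define STACK_WORDS 16  /* hypothetical stack size; enough for n <= 12 */

/* Computes n! for n <= 12 using an explicit stack of saved arguments.
   Returns -1 if n is too large for a 32-bit result. */
int32_t fact_explicit_stack(int32_t n) {
    int32_t stack[STACK_WORDS];
    int sp = STACK_WORDS;  /* like $sp, the stack grows toward lower indices */

    if (n > 12) {
        return -1;
    }

    /* "Call" phase: each pending call saves its argument (sw $a0, 0($sp))
       and then decrements n (addi $a0, $a0, -1). */
    while (n >= 1) {
        stack[--sp] = n;
        n = n - 1;
    }

    /* Base case reached: result is 1 (addi $v0, $zero, 1). */
    int32_t result = 1;

    /* "Return" phase: each return restores its saved n (lw $a0, 0($sp)),
       pops the frame, and multiplies (mul $v0, $a0, $v0). */
    while (sp < STACK_WORDS) {
        result = stack[sp++] * result;
    }
    return result;
}
```

The two loops correspond to the two halves of the MIPS routine: everything before `jal fact` runs on the way down, and everything after it runs on the way back up.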

Local Data on the Stack
- Local data allocated by callee
  - e.g., C automatic variables
- Procedure frame (activation record)
  - Used by some compilers to manage stack storage

Memory Layout
- Text: program code
- Static data: global variables
  - e.g., static variables in C, constant arrays and strings
  - $gp initialized to address allowing ±offsets into this segment
- Dynamic data: heap
  - e.g., malloc in C, new in Java
- Stack: automatic storage
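One way to see these regions on a real machine is to print a sample address from each. The sketch below is a host-side illustration, not MIPS-specific: `show_layout` and `global_counter` are hypothetical names, and the actual addresses (and even their relative order) depend on the operating system and compiler, so no particular output should be expected.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int global_counter = 42;  /* static data: initialized global variable */

/* Prints one sample address from each memory region.
   Returns 0 on success, or -1 if the heap allocation fails. */
int show_layout(void) {
    int local = 0;                                 /* stack: automatic variable */
    int *heap_value = malloc(sizeof *heap_value);  /* dynamic data: heap */
    if (heap_value == NULL) {
        return -1;
    }
    *heap_value = local;

    /* A function's address lies in the text segment; it is converted through
       uintptr_t because %p expects a pointer to an object type. */
    printf("text   (code):   %p\n", (void *)(uintptr_t)&show_layout);
    printf("static (global): %p\n", (void *)&global_counter);
    printf("heap   (malloc): %p\n", (void *)heap_value);
    printf("stack  (local):  %p\n", (void *)&local);

    free(heap_value);
    return 0;
}
```

On many systems the stack address is far above the others, but that is a property of the platform rather than of C.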

String Copy Example
- C code (naïve):
  - Null-terminated string
      void strcpy (char x[], char y[])
      {
        int i;
        i = 0;
        while ((x[i]=y[i])!='\0')
          i += 1;
      }
  - Addresses of x, y in $a0, $a1
  - i in $s0

String Copy Example
- MIPS code:
      strcpy:
          addi $sp, $sp, -4      # adjust stack for 1 item
          sw   $s0, 0($sp)       # save $s0
          add  $s0, $zero, $zero # i = 0
      L1: add  $t1, $s0, $a1     # addr of y[i] in $t1
          lbu  $t2, 0($t1)       # $t2 = y[i]
          add  $t3, $s0, $a0     # addr of x[i] in $t3
          sb   $t2, 0($t3)       # x[i] = y[i]
          beq  $t2, $zero, L2    # exit loop if y[i] == 0
          addi $s0, $s0, 1       # i = i + 1
          j    L1                # next iteration of loop
      L2: lw   $s0, 0($sp)       # restore saved $s0
          addi $sp, $sp, 4       # pop 1 item from stack
          jr   $ra               # and return
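The indexed version recomputes the addresses of x[i] and y[i] on every iteration. A pointer-based variant, sketched below, advances the two addresses directly and so needs no index variable. The name `strcpy_ptr` is hypothetical, chosen to avoid clashing with the standard library's strcpy; like the slide's version, it assumes the destination buffer is large enough.

```c
/* Copies the null-terminated string y into x, including the terminator.
   The caller must ensure x has room for the whole string. */
void strcpy_ptr(char *x, const char *y) {
    /* Copy one byte, advance both pointers, and stop once the byte
       just copied was the null terminator. */
    while ((*x++ = *y++) != '\0') {
        /* nothing else to do per character */
    }
}
```

With no index to keep in $s0, a compiled version of this loop would also have no saved register to spill, so the stack adjustment in the MIPS listing could disappear.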

C Sort Example (§2.13 A C Sort Example to Put It All Together)
- Illustrates use of assembly instructions for a C bubble sort function
- Swap procedure (leaf)
      void swap(int v[], int k)
      {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
      }
  - v in $a0, k in $a1, temp in $t0

The Procedure Swap
      swap: sll $t1, $a1, 2    # $t1 = k * 4
            add $t1, $a0, $t1  # $t1 = v+(k*4)
                               #   (address of v[k])
            lw  $t0, 0($t1)    # $t0 (temp) = v[k]
            lw  $t2, 4($t1)    # $t2 = v[k+1]
            sw  $t2, 0($t1)    # v[k] = $t2 (v[k+1])
            sw  $t0, 4($t1)    # v[k+1] = $t0 (temp)
            jr  $ra            # return to calling routine
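The address arithmetic in the first two instructions can be spelled out in C so that each statement lines up with one MIPS instruction. This is a sketch under the assumption that array elements are 4-byte words (`int32_t`); the name `swap_by_offset` and the local variable names are hypothetical.

```c
#include <stdint.h>

/* Swaps v[k] and v[k+1] using explicit byte offsets, mirroring the
   MIPS swap procedure one instruction at a time. */
void swap_by_offset(int32_t v[], int32_t k) {
    uint32_t byte_offset = (uint32_t)k << 2;           /* sll $t1, $a1, 2   */
    int32_t *p = (int32_t *)((char *)v + byte_offset); /* add $t1, $a0, $t1 */

    int32_t temp = p[0];  /* lw $t0, 0($t1) : temp = v[k]   */
    int32_t next = p[1];  /* lw $t2, 4($t1) : next = v[k+1] */
    p[0] = next;          /* sw $t2, 0($t1) : v[k]   = next */
    p[1] = temp;          /* sw $t0, 4($t1) : v[k+1] = temp */
}
```

The shift by 2 is why the assembly multiplies k by 4: memory is byte-addressed, while the C index counts whole words.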

The Sort Procedure in C
- Non-leaf (calls swap)
      void sort (int v[], int n)
      {
        int i, j;
        for (i = 0; i < n; i += 1) {
          for (j = i - 1;
               j >= 0 && v[j] > v[j + 1];
               j -= 1) {
            swap(v, j);
          }
        }
      }
  - v in $a0, n in $a1, i in $s0, j in $s1

The Procedure Body
      # Move params
               move $s2, $a0           # save $a0 into $s2
               move $s3, $a1           # save $a1 into $s3
      # Outer loop
               move $s0, $zero         # i = 0
      for1tst: slt  $t0, $s0, $s3      # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
               beq  $t0, $zero, exit1  # go to exit1 if $s0 ≥ $s3 (i ≥ n)
      # Inner loop
               addi $s1, $s0, -1       # j = i - 1
      for2tst: slti $t0, $s1, 0        # $t0 = 1 if $s1 < 0 (j < 0)
               bne  $t0, $zero, exit2  # go to exit2 if $s1 < 0 (j < 0)
               sll  $t1, $s1, 2        # $t1 = j * 4
               add  $t2, $s2, $t1      # $t2 = v + (j * 4)
               lw   $t3, 0($t2)        # $t3 = v[j]
               lw   $t4, 4($t2)        # $t4 = v[j + 1]
               slt  $t0, $t4, $t3      # $t0 = 0 if $t4 ≥ $t3
               beq  $t0, $zero, exit2  # go to exit2 if $t4 ≥ $t3
      # Pass params & call
               move $a0, $s2           # 1st param of swap is v (old $a0)
               move $a1, $s1           # 2nd param of swap is j
               jal  swap               # call swap procedure
      # Inner loop
               addi $s1, $s1, -1       # j -= 1
               j    for2tst            # jump to test of inner loop
      # Outer loop
      exit2:   addi $s0, $s0, 1        # i += 1
               j    for1tst            # jump to test of outer loop

The Full Procedure
      sort:    addi $sp, $sp, -20      # make room on stack for 5 registers
               sw   $ra, 16($sp)       # save $ra on stack
               sw   $s3, 12($sp)       # save $s3 on stack
               sw   $s2, 8($sp)        # save $s2 on stack
               sw   $s1, 4($sp)        # save $s1 on stack
               sw   $s0, 0($sp)        # save $s0 on stack
               …                       # procedure body
               …
      exit1:   lw   $s0, 0($sp)        # restore $s0 from stack
               lw   $s1, 4($sp)        # restore $s1 from stack
               lw   $s2, 8($sp)        # restore $s2 from stack
               lw   $s3, 12($sp)       # restore $s3 from stack
               lw   $ra, 16($sp)       # restore $ra from stack
               addi $sp, $sp, 20       # restore stack pointer
               jr   $ra                # return to calling routine
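For checking a hand-assembled version against known-good behavior, it can help to have the C routines in a form that compiles on its own. The sketch below repeats swap and sort from the slides and adds a small checker; `is_sorted` is a hypothetical helper that is not part of the original example.

```c
/* Exchanges v[k] and v[k+1] (the leaf procedure from the slides). */
static void swap(int v[], int k) {
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

/* Sorts the first n elements of v into ascending order. */
void sort(int v[], int n) {
    int i, j;
    for (i = 0; i < n; i += 1) {
        /* Move v[i] left until the element before it is not larger. */
        for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
            swap(v, j);
        }
    }
}

/* Hypothetical helper: returns 1 if v[0..n-1] is in ascending order. */
int is_sorted(const int v[], int n) {
    for (int i = 1; i < n; i += 1) {
        if (v[i - 1] > v[i]) {
            return 0;
        }
    }
    return 1;
}
```

Running the same input through this code and through the MIPS routine in a simulator gives a direct way to compare the final array contents.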

Synchronization
- Two processors sharing an area of memory
  - P1 writes, then P2 reads
  - Data race if P1 and P2 don't synchronize
    - Result depends on order of accesses
- Hardware support required
  - Atomic read/write memory operation
  - No other access to the location allowed between the read and write
- Could be a single instruction
  - E.g., atomic swap of register ↔ memory
  - Or an atomic pair of instructions

Synchronization in MIPS
- Load linked: ll rt, offset(rs)
- Store conditional: sc rt, offset(rs)
  - Succeeds if location not changed since the ll
    - Returns 1 in rt
  - Fails if location is changed
    - Returns 0 in rt
- Example: atomic swap (to test/set lock variable)
      try: add $t0, $zero, $s4   # copy exchange value
           ll  $t1, 0($s1)       # load linked
           sc  $t0, 0($s1)       # store conditional
           beq $t0, $zero, try   # branch if store fails
           add $s4, $zero, $t1   # put load value in $s4
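In C, the same test-and-set idea is usually written with the C11 atomics library rather than with ll/sc directly. The sketch below uses `atomic_exchange`, which is a single atomic swap; on MIPS a compiler would typically implement it with an ll/sc retry loop like the one above, but that mapping is an assumption about the toolchain, not something the C code guarantees. The function names `atomic_swap`, `try_acquire`, and `release` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically stores new_value in *lock and returns the value that was
   there before, with no other access allowed in between. */
int atomic_swap(atomic_int *lock, int new_value) {
    return atomic_exchange(lock, new_value);
}

/* Lock convention assumed here: 0 = free, 1 = held.
   Returns true if this caller obtained the lock. */
bool try_acquire(atomic_int *lock) {
    /* Swapping in 1 and getting back 0 means the lock was free
       and is now held by this caller. */
    return atomic_swap(lock, 1) == 0;
}

/* Marks the lock as free again. */
void release(atomic_int *lock) {
    atomic_store(lock, 0);
}
```

A caller that needs to wait for the lock would loop on `try_acquire` until it returns true, which plays the same role as the `beq ... try` branch in the MIPS sequence.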

Final notes
- Next time:
  - More on MIPS ISA
  - MIPS instruction set
- Announcements/reminders:
  - Sign up for the course discussion group on Piazza!
  - HW 1 to be posted; due 1/30