The RISC-V Processor
Hakim Weatherspoon
CS 3410, Computer Science, Cornell University
[Weatherspoon, Bala, Bracy, and Sirer]

Announcements
Check online syllabus/schedule
• http://www.cs.cornell.edu/Courses/CS3410/2019sp/schedule
• Slides and Reading for lectures
• Office Hours
• Pictures of all TAs
• Dates to keep in mind
  - Prelims: Tue Mar 5th and Thur May 2nd
  - Proj 1: Due next Friday, Feb 15th
  - Proj 3: Due before Spring break
  - Final Project: Due when final will be Feb 16th
Schedule is subject to change

Collaboration, Late, Re-grading Policies
“White Board” Collaboration Policy
• Can discuss approach together on a “white board”
• Leave, watch a movie such as Stranger Things, then write up solution independently
• Do not copy solutions
Late Policy
• Each person has a total of four “slip days”
• Max of two slip days for any individual assignment
• Slip days deducted first for any late assignment, cannot selectively apply slip days
• For projects, slip days are deducted from all partners
• 25% deducted per day late after slip days are exhausted
Regrade policy
• Submit written request within a week of receiving score

Big Picture: Building a Processor
[Figure: a single cycle processor. PC with +4 adder, instruction memory (inst), register file, control, immediate extend (imm, offset), ALU, compare (=?, cmp), data memory (addr, din, dout), and the branch target / new pc path.]

Goal for the next 2 lectures
• Understanding the basics of a processor
• We now have the technology to build a CPU!
• Putting it all together:
  - Arithmetic Logic Unit (ALU)
  - Register File
  - Memory
    - SRAM: cache
    - DRAM: main memory
  - RISC-V Instructions & how they are executed

RISC-V Register File
[Figure: the single cycle processor datapath from the Big Picture slide]

RISC-V Register File
• 32 registers, 32 bits each
• x0 wired to zero
• Write port indexed via RW
  - written on falling edge when WE=1
• Read ports indexed via RA, RB
• Numbered from 0 to 31
• Can be referred to by number: x0, x1, x2, … x31
• By convention, each register also has a name:
  - x10 to x17 are a0 to a7; x28 to x31 are t3 to t6
[Figure: 32 x 32 dual-read-port, single-write-port register file. Inputs: W (32-bit write data, labeled DW in one version), WE (1-bit write enable), RW, RA, RB (5-bit register selectors). Outputs: A and B (32 bits each, labeled QA and QB in one version).]
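
The register file behavior listed above can be sketched in a few lines of Python. This is a minimal behavioral model, not the course's circuit: the class name `RegisterFile` and its method names are made up for illustration, and clock edges are not modeled (a write simply takes effect when `write` is called with `we=True`).

```python
class RegisterFile:
    """Behavioral sketch: 32 registers of 32 bits each, x0 always reads as zero."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Two read ports, each indexed by a 5-bit selector (RA, RB).
        return self.regs[ra & 0x1F], self.regs[rb & 0x1F]

    def write(self, rw, value, we=True):
        # One write port, indexed by the 5-bit selector RW.
        # Writes are ignored when WE is 0 or when the target is x0.
        rw &= 0x1F
        if we and rw != 0:
            self.regs[rw] = value & 0xFFFFFFFF  # keep only 32 bits


rf = RegisterFile()
rf.write(10, 0x1_0000_0005)  # truncated to 32 bits before being stored
rf.write(0, 123)             # discarded: x0 is wired to zero
print(rf.read(10, 0))
```

Supporting 64 registers in this model would only widen the selectors (the `0x1F` masks would become `0x3F`); the 32-bit data width would stay the same.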

iClicker Question
If we wanted to support 64 registers, what would change?
a) W, A, B → 64
b) RW, RA, RB 5 → 6
c) W 32 → 64, RW 5 → 6
d) A & B only
[Figure: register file x0, x1, … x31 with W, A, B (32 bits each), WE (1 bit), and RW, RA, RB (5 bits each)]

RISC-V Memory
[Figure: the single cycle processor datapath from the Big Picture slide]

RISC-V Memory
• 32-bit address
• 32-bit data (but byte addressed)
• Enable + 2-bit memory control (mc)
  00: read word (4 byte aligned)
  01: write byte
  10: write halfword (2 byte aligned)
  11: write word (4 byte aligned)
[Figure: memory block with Din (32 bits), Dout (32 bits), addr (32 bits), mc (2 bits), and E (enable); beside it, a column of 1-byte cells at addresses 0x00000000 through 0x000fffff, one of them holding 0x05]

Putting it all together: Basic Processor
[Figure: the single cycle processor datapath from the Big Picture slide]

To make a computer
Need a program
• Stored program computer
• (a Universal Turing Machine)
Architectures
• von Neumann architecture
• Harvard (modified) architecture

Putting it all together: Basic Processor
A RISC-V CPU with a (modified) Harvard architecture
• Modified: instructions & data share a common address space, but separate instruction and data caches can be accessed in parallel
[Figure: CPU (Registers, ALU, Control) connected by data, address, and control lines to a Program Memory and a Data Memory, each holding binary words]

Takeaway
A processor executes instructions
• Processor has some internal state in storage elements (registers)
A memory holds instructions and data
• (modified) Harvard architecture: separate instructions and data
• von Neumann architecture: combined instructions and data
A bus connects the two
We now have enough building blocks to build machines that can perform non-trivial computational tasks

Next Goal
• How to program and execute instructions on a RISC-V processor?

Instruction Usage
Instructions are stored in memory, encoded in binary
A basic processor
• fetches
• decodes
• executes
one instruction at a time
[Figure: pc drives the memory address (addr); the returned data becomes the current instruction (cur inst), which is decoded, reads the registers (regs), and is executed, while an adder updates pc. An addi instruction with operands 10, x2, x0 is shown above its binary encoding.]

Instruction Processing
[Figure: Prog. Mem, PC with +4 adder, Reg. File, ALU, Data Mem, and control; the slide repeats the Instruction Usage text]

Levels of Interpretation: Instructions
High Level Language
• C, Java, Python, ADA, …
• Loops, control flow, variables
    for (i = 0; i < 10; i++)
      printf("go cucs");
Assembly Language
• No symbols (except labels)
• One operation per statement
• “human readable machine language”
    main: addi x2, x0, 10
          addi x1, x0, 0
    loop: slt  x3, x1, x2
          ...
Machine Language
• Binary-encoded assembly
• Labels become addresses
• The language of the CPU
    [Figure: an addi instruction with operands 10, x2, x0 shown above its binary encoding]
Instruction Set Architecture
Machine Implementation
• ALU, Control, Register File, …

Instruction Set Architecture (ISA)
Different CPU architectures specify different instructions
Two classes of ISAs
• Reduced Instruction Set Computers (RISC): IBM Power PC, Sun Sparc, MIPS, Alpha
• Complex Instruction Set Computers (CISC): Intel x86, PDP-11, VAX
Another ISA classification: Load/Store Architecture
• Data must be in registers to be operated on
  For example: array[x] = array[y] + array[z]
  1 add? OR 2 loads, an add, and a store?
• Keeps hardware simple; many RISC ISAs are load/store

iClicker Question
What does it mean for an architecture to be called a load/store architecture?
(A) Load and Store instructions are supported by the ISA.
(B) Load and Store instructions can also perform arithmetic instructions on data in memory.
(C) Data must first be loaded into a register before it can be operated on.
(D) Every load must have an accompanying store at some later point in the program.

Takeaway
A RISC-V processor and ISA (instruction set architecture) is an example of a Reduced Instruction Set Computer (RISC), where simplicity is key, thus enabling us to build it!

Next Goal
• How are instructions executed?
• What is the general datapath to execute an instruction?

Five Stages of RISC-V Datapath
[Figure: datapath split into five stages (Fetch, Decode, Execute, Memory, WB): Prog. Mem, PC with +4 adder, Reg. File, ALU, Data Mem, and control]
A single cycle processor – this diagram is not 100%

Five Stages of RISC-V Datapath
Basic CPU execution loop
1. Instruction Fetch
2. Instruction Decode
3. Execution (ALU)
4. Memory Access
5. Register Writeback

Stage 1: Instruction Fetch
• Fetch 32-bit instruction from memory
• Increment PC = PC + 4
[Figure: five-stage datapath (Fetch, Decode, Execute, Memory, WB)]

Stage 2: Instruction Decode
• Gather data from the instruction
• Read opcode; determine instruction type, field lengths
• Read in data from register file
[Figure: five-stage datapath (Fetch, Decode, Execute, Memory, WB)]

Stage 3: Execution (ALU)
• Useful work done here (+, -, *, /), shift, logic operation, comparison (slt)
• Load/Store? For lw x2, 32(x3): compute address
[Figure: five-stage datapath (Fetch, Decode, Execute, Memory, WB)]

Stage 4: Memory Access
• Used by load and store instructions only
• Other instructions will skip this stage
[Figure: five-stage datapath; Data Mem shown with addr, Data, and R/W signals]

Stage 5: Writeback
• Write to register file
  - For arithmetic ops, logic, shift, etc, load. What about stores?
• Update PC
  - For branches, jumps
[Figure: five-stage datapath (Fetch, Decode, Execute, Memory, WB)]

iClicker Question
Which of the following statements is true?
(A) All instructions require an access to Program Memory.
(B) All instructions require an access to Data Memory.
(C) All instructions write to the register file.
(D) Some RISC-V instructions are shorter than 32 bits
(E) A & C

Takeaway
• The datapath for a RISC-V processor has five stages:
  1. Instruction Fetch
  2. Instruction Decode
  3. Execution (ALU)
  4. Memory Access
  5. Register Writeback
• This five stage datapath is used to execute all RISC-V instructions
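
The five stages can be sketched as a small Python loop. This is a toy model under stated assumptions: it handles only ADDI and ADD, the function name `run` is made up, and the program is a plain list of 32-bit words rather than a real program memory. The three instruction words were hand-assembled for this sketch and are not taken from the slides.

```python
def run(program):
    """Toy single-cycle loop: each iteration walks one instruction through
    fetch, decode, execute, memory (skipped here), and writeback."""
    regs = [0] * 32
    pc = 0
    while pc < 4 * len(program):
        # 1. Instruction Fetch: read the 32-bit word at PC, then PC = PC + 4
        inst = program[pc // 4]
        pc += 4

        # 2. Instruction Decode: split fields and read the register file
        opcode = inst & 0x7F
        rd = (inst >> 7) & 0x1F
        funct3 = (inst >> 12) & 0x7
        rs1 = (inst >> 15) & 0x1F
        rs2 = (inst >> 20) & 0x1F
        imm = inst >> 20                 # 12-bit I-type immediate
        if imm & 0x800:
            imm -= 0x1000                # sign-extend
        a, b = regs[rs1], regs[rs2]

        # 3. Execution (ALU)
        if opcode == 0b0010011 and funct3 == 0:                     # ADDI
            result = a + imm
        elif opcode == 0b0110011 and funct3 == 0 and inst >> 25 == 0:  # ADD
            result = a + b
        else:
            raise ValueError(f"unsupported instruction {inst:#010x}")

        # 4. Memory Access: skipped, neither instruction is a load or store

        # 5. Register Writeback (x0 stays zero)
        if rd != 0:
            regs[rd] = result & 0xFFFFFFFF
    return regs


program = [
    0x00A00113,  # addi x2, x0, 10
    0x00500093,  # addi x1, x0, 5
    0x002081B3,  # add  x3, x1, x2
]
print(run(program)[3])
```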

Next Goal
• Specific datapaths for RISC-V instructions

RISC-V Design Principles
Simplicity favors regularity
• 32-bit instructions
Smaller is faster
• Small register file
Make the common case fast
• Include support for constants
Good design demands good compromises
• Support for different types of interpretations/classes

Instruction Types
• Arithmetic
  - add, subtract, shift left, shift right, multiply, divide
• Memory
  - load value from memory to a register
  - store value to memory from a register
• Control flow
  - conditional jumps (branches)
  - jump and link (subroutine call)
• Many other instructions are possible
  - vector add/sub/mul/div, string operations
  - manipulate coprocessor
  - I/O

RISC-V Instruction Types
• Arithmetic/Logical
  - R-type: result and two source registers, shift amount
  - I-type: result and source register, shift amount in 12-bit immediate with sign/zero extension
  - U-type: result register, 20-bit immediate with sign/zero extension
• Memory Access
  - I-type for loads and S-type for stores
  - load/store between registers and memory
  - word, half-word and byte operations
• Control flow
  - UJ-type: jump-and-link
  - I-type: jump-and-link register
  - SB-type: conditional branches: pc-relative addresses

RISC-V instruction formats
All RISC-V instructions are 32 bits long and have 4 formats

• R-type
  bits:   31-25    24-20    19-15    14-12    11-7     6-0
  field:  funct7   rs2      rs1      funct3   rd       op
  width:  7 bits   5 bits   5 bits   3 bits   5 bits   7 bits

• I-type
  bits:   31-20             19-15    14-12    11-7     6-0
  field:  imm               rs1      funct3   rd       op
  width:  12 bits           5 bits   3 bits   5 bits   7 bits

• S-type (SB-type)
  bits:   31-25    24-20    19-15    14-12    11-7     6-0
  field:  imm      rs2      rs1      funct3   imm      op
  width:  7 bits   5 bits   5 bits   3 bits   5 bits   7 bits

• U-type (UJ-type)
  bits:   31-12                               11-7     6-0
  field:  imm                                 rd       op
  width:  20 bits                             5 bits   7 bits
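
Because every format keeps op, rd, funct3, rs1, and rs2 in the same bit positions, one set of shifts and masks decodes all of them. The Python sketch below illustrates this; the function names (`fields`, `imm_i`, `imm_s`, `imm_u`) are made up for this example, and the sample words were hand-assembled rather than taken from the slides.

```python
def fields(inst):
    """Extract the fixed-position fields shared by the formats above."""
    return {
        "op": inst & 0x7F,               # bits 6-0
        "rd": (inst >> 7) & 0x1F,        # bits 11-7
        "funct3": (inst >> 12) & 0x7,    # bits 14-12
        "rs1": (inst >> 15) & 0x1F,      # bits 19-15
        "rs2": (inst >> 20) & 0x1F,      # bits 24-20
        "funct7": (inst >> 25) & 0x7F,   # bits 31-25
    }


def _sign_extend12(imm):
    # Interpret a 12-bit field as a two's complement number.
    return imm - 0x1000 if imm & 0x800 else imm


def imm_i(inst):
    """I-type: the 12-bit immediate sits in bits 31-20."""
    return _sign_extend12((inst >> 20) & 0xFFF)


def imm_s(inst):
    """S-type: upper 7 bits in 31-25, lower 5 bits in 11-7."""
    return _sign_extend12(((inst >> 25) & 0x7F) << 5 | ((inst >> 7) & 0x1F))


def imm_u(inst):
    """U-type: the 20-bit immediate already sits in bits 31-12."""
    return inst & 0xFFFFF000


addi = 0x00A00113  # addi x2, x0, 10
sw = 0x0812A023    # sw x1, 128(x5)
lui = 0x000052B7   # lui x5, 5
print(fields(addi), imm_i(addi), imm_s(sw), hex(imm_u(lui)))
```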

R-Type (1): Arithmetic and Logic
Encoding of XOR x4, x8, x6:
  0000000  00110   01000   100      00100   0110011
  funct7   rs2     rs1     funct3   rd      op
  31-25    24-20   19-15   14-12    11-7    6-0
  7 bits   5 bits  5 bits  3 bits   5 bits  7 bits

  op       funct3  mnemonic           description
  0110011  000     ADD rd, rs1, rs2   R[rd] = R[rs1] + R[rs2]
  0110011  000     SUB rd, rs1, rs2   R[rd] = R[rs1] - R[rs2]
  0110011  110     OR  rd, rs1, rs2   R[rd] = R[rs1] | R[rs2]
  0110011  100     XOR rd, rs1, rs2   R[rd] = R[rs1] ^ R[rs2]

ADD and SUB share op and funct3; they are distinguished by funct7.
Example: x4 = x8 ^ x6   # XOR x4, x8, x6 (rd, rs1, rs2)
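
As a check on the R-type field layout, the encoding of the XOR example can be assembled by hand in Python. This is a sketch: `encode_r` is a hypothetical helper, not part of any assembler.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, op=0b0110011):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | op


# XOR x4, x8, x6: rd = x4, rs1 = x8, rs2 = x6, funct3 = 100, funct7 = 0000000
xor_word = encode_r(0b0000000, 6, 8, 0b100, 4)
print(f"{xor_word:032b}")  # 00000000011001000100001000110011
print(hex(xor_word))       # 0x644233
```

Swapping `funct3` to `0b110` gives OR with the same registers; in RV32I, SUB uses `funct3 = 0b000` together with `funct7 = 0b0100000`.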

Arithmetic and Logic
[Figure: XOR x4, x8, x6 on the five-stage datapath (Fetch, Decode, Execute, Memory, WB). The register file reads x8 and x6, the ALU combines them, the Memory stage is skipped, and the result is written back to x4.]
Example: x4 = x8 ^ x6   # XOR x4, x8, x6 (rd, rs1, rs2)

R-Type (2): Shift Instructions
Encoding of SLL x8, x4, x6:
  0000000  00110   00100   001      01000   0110011
  funct7   rs2     rs1     funct3   rd      op
  31-25    24-20   19-15   14-12    11-7    6-0
  7 bits   5 bits  5 bits  3 bits   5 bits  7 bits

  op       funct3  mnemonic           description
  0110011  001     SLL rd, rs1, rs2   R[rd] = R[rs1] << R[rs2]
  0110011  101     SRL rd, rs1, rs2   R[rd] = R[rs1] >>> R[rs2] (zero ext.)
  0110011  101     SRA rd, rs1, rs2   R[rd] = R[rs1] >>> R[rs2] (sign ext.)

Example: x8 = x4 * 2^x6   # SLL x8, x4, x6
         x8 = x4 << x6
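
The difference between the two right shifts shows up only when the top bit is set. The Python sketch below models 32-bit registers as unsigned integers; the function names are illustrative, and the use of only the low 5 bits of the shift amount follows the RV32I definition rather than anything stated on the slide.

```python
MASK = 0xFFFFFFFF


def sll(a, b):
    """Shift left logical: zeros enter from the right."""
    return (a << (b & 31)) & MASK


def srl(a, b):
    """Shift right logical: zeros enter from the left (zero extension)."""
    return (a & MASK) >> (b & 31)


def sra(a, b):
    """Shift right arithmetic: copies of the sign bit enter from the left."""
    a &= MASK
    signed = a - (1 << 32) if a & 0x80000000 else a
    return (signed >> (b & 31)) & MASK


print(hex(sll(3, 4)))           # 0x30, i.e. 3 * 2**4
print(hex(srl(0x80000000, 4)))  # 0x8000000
print(hex(sra(0x80000000, 4)))  # 0xf8000000
```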

Shift
[Figure: SLL x8, x4, x6 on the five-stage datapath (Fetch, Decode, Execute, Memory, WB). The register file reads x4 and x6, the ALU computes x4 << x6, the Memory stage is skipped, and the result is written back to x8.]
Example: x8 = x4 * 2^x6   # SLL x8, x4, x6
         x8 = x4 << x6

I-Type (1): Arithmetic w/ immediates
Encoding of ADDI x5, x5, 5:
  000000000101   00101   000      00101   0010011
  imm            rs1     funct3   rd      op
  31-20          19-15   14-12    11-7    6-0
  12 bits        5 bits  3 bits   5 bits  7 bits

  op       funct3  mnemonic            description
  0010011  000     ADDI rd, rs1, imm   R[rd] = R[rs1] + sign_extend(imm)
  0010011  111     ANDI rd, rs1, imm   R[rd] = R[rs1] & sign_extend(imm)
  0010011  110     ORI  rd, rs1, imm   R[rd] = R[rs1] | sign_extend(imm)

Example: x5 = x5 + 5 (x5 += 5)   # ADDI x5, x5, 5

Arithmetic w/ immediates
[Figure: ADDI x5, x5, 5 on the five-stage datapath (Fetch, Decode, Execute, Memory, WB). The register file reads x5, the 12-bit immediate is extended to 32 bits and fed to the ALU, the Memory stage is skipped, and the result is written back to x5.]
Example: x5 = x5 + 5   # ADDI x5, x5, 5

iClicker Question
To compile the code y = z + 1, assuming y is stored in x1 and z is stored in x2, you can use the ADDI instruction. What is the largest number for which we can continue to use ADDI?
(a) 12
(b) 2^12 - 1 = 4,095
(c) 2^(12-1) - 1 = 2,047
(d) 2^16 - 1 = 65,535
(e) 2^32 - 1 = ~4.3 billion
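
The range of an ADDI constant follows from the 12-bit, sign-extended immediate field. A short Python sketch makes the limits concrete; `sign_extend` and `fits_addi` are hypothetical helper names.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def fits_addi(constant):
    """True when constant fits in ADDI's 12-bit signed immediate."""
    return -(1 << 11) <= constant <= (1 << 11) - 1


print(sign_extend(0x7FF, 12))  # 2047, the largest positive immediate
print(sign_extend(0x800, 12))  # -2048, the most negative immediate
print(fits_addi(2047), fits_addi(2048))
```

One of the twelve bits carries the sign, so the positive range ends at 2^11 - 1 rather than 2^12 - 1.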

U-Type (1): Load Upper Immediate (“WORST NAME EVER!”)
Encoding of LUI x5, 5:
  00000000000000000101   00101   0110111
  imm                    rd      op
  31-12                  11-7    6-0
  20 bits                5 bits  7 bits

  op       mnemonic      description
  0110111  LUI rd, imm   R[rd] = sign_ext(imm) << 12

Example: x5 = 0x5000   # LUI x5, 5
Example: LUI  x5, 0xbeef1
         ADDI x5, x5, 0x234
What does x5 hold? x5 = 0xbeef1234
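
The LUI plus ADDI pair builds a full 32-bit constant from a 20-bit upper part and a 12-bit lower part. The Python sketch below models both instructions on 32-bit registers; `lui`, `addi`, and `split_constant` are illustrative names. One detail not shown on the slide, and included here as a known property of the ISA: ADDI sign-extends its immediate, so when bit 11 of the constant is set the upper part must be incremented by one to compensate.

```python
MASK = 0xFFFFFFFF


def lui(imm20):
    """LUI: place the 20-bit immediate in bits 31-12, low 12 bits are zero."""
    return (imm20 << 12) & MASK


def addi(src, imm12):
    """ADDI: add a sign-extended 12-bit immediate."""
    imm12 &= 0xFFF
    if imm12 & 0x800:
        imm12 -= 0x1000
    return (src + imm12) & MASK


def split_constant(value):
    """Split a 32-bit constant into (hi20, lo12) for a LUI + ADDI pair."""
    lo = value & 0xFFF
    hi = (value >> 12) & 0xFFFFF
    if lo & 0x800:
        hi = (hi + 1) & 0xFFFFF  # cancel the borrow caused by sign extension
    return hi, lo


print(hex(addi(lui(0xBEEF1), 0x234)))   # 0xbeef1234
hi, lo = split_constant(0xDEADBEEF)
print(hex(hi), hex(lo), hex(addi(lui(hi), lo)))
```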

Load Upper Immediate
[Figure: LUI x5, 5 on the five-stage datapath (Fetch, Decode, Execute, Memory, WB). The 20-bit immediate is shifted left by 12 and extended to 32 bits, giving 0x5000; the Memory stage is skipped and the result is written back to x5.]
Example: x5 = 0x5000   # LUI x5, 5

I-Type (2): Load Instructions
Encoding of LW x1, 4(x5):
  000000000100   00101   010      00001   0000011
  imm            rs1     funct3   rd      op
  31-20          19-15   14-12    11-7    6-0
  12 bits        5 bits  3 bits   5 bits  7 bits

base + offset addressing, signed offsets

  op       funct3  mnemonic           description
  0000011  000     LB  rd, rs1, imm   R[rd] = sign_ext(Mem[imm+R[rs1]])
  0000011  001     LH  rd, rs1, imm   R[rd] = sign_ext(Mem[imm+R[rs1]])
  0000011  010     LW  rd, rs1, imm   R[rd] = Mem[imm+R[rs1]]
  0000011  011     LD  rd, rs1, imm   R[rd] = Mem[imm+R[rs1]]
  0000011  100     LBU rd, rs1, imm   R[rd] = zero_ext(Mem[imm+R[rs1]])
  0000011  101     LHU rd, rs1, imm   R[rd] = zero_ext(Mem[imm+R[rs1]])
  0000011  110     LWU rd, rs1, imm   R[rd] = Mem[imm+R[rs1]]

Example: x1 = Mem[4+x5]   # LW x1, x5, 4, written in assembly as LW x1, 4(x5)
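
The sign-extending and zero-extending loads differ only in how the upper bits of the destination register are filled. A Python sketch, assuming a little-endian byte-addressed memory held in a `bytearray` and a 32-bit register (the helper name `load` and the sample bytes are made up):

```python
MASK = 0xFFFFFFFF


def load(mem, addr, size, signed):
    """Read `size` bytes at addr (little-endian) and widen to a 32-bit value."""
    raw = bytes(mem[addr:addr + size])
    return int.from_bytes(raw, "little", signed=signed) & MASK


mem = bytearray(16)
mem[4:8] = bytes([0x80, 0xFF, 0x00, 0x00])

print(hex(load(mem, 4, 1, signed=True)))   # LB  -> 0xffffff80
print(hex(load(mem, 4, 1, signed=False)))  # LBU -> 0x80
print(hex(load(mem, 4, 2, signed=True)))   # LH  -> 0xffffff80
print(hex(load(mem, 4, 2, signed=False)))  # LHU -> 0xff80
print(hex(load(mem, 4, 4, signed=True)))   # LW  -> 0xff80
```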

Memory Operations: Load
[Figure: LW x1, 4(x5) on the five-stage datapath (Fetch, Decode, Execute, Memory, WB). The 12-bit immediate is extended to 32 bits, the ALU computes the address 4+x5, Data Mem returns Mem[4+x5] (Write Enable off), and the value is written back to x1.]
Example: x1 = Mem[4+x5]   # LW x1, x5, 4, written in assembly as LW x1, 4(x5)

S-Type (1): Store Instructions
Encoding of SW x1, 128(x5):
  0000100   00001   00101   010      00000   0100011
  imm       rs2     rs1     funct3   imm     op
  31-25     24-20   19-15   14-12    11-7    6-0
  7 bits    5 bits  5 bits  3 bits   5 bits  7 bits

base + offset addressing, signed offsets

  op       funct3  mnemonic           description
  0100011  000     SB rs2, rs1, imm   Mem[sign_ext(imm)+R[rs1]] = R[rs2]
  0100011  001     SH rs2, rs1, imm   Mem[sign_ext(imm)+R[rs1]] = R[rs2]
  0100011  010     SW rs2, rs1, imm   Mem[sign_ext(imm)+R[rs1]] = R[rs2]

Example: Mem[128+x5] = x1   # SW x1, x5, 128, written in assembly as SW x1, 128(x5)

Memory Operations: Store
[Figure: SW x1, 128(x5) on the five-stage datapath (Fetch, Decode, Execute, Memory, WB). The 12-bit immediate is extended to 32 bits, the ALU computes the address 128+x5, and Data Mem writes x1 at that address (Write Enable on).]
Example: Mem[128+x5] = x1   # SW x1, x5, 128, written in assembly as SW x1, 128(x5)

Memory Layout Options
# x5 contains 5 (0x00000005)
• SB x5, 0(x0)
• SB x5, 2(x0)
• SW x5, 8(x0)
Two ways to store a word in memory.
Endianness: ordering of bytes within a memory word
[Figure: column of byte cells at addresses 0x00000000 through 0x000fffff]

Endianness: Ordering of bytes within a memory word
Little Endian = least significant part first (RISC-V, x86)
THIS IS WHAT YOUR PROJECTS WILL BE
[Figure: the word 0x12345678 stored at addresses 1000 to 1003, drawn as 4 bytes, as 2 halfwords, and as 1 word]
Clicker Question: What values go in the byte-sized boxes with addresses 1000 and 1001?
a) 0x8, 0x7
b) 0x78, 0x56
c) 0x87, 0x65
d) 0x12, 0x34
e) 0x1, 0x2

Endianness: Ordering of bytes within a memory word
Big Endian = most significant part first (MIPS, networks)
[Figure: the word 0x12345678 stored at addresses 1000 to 1003, drawn as 4 bytes, as 2 halfwords, and as 1 word]
Clicker Question: What value goes in the half-word sized box with address 1000?
a) 0x1
b) 0x12
c) 0x1234
d) 0x4321
e) 0x5678

Little Endian = least significant part first (RISC-V, x86)
WHAT WE USE IN 3410
Example: x5 contains 5 (0x00000005)
  SW x5, 8(x0)
Clicker Question: After executing the store, which byte address contains the byte 0x05?
a) 0x00000008
b) 0x00000009
c) 0x0000000a
d) 0x0000000b
e) I don’t know
[Figure: column of byte cells at addresses 0x00000000 through 0x000fffff]

Big Endian = most significant part first (some MIPS, networks)
Example: x5 contains 5 (0x00000005)
  SW x5, 8(x0)
Clicker Question: After executing the store, which byte address contains the byte 0x05?
a) 0x00000008
b) 0x00000009
c) 0x0000000a
d) 0x0000000b
e) I don’t know
[Figure: column of byte cells at addresses 0x00000000 through 0x000fffff]

Big Endian Memory Layout
• SB x5, 2(x0)
• LB x6, 2(x0)
• SW x5, 8(x0)
• LB x7, 8(x0)
• LB x8, 11(x0)
Registers after the sequence:
  x5 = 0x00000005
  x6 = 0x00000005
  x7 = 0x00000000
  x8 = 0x00000005
[Figure: column of byte cells at addresses 0x00000000 through 0x000fffff, with 0x05 stored at 0x00000002 and at 0x0000000b]
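
The byte placement in both conventions can be reproduced with Python's `int.to_bytes`. This sketch replays the instruction sequence above on a 16-byte memory; the helper names `store_word` and `load_byte` are made up for illustration.

```python
def store_word(mem, addr, value, byteorder):
    """SW: write a 32-bit value as 4 bytes in the given byte order."""
    mem[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, byteorder)


def load_byte(mem, addr):
    """LB: read one byte and sign-extend it to 32 bits."""
    return int.from_bytes(mem[addr:addr + 1], "big", signed=True) & 0xFFFFFFFF


x5 = 0x00000005

# Big endian: the most significant byte goes to the lowest address.
big = bytearray(16)
big[2] = x5 & 0xFF             # SB x5, 2(x0)
x6 = load_byte(big, 2)         # LB x6, 2(x0)
store_word(big, 8, x5, "big")  # SW x5, 8(x0)
x7 = load_byte(big, 8)         # LB x7, 8(x0)
x8 = load_byte(big, 11)        # LB x8, 11(x0)
print(hex(x6), hex(x7), hex(x8))

# Little endian: the least significant byte goes to the lowest address.
little = bytearray(16)
store_word(little, 8, x5, "little")
print(little[8], little[11])

# The word 0x12345678 in little-endian order, lowest address first.
print((0x12345678).to_bytes(4, "little").hex())
```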

UJ-Type (2): Jump and Link
Encoding of JAL x5, 16 (the immediate bits are stored in the order imm[20|10:1|11|19:12]):
  00000001000000000000   00101   1101111
  imm                    rd      op
  31-12                  11-7    6-0
  20 bits                5 bits  7 bits

  op       mnemonic      description
  1101111  JAL rd, imm   R[rd] = PC+4; PC = PC + sext(imm)

Example: x5 = PC+4   # JAL x5, 16
         PC = PC + 16 (i.e. 16 == 8<<1)
Why? Function/procedure calls

Jump and Link
[Figure: JAL x5, 16 on the datapath (Prog. Mem, PC with +4 adder, Reg. File, ALU, Data Mem with addr and Write Enable, control). The immediate is extended to 32 bits and a separate adder computes PC + imm.]
Could have used ALU for JAL add
Example: x5 = PC+4   # JAL x5, 16
         PC = PC + 16 (i.e. 16 == 8<<1)

I-Type (3): Jump and Link Register
0000000100000001011100111
Fields: imm [31:20] (12 bits) | rs1 [19:15] (5 bits) | funct3 [14:12] (3 bits) | rd [11:7] (5 bits) | op [6:0] (7 bits)
op  funct3  mnemonic  description
1100111  000  JALR rd, rs1, imm  R[rd] = PC+4; PC = (R[rs1] + sign_ex(imm)) & 0xfffffffe
Example: x5 = PC+4    # JALR x5, x4, 16
PC = x4 + 16
Why? Function/procedure calls
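
The JALR description above can be checked with a short sketch. The helper names `sign_extend` and `execute_jalr` are hypothetical and chosen only for illustration. The code follows the slide's formula, including the mask that clears the lowest bit of the target.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def execute_jalr(pc, regs, rd, rs1, imm12):
    """JALR rd, rs1, imm: link PC+4 into rd, jump to (R[rs1] + sext(imm)) & ~1."""
    # Compute the target first so that rd == rs1 still uses the old R[rs1].
    target = (regs[rs1] + sign_extend(imm12, 12)) & 0xFFFFFFFE
    if rd != 0:  # x0 is hard-wired to zero
        regs[rd] = (pc + 4) & 0xFFFFFFFF
    return target


regs = [0] * 32
regs[4] = 0x2000
new_pc = execute_jalr(0x1000, regs, 5, 4, 16)  # JALR x5, x4, 16
```

Because the base comes from a register, JALR can reach any 32-bit address, which is what makes it usable for function returns and indirect calls.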

Jump and Link Register
[Datapath diagram: PC, +4, Prog. Mem, Reg. File, ALU, Data Mem (addr, Write Enable), control, imm extend, adder (+)]
Example: x5 = PC+4    # JALR x5, x4, 16
PC = x4 + 16

Jump and Link Register
JALR x5, x4, 16
[Datapath diagram: PC, +4, Prog. Mem, Reg. File, ALU, Data Mem (addr, Write Enable), control, imm extend, adder (+) computing x4 + 16]
Example: x5 = PC+4    # JALR x5, x4, 16
PC = x4 + 16

Moving Beyond Jumps
• Can use the Jump and Link (JAL) or Jump and Link Register (JALR) instruction to jump to 0xabcd1234
What about a jump based on a condition?
• # assume 0 <= x3 <= 1
• if (x3 == 0) jump to 0xdecafe00 else jump to 0xabcd1234

SB-Type (2): Branches
00000100001010000010011
Fields: imm [31:25] (7 bits) | rs2 [24:20] (5 bits) | rs1 [19:15] (5 bits) | funct3 [14:12] (3 bits) | imm [11:7] (5 bits) | op [6:0] (7 bits); the immediate is signed
op  funct3  mnemonic  description
1100011  000  BEQ rs1, rs2, imm  PC = (R[rs1] == R[rs2] ? PC + (sext(imm)<<1) : PC+4)
1100011  001  BNE rs1, rs2, imm  PC = (R[rs1] != R[rs2] ? PC + (sext(imm)<<1) : PC+4)
Example: BEQ x5, x1, 128
if (R[x5] == R[x1]) PC = PC + 128 (i.e. 128 == 64<<1)
A word about all these +'s…
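
The branch format and the BEQ rule can be sketched as follows. This is an illustrative sketch with hypothetical function names (`encode_branch`, `next_pc_beq`). The slide shows the immediate as two fields, and the exact bit placement used here, imm[12|10:5] in the upper field and imm[4:1|11] in the lower field, is taken from the RISC-V base ISA specification.

```python
def encode_branch(funct3, rs1, rs2, offset):
    """Encode an SB-type branch; offset is an even byte offset from PC."""
    assert offset % 2 == 0 and -(1 << 12) <= offset < (1 << 12)
    imm = offset & 0x1FFF  # 13-bit two's complement; bit 0 is always 0
    return (
        ((imm >> 12) & 0x1) << 31     # imm[12]
        | ((imm >> 5) & 0x3F) << 25   # imm[10:5]
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | (funct3 & 0x7) << 12
        | ((imm >> 1) & 0xF) << 8     # imm[4:1]
        | ((imm >> 11) & 0x1) << 7    # imm[11]
        | 0b1100011                   # branch opcode
    )


def next_pc_beq(pc, regs, rs1, rs2, offset):
    """BEQ: take the PC-relative branch when the two registers are equal."""
    taken = regs[rs1] == regs[rs2]
    return (pc + offset if taken else pc + 4) & 0xFFFFFFFF


regs = [0] * 32
regs[5] = regs[1] = 7
taken_pc = next_pc_beq(0x1000, regs, 5, 1, 128)  # BEQ x5, x1, 128
regs[1] = 8
fallthrough_pc = next_pc_beq(0x1000, regs, 5, 1, 128)
```

Dropping bit 0 of the offset from the encoding is what lets a 12-bit field cover a 13-bit byte range, matching the slide's note that 128 == 64<<1.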

Control Flow: Branches
[Datapath diagram: PC, +4, Prog. Mem, Reg. File, ALU, Data Mem (addr, Write Enable), control, imm extend, adder (+)]
Example: BEQ x5, x1, 128

Control Flow: Branches
BEQ x5, x1, 128
[Datapath diagram: PC, +4, Prog. Mem, Reg. File, ALU, Data Mem (addr, Write Enable), control (BEQ), =? comparator, imm extend, adder (+) computing PC + (64<<1)]
Could have used ALU for branch add
Could have used ALU for branch cmp
Example: BEQ x5, x1, 128

SB-Type (3): Conditional Jumps
000000010100000010011
Fields: imm [31:25] (7 bits) | rs2 [24:20] (5 bits) | rs1 [19:15] (5 bits) | funct3 [14:12] (3 bits) | imm [11:7] (5 bits) | op [6:0] (7 bits)
op  funct3  mnemonic  description
1100011  100  BLT rs1, rs2, imm  PC = (R[rs1] <s R[rs2] ? PC + (sext(imm)<<1) : PC+4)
1100011  101  BGE rs1, rs2, imm  PC = (R[rs1] >=s R[rs2] ? PC + (sext(imm)<<1) : PC+4)
1100011  110  BLTU rs1, rs2, imm  PC = (R[rs1] <u R[rs2] ? PC + (sext(imm)<<1) : PC+4)
1100011  111  BGEU rs1, rs2, imm  PC = (R[rs1] >=u R[rs2] ? PC + (sext(imm)<<1) : PC+4)
Example: BGE x5, x0, 32
if (R[x5] ≥s R[x0]) PC = PC + 32 (i.e. 32 == 16<<1)
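
The difference between the signed (<s, >=s) and unsigned (<u, >=u) comparisons shows up when a register's top bit is set. This is a minimal sketch with hypothetical names (`to_signed32`, `BRANCH_CONDITIONS`, `branch_taken`), modeling register values as Python integers masked to 32 bits.

```python
def to_signed32(value):
    """Reinterpret a 32-bit pattern as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Branch condition for each mnemonic, applied to the raw 32-bit register values.
BRANCH_CONDITIONS = {
    "BLT": lambda a, b: to_signed32(a) < to_signed32(b),
    "BGE": lambda a, b: to_signed32(a) >= to_signed32(b),
    "BLTU": lambda a, b: (a & 0xFFFFFFFF) < (b & 0xFFFFFFFF),
    "BGEU": lambda a, b: (a & 0xFFFFFFFF) >= (b & 0xFFFFFFFF),
}


def branch_taken(mnemonic, rs1_value, rs2_value):
    """Return True when the named branch would be taken."""
    return BRANCH_CONDITIONS[mnemonic](rs1_value, rs2_value)


# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
signed_result = branch_taken("BLT", 0xFFFFFFFF, 0)
unsigned_result = branch_taken("BLTU", 0xFFFFFFFF, 0)
```

The same bit pattern therefore takes BLT against zero but not BLTU, which is why the ISA provides both variants.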

Control Flow: More Branches
BGE x5, x0, 32
[Datapath diagram: PC, +4, Prog. Mem, Reg. File, ALU, Data Mem (addr, Write Enable), control (BGE), =? and cmp comparators, offset, imm extend, adder (+) computing PC + (16<<1)]
Could have used ALU for branch cmp
Example: BGE x5, x0, 32

RISC-V Instruction Types
✔ • Arithmetic/Logical
  • R-type: result and two source registers, shift amount
  • I-type: result and source register, shift amount in 12-bit immediate with sign/zero extension
  • U-type: result register, 20-bit upper immediate
✔ • Memory Access
  • I-type for loads and S-type for stores
  • load/store between registers and memory
  • word, half-word and byte operations
✔ • Control flow
  • UJ-type: jump-and-link
  • I-type: jump-and-link register
  • SB-type: conditional branches: pc-relative addresses

iClicker Question
What RISC-V instruction would you use for a:
1. For loop?
2. While loop?
3. Function call?
4. If statement?
5. Return statement?
(A) Jump and Link Register (JALR lr, x2, 0x000FFFF)
(B) Branch Equals (BEQ x1, x2, 0xAAAA)
(C) Branch Less Than (BLT x1, x2, 0xAAAA)
(D) Jump and Link (JAL lr, 0x000FFFF)

iClicker Question
• What is the one topic you're most uncertain about at this point in the class?
(A) Gates & Logic
(B) Circuit Simplification
(C) Finite State Machines
(D) RISC-V Processor
(E) RISC-V Assembly

Summary
We have all that it takes to build a processor!
• Arithmetic Logic Unit (ALU)
• Register File
• Memory
The RISC-V processor and ISA are an example of a Reduced Instruction Set Computer (RISC)
• Simplicity is key, thus enabling us to build it!
We now know the data path for the RISC-V ISA:
• register, memory and control instructions