Interactions between Processor Design and Memory System Design
David E. Culler, CS 61CL, Nov 25, 2009, Lecture 12
11/4/25 UCB CS 61CL F09 Lec 12
A Processor Centric View
[Figure: Processor (Datapath, Control) and Memory]
Fundamental Memory Design Concepts
• Caches
• Virtual memory
• Without these, processing as we know it would not be possible
A more balanced view
[Figure: Memory and Processor]
• “Princeton Architecture”: common instruction and data memory
A more balanced view
[Figure: Instruction Memory, Data Memory, and Processor]
• “Harvard Architecture”: separate instruction and data memory
Or really
[Figure: Memory and Processor]
• Memory systems are extremely sophisticated
• Parallelism, caching, controllers, protocols, …
Pipeline Design: I-miss Handling
[Figure: pipeline datapath with PC, imem, IR, A, B, Ci, Dmem, and pipeline registers IR_ex, IR_mem, IR_wb]
• Insert a no-op “bubble” until the instruction fetch completes
Pipeline Design: D-miss
[Figure: pipeline datapath with PC, imem, IR, A, B, Ci, Dmem, and pipeline registers IR_ex, IR_mem, IR_wb]
• Stall the entire pipeline behind the mem stage for the data miss penalty
• Bubble the remainder (WB)
Performance “Iron Triangle”
• Execution Time = Seconds / Program
  = (Seconds / Cycle) × (Cycles / Instruction) × (Instructions / Program)
  = Cycle Time × CPI × Inst. Count
• What primarily determines…
– Cycle Time?
– Instruction Count?
– CPI?
[Figure: triangle with corners CPI, Cycle Time, and Inst. Count]
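The product form of the execution time equation can be checked numerically. The following is a minimal Python sketch; the function name and the sample workload (1 billion instructions, CPI 1.2, 3 GHz clock) are illustrative assumptions, not figures from the lecture.

```python
def execution_time(inst_count, cpi, cycle_time_s):
    """Seconds per program = instructions x cycles/instruction x seconds/cycle."""
    return inst_count * cpi * cycle_time_s


# Hypothetical workload: 1e9 instructions, CPI of 1.2, 3 GHz clock.
t = execution_time(inst_count=1_000_000_000, cpi=1.2, cycle_time_s=1 / 3e9)
print(f"{t:.3f} s")
```

Changing any one corner of the triangle scales the result linearly, which is why a design change only helps if it does not make the other two factors worse.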
Bringing Cache into the Picture
• Recall MAT = Time_hit + P_miss × Penalty_miss
• Time_hit < Cycle Time
• Penalty_miss = pipeline stalls/bubbles during a miss
• Ideal CPI is the CPI with a perfect memory system
• CPI = Ideal_CPI + P_miss × Penalty_miss, with P_miss counted per instruction over both instruction fetches and data accesses
Example
• Instruction Mix:
– 50% arith, 30% load/store, 20% jumps/branches
• Pipeline hazards
– Ideal CPI = 1.2
• Cache behavior
– 0.2% instruction miss rate (99.8% hit rate)
– 3% data miss rate (97% hit rate)
– 100 cycle miss penalty
• Without Cache: CPI = 1.2 + 100 + 0.30 × 100 = 131.2
– processor pipeline is less than 1% utilized!
• Cache: CPI = 1.2 + 1 × 0.002 × 100 + 0.30 × 0.03 × 100 = 1.2 + 0.2 + 0.9 = 2.3
– on average about half the time is spent waiting for memory
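The arithmetic in this example can be reproduced with a few lines of Python. This is a sketch only: the function and parameter names are assumptions introduced here, and "no cache" is modeled as a miss rate of 1.0 for both instruction fetches and data accesses.

```python
def cpi_with_memory(ideal_cpi, mem_frac, i_miss, d_miss, penalty):
    """CPI = ideal CPI + stall cycles per instruction.

    Every instruction is fetched once; only mem_frac of the
    instructions (loads/stores) also make a data access.
    """
    fetch_stalls = 1.0 * i_miss * penalty
    data_stalls = mem_frac * d_miss * penalty
    return ideal_cpi + fetch_stalls + data_stalls


no_cache = cpi_with_memory(1.2, 0.30, i_miss=1.0, d_miss=1.0, penalty=100)
cached = cpi_with_memory(1.2, 0.30, i_miss=0.002, d_miss=0.03, penalty=100)
stall_fraction = (cached - 1.2) / cached   # share of cycles spent waiting

print(f"no cache: {no_cache:.1f}  cached: {cached:.1f}  "
      f"waiting: {stall_fraction:.0%}")
```

Even with hit rates above 97%, the stall term (1.1 cycles) is nearly as large as the ideal CPI (1.2 cycles), which is the point of the slide.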
Administration
• Midterm II results
– Max: 99 Mean: 75.2 (without bonus)
– Max: 105.5 Mean: 77
• HW 8 due 12/7 midnight
• Project 4 due 12/9 midnight
• Review Week
– review in Tu/W lab + optional threads lab
– review in lecture
• Final Exam: Dec 15, 12:30-3:30
Virtual Memory
• Each program runs in its own Virtual Address Space (VAS)
• Distinct from the Physical Address Space (PAS) of the machine
• Hardware transparently maps the Virtual Address Spaces onto physical resources
• Only a small fraction of each VAS is in physical memory at any time!
Timesharing, Multiprocessing, Multitasking
Multiple Process Address Spaces in Memory
[Figure: several process address spaces placed in Physical Memory]
With Virtual Memory
[Figure: virtual address spaces spanning 00000000 to FFFFFFFF mapped onto Physical Memory]
A Processor Supporting Virtual Memory
• Is able to access a Page Table to translate Virtual Page Number => Physical Frame
• on EVERY memory reference
• Page Table lives in memory
• How many memory accesses per instruction?
– Instruction Fetch VA Translation
» PF = Mem[ PTbase + PC_page ]
– Fetch the Actual Instruction
» IR = Mem[ PF + PC_offset ]
– Load/Store VA Translation
» PF = Mem[ PTbase + (R[rs]+Sx)_page ]
– Load/Store the actual location
» R[rt] = Mem[ PF + (R[rs]+Sx)_offset ]
• How many cache accesses?
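The four steps listed for a load can be counted with a toy model. The Python sketch below assumes a single-level page table, 16-bit page offsets, a hypothetical page table base address, and made-up memory contents; none of these specifics come from the slide.

```python
PAGE_BITS = 16                        # assumption: 16-bit page offset
OFFSET_MASK = (1 << PAGE_BITS) - 1
PT_BASE = 0x1000_0000                 # hypothetical page table base


class Memory:
    """Toy memory that counts every access."""

    def __init__(self):
        self.cells = {}
        self.accesses = 0

    def read(self, addr):
        self.accesses += 1
        return self.cells.get(addr, 0)


def translate(mem, va):
    """PF = Mem[PTbase + page]; one memory access per translation."""
    page, offset = va >> PAGE_BITS, va & OFFSET_MASK
    frame = mem.read(PT_BASE + page)
    return (frame << PAGE_BITS) | offset


def run_load(mem, pc, data_va):
    """Execute one load: translated fetch, then translated data access."""
    instr = mem.read(translate(mem, pc))
    value = mem.read(translate(mem, data_va))
    return instr, value


mem = Memory()
mem.cells[PT_BASE + 0x0040] = 0x07    # page 0040 -> frame 07
mem.cells[PT_BASE + 0x0053] = 0x14    # page 0053 -> frame 14
mem.cells[0x0007_0010] = "lw $3 20($4)"
mem.cells[0x0014_1020] = 99           # hypothetical data word

instr, value = run_load(mem, pc=0x0040_0010, data_va=0x0053_1020)
print(instr, value, mem.accesses)
```

In this model a load costs four memory accesses instead of two, so translation through an in-memory page table doubles memory traffic unless the translations themselves are cached.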
TLB??
• The Translation Lookaside Buffer is a specialized cache for the page table
• It was invented (by Sir Maurice Wilkes) to make virtual memory possible
• He then realized it could be used to make all memory accesses faster
• Should TLBs and caches be different?
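To make "a specialized cache for the page table" concrete, here is a minimal Python sketch of a TLB. The class name, the fully associative organization, the four-entry capacity, and FIFO replacement are all assumptions chosen for brevity; real TLBs vary in each of these.

```python
class TLB:
    """Tiny fully associative TLB with FIFO replacement (assumed policy)."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table    # dict: virtual page -> frame
        self.capacity = capacity
        self.entries = {}               # dicts preserve insertion order
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:        # hit: no page table access needed
            self.hits += 1
            return self.entries[page]
        self.misses += 1                # miss: consult the page table
        frame = self.page_table[page]
        if len(self.entries) >= self.capacity:
            oldest = next(iter(self.entries))
            self.entries.pop(oldest)    # evict the oldest entry
        self.entries[page] = frame
        return frame


tlb = TLB({0x0040: 0x07, 0x0053: 0x14})
for page in [0x0040, 0x0053, 0x0040, 0x0040, 0x0053]:
    tlb.lookup(page)
print(tlb.hits, tlb.misses)
```

Only the first reference to each page pays for a page table access; the repeated references hit in the TLB, which is what lets translation happen on every memory reference without doubling memory traffic.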
What must happen in the processor on a Page Fault?
• It could happen in instruction fetch, LW or SW
• The translation fails
• The actual page is out on disk
– 10 ms @ 3 GHz => 30 million cycles to access it!
• We need to run a special program (the Operating System) to go and get it
– allocate a frame in memory
– read the page from disk
» seek
» transfer, …
– update the page table
• But we are in the middle of an instruction…
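The cycle count on this slide, and its effect on average CPI, can be worked out directly. In the Python sketch below the function names, the base CPI of 2.3, and the fault rate of one per 100 million instructions are assumptions used only for illustration.

```python
def cycles_lost(latency_s, clock_hz):
    """Processor cycles that elapse while waiting latency_s seconds."""
    return latency_s * clock_hz


def cpi_with_faults(base_cpi, faults_per_instr, fault_cycles):
    """Average CPI once page-fault service time is amortized in."""
    return base_cpi + faults_per_instr * fault_cycles


fault_cycles = cycles_lost(10e-3, 3e9)          # 10 ms disk access at 3 GHz
cpi = cpi_with_faults(2.3, 1e-8, fault_cycles)  # hypothetical rate: 1 fault per 1e8 instructions

print(f"{fault_cycles:,.0f} cycles per fault, average CPI {cpi:.2f}")
```

Even at one fault per 100 million instructions, the fault term adds 0.3 cycles to every instruction on average, so page faults have to be far rarer than cache misses for virtual memory to perform well.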
Page Fault
[Figure: pipeline datapath with PC, imem, IR, A, B, Ci, Dmem, and pipeline registers IR_ex, IR_mem, IR_wb]
• Cannot just stall the pipeline
• Must “trap” the current instruction
• Put it aside and start executing other (OS) instructions
More Key Concepts
• Exception: unprogrammed transfer of control
• Interrupt
– asynchronous
– occurs between instructions
– used for efficient I/O
• Fault
– synchronous
– occurs within an instruction
• Preserve state associated with the trap in special registers
– EPC + BadVAddr + Cause in MIPS
• Modify the PC register to point to the exception handler
– PC := trapHandlerAddr
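The state saved on a trap can be sketched as a small register file model. This Python sketch is a simplification under stated assumptions: the handler address and the cause code are placeholder values chosen for illustration, and the Cause register is reduced to a bare code rather than a real bit-field layout.

```python
from dataclasses import dataclass

TRAP_HANDLER_ADDR = 0x8000_0180   # placeholder handler address (assumption)
CAUSE_LOAD_FAULT = 2              # illustrative cause code (assumption)


@dataclass
class CPU:
    pc: int = 0
    epc: int = 0          # PC of the offending instruction
    bad_vaddr: int = 0    # the offending address
    cause: int = 0        # why the trap was taken


def take_fault(cpu, cause_code, bad_vaddr):
    """Synchronous fault: save state, then PC := trapHandlerAddr."""
    cpu.epc = cpu.pc
    cpu.bad_vaddr = bad_vaddr
    cpu.cause = cause_code
    cpu.pc = TRAP_HANDLER_ADDR


def return_from_exception(cpu):
    """Resume at the saved PC so the faulting instruction is retried."""
    cpu.pc = cpu.epc


cpu = CPU(pc=0x0040_0010)
take_fault(cpu, CAUSE_LOAD_FAULT, bad_vaddr=0x0053_1020)
handler_pc = cpu.pc
return_from_exception(cpu)
print(hex(handler_pc), hex(cpu.pc), hex(cpu.bad_vaddr))
```

Because the saved PC is that of the faulting instruction itself, returning from the exception re-executes that instruction rather than skipping it.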
What information must be recorded on a page fault?
• The PC of the offending instruction
• The offending address
• Other cause-related info
Page Fault in Action
[Figure: initial state. Physical Memory frame 07 0000 holds page 0040; Page Table entry 0040 is valid (v: 07); the TLB holds 0040 => 07; PC = 0040 0010. Also shown: Disk, PTB, Regs, IR, ePC, badVA, and the Program Virtual Address Space.]
Inst Fetch: VA 0040 xxxx => PA 07 xxxx
[Figure: the TLB entry 0040 => 07 translates the PC, 0040 0010.]
Inst Fetch: mem[07 0010] => IR
[Figure: IR now holds lw $3 20($4).]
Exec: EA = 0053 1000 + 20
[Figure: Regs holds 0053 1000, the base register $4 of lw $3 20($4).]
Exec: VA 0053 1020 => ??? TLB miss
[Figure: the TLB holds only 0040 => 07, so page 0053 misses.]
Exec: PT lookup(0053) => ??? Fault
[Figure: Page Table entry 0053 is not valid (N).]
Exec: Trap to OS Page Fault Handler
[Figure: PC now points at the handler; ePC = 0040 0010; badVA = 0053 1020.]
Fetch and execute OS instructions
[Figure: an OS page is in Physical Memory; IR = j flt_hndlr; ePC = 0040 0010; badVA = 0053 1020.]
Fetch and execute OS instructions
[Figure: the handler runs with PC = 000YY xxxx; ePC = 0040 0010 and badVA = 0053 1020 stay saved.]
Load page from Disk to Memory
[Figure: page 0053 is read from Disk into Physical Memory.]
Update Page Table
[Figure: page 0053 now occupies frame 14 0000; Page Table entry 0053 becomes valid (v: 14).]
ReturnFromException (RFE)
[Figure: PC is restored from ePC = 0040 0010; IR holds lw $3 20($4) again.]
Exec: TLB Miss, PT lookup
[Figure: the Page Table lookup for page 0053 now succeeds (v: 14), and the TLB gains an entry for page 0053 alongside 0040 => 07.]
Exec: Read physical address
[Figure: the load lw $3 20($4) completes using the translated physical address.]
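The whole walkthrough (TLB miss, invalid page table entry, trap, page load, page table update, retry) can be replayed in a short Python sketch. The data structures and function names are assumptions made for illustration; the page and frame numbers follow the figures, and the offset 20 is treated as hexadecimal so that the effective address matches the 0053 1020 shown there.

```python
PAGE_BITS = 16                        # addresses split as page | 16-bit offset
OFFSET_MASK = (1 << PAGE_BITS) - 1


class PageFault(Exception):
    """Raised when the page table has no valid entry for a page."""

    def __init__(self, bad_va):
        super().__init__(hex(bad_va))
        self.bad_va = bad_va


page_table = {0x0040: 0x07}           # valid entries only: page -> frame
tlb = {0x0040: 0x07}
on_disk = {0x0053}                    # pages that live only on disk
free_frames = [0x14]
events = []


def translate(va):
    page, offset = va >> PAGE_BITS, va & OFFSET_MASK
    if page not in tlb:
        events.append("tlb miss")
        if page not in page_table:    # invalid entry: fault
            raise PageFault(va)
        tlb[page] = page_table[page]  # refill the TLB from the page table
    return (tlb[page] << PAGE_BITS) | offset


def os_page_fault_handler(bad_va):
    """Allocate a frame, 'read' the page from disk, update the page table."""
    page = bad_va >> PAGE_BITS
    frame = free_frames.pop()
    on_disk.discard(page)
    page_table[page] = frame
    events.append("page loaded")


def load_address(va):
    """Translate a load address, retrying once after a page fault."""
    try:
        return translate(va)
    except PageFault as fault:
        os_page_fault_handler(fault.bad_va)
        return translate(va)          # the instruction is re-executed


pa = load_address(0x0053_1000 + 0x20)
print(hex(pa), events)
```

The retry takes a second TLB miss, because the handler updates only the page table; the refill on that second miss is what finally installs the translation for page 0053 in the TLB.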
Paging the Page Table?
• 2^64 byte virtual address space
• 2^14 byte pages (16 KB)
• => 2^50 page table entries
• Large address spaces are used sparsely
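The entry count follows from dividing the address space by the page size. The Python sketch below also estimates the size of a flat page table, assuming 8 bytes per entry; that entry size is an assumption, not a figure from the slide.

```python
VA_BITS = 64          # 2^64 byte virtual address space
PAGE_BITS = 14        # 2^14 byte (16 KB) pages
PTE_BYTES = 8         # assumed size of one page table entry

entries = 1 << (VA_BITS - PAGE_BITS)   # one entry per virtual page
table_bytes = entries * PTE_BYTES      # size of a flat, single-level table

print(f"entries = 2^{entries.bit_length() - 1}")
print(f"flat table = 2^{table_bytes.bit_length() - 1} bytes "
      f"({table_bytes // 2**50} PiB)")
```

A flat table of this size cannot sit in physical memory, which is why the page table itself must be paged, and why it helps that large address spaces are used sparsely.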
Summary
• Caches are essential to performance
• Virtual address translation permits modern operating systems and applications
• Requires caching
• Also requires special processor hardware support
• Also requires operating system support
• Works as long as page faults are rare
• Next Time: Andy lectures on “What’s an OS”