CS 15-447: Computer Architecture
Lecture 23: Virtual Memory (2)
November 21, 2007
Karem A. Sakallah

Last Lecture
• Virtual memory lets the programmer “see” a memory array larger than the DRAM available on a particular computer system.
• Virtual memory enables multiple programs to share the physical memory without:
  – Knowing other programs exist (transparency).
  – Worrying about one program modifying the data contents of another (protection).

Other VM Functions
• Page data location
  – Physical memory, disk, uninitialized data
• Access permissions
  – Read-only pages for instructions
• Gathering access information
  – Identifying dirty pages by tracking stores
  – Identifying accesses to help determine the LRU candidate
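
A minimal C sketch of how this per-page metadata might be packed into a page table entry; the field names and widths are assumptions for illustration, not any particular ISA's PTE format.

    #include <stdint.h>

    /* Illustrative page table entry layout (names/widths are assumptions). */
    typedef struct {
        uint32_t valid      : 1;   /* page is resident in physical memory       */
        uint32_t on_disk    : 1;   /* page lives on the disk/backing store      */
        uint32_t writable   : 1;   /* 0 => read-only (e.g., instruction pages)  */
        uint32_t dirty      : 1;   /* set when a store writes the page          */
        uint32_t referenced : 1;   /* set on any access; input to LRU decisions */
        uint32_t frame      : 27;  /* physical frame number (or disk block id)  */
    } pte_t;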

Page Replacement Strategies
• Page table indirection enables a fully associative mapping between virtual and physical pages.
• How do we implement LRU?
  – True LRU is expensive, but LRU is a heuristic anyway, so approximating LRU is fine.
  – Reference bit on each page, cleared occasionally by the operating system. Then pick any “unreferenced” page to evict.
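
The reference-bit scheme above is essentially the clock (second-chance) approximation of LRU. Here is a small C sketch of the sweep; frames[], hand, and choose_victim() are names invented for this illustration, and the reference bits are assumed to be set by hardware on each access.

    #include <stdbool.h>
    #include <stddef.h>

    #define NUM_FRAMES 4096

    /* Hypothetical per-frame state for this sketch. */
    typedef struct {
        bool referenced;   /* set by hardware on any access to the frame */
        int  vpn;          /* which virtual page currently occupies it   */
    } frame_t;

    static frame_t frames[NUM_FRAMES];
    static size_t  hand = 0;   /* sweep position, persists across calls  */

    /* Sweep the frames: evict any frame not referenced since its bit was
     * last cleared; otherwise clear the bit and give it a second chance. */
    size_t choose_victim(void)
    {
        for (;;) {
            frame_t *f = &frames[hand];
            size_t victim = hand;
            hand = (hand + 1) % NUM_FRAMES;
            if (!f->referenced)
                return victim;       /* unreferenced => good LRU candidate */
            f->referenced = false;   /* second chance                      */
        }
    }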

Performance of Virtual Memory
• We must access physical memory to access the page table to make the translation from a virtual address to a physical one.
• Then we access physical memory again to get (or store) the data.
• A load instruction performs at least 2 memory reads.
• A store instruction performs at least 1 read and then a write.
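
A C sketch of why a load costs (at least) two physical accesses when there is no TLB: one read fetches the page table entry, a second fetches the data. phys_read(), page_table_base, and the assumed PTE layout (frame base in the upper bits) are inventions for illustration.

    #include <stdint.h>

    #define PAGE_SHIFT 12                 /* 4 KB pages                      */
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

    /* Hypothetical helper standing in for a physical memory access. */
    extern uint32_t phys_read(uint32_t paddr);

    extern uint32_t page_table_base;      /* physical address of page table  */

    uint32_t load_word(uint32_t vaddr)
    {
        uint32_t vpn = vaddr >> PAGE_SHIFT;
        uint32_t pte = phys_read(page_table_base + 4 * vpn);   /* access #1 */
        uint32_t frame = pte & ~PAGE_MASK;    /* assume PTE holds frame base */
        return phys_read(frame | (vaddr & PAGE_MASK));         /* access #2 */
    }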

Translation Lookaside Buffer
• We fix this performance problem by avoiding a memory access for the translation from virtual to physical pages on most references.
• We buffer the common translations in a translation lookaside buffer (TLB).
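
A minimal C sketch of a TLB lookup, assuming a small fully associative TLB; the entry fields and names are illustrative, and real hardware compares all tags in parallel rather than looping.

    #include <stdbool.h>
    #include <stdint.h>

    #define TLB_ENTRIES 64
    #define PAGE_SHIFT  12

    /* Illustrative TLB entry; names are assumptions for this sketch. */
    typedef struct {
        bool     valid;
        uint32_t vpn;    /* tag: virtual page number     */
        uint32_t pfn;    /* data: physical frame number  */
    } tlb_entry_t;

    static tlb_entry_t tlb[TLB_ENTRIES];

    /* Return true and set *pfn if the translation is cached in the TLB;
     * otherwise the slow page table walk of the previous slide is needed. */
    bool tlb_lookup(uint32_t vaddr, uint32_t *pfn)
    {
        uint32_t vpn = vaddr >> PAGE_SHIFT;
        for (int i = 0; i < TLB_ENTRIES; i++) {   /* hardware checks all entries at once */
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                *pfn = tlb[i].pfn;
                return true;                      /* TLB hit  */
            }
        }
        return false;                             /* TLB miss */
    }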

TLB
[Diagram: the virtual address splits into a virtual page number and a page offset; each TLB entry holds a valid bit, a tag (the virtual page number), and the corresponding physical page number.]

Where is the TLB Lookup?
• We put the TLB lookup in the pipeline after the virtual address is calculated and before the memory reference is performed.
  – This may be before or during the data cache access.
  – Without a TLB we need to perform the translation during the memory stage of the pipeline.

Placing Caches in a VM System
• VM systems give us two different addresses: virtual and physical.
• Which address should we use to access the data cache?
  – Virtual address (before VM translation): faster access? more complex?
  – Physical address (after VM translation): delayed access?

Physically-Addressed Caches
• Perform the TLB lookup before the cache tag comparison.
  – Use bits from the physical address to index the set.
  – Use bits from the physical address to compare the tag.
• Slower access?
  – Tag lookup takes place after the TLB lookup.
• Simplifies some VM management
  – When switching processes, the TLB must be invalidated, but the cache is OK to stay as is.
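
A short sketch of where the index and tag bits come from in a physically-addressed cache; the block size and set count below are made-up parameters, not a specific design.

    #include <stdint.h>

    /* Illustrative geometry: 32 B blocks, 128 sets (assumptions). */
    #define BLOCK_SHIFT 5
    #define SET_BITS    7
    #define SET_MASK    ((1u << SET_BITS) - 1)

    /* Both the set index and the tag come from the physical address,
     * so the TLB lookup must complete before the tag comparison. */
    void phys_cache_fields(uint32_t paddr, uint32_t *set, uint32_t *tag)
    {
        *set = (paddr >> BLOCK_SHIFT) & SET_MASK;
        *tag =  paddr >> (BLOCK_SHIFT + SET_BITS);
    }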

Picture of Physical Caches
[Diagram: the virtual page number is translated by the TLB into a physical page number (PPN); the page offset passes through unchanged. The resulting physical address is then split into cache tag, set index, and block offset, and the tag comparison is performed against the tags stored in the indexed set.]

Virtually-Addressed Caches
• Perform the TLB lookup at the same time as the cache tag compare.
  – Uses bits from the virtual address to index the cache set.
  – Uses bits from the virtual address for the tag match.
• Problems:
  – Aliasing: two processes may refer to the same physical location with different virtual addresses.
  – When switching processes, the TLB must be invalidated, and dirty cache blocks must be written back to memory.
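
A small demo of the aliasing hazard, assuming a 16 KB direct-mapped, virtually-indexed cache with 32 B blocks and 4 KB pages (all numbers and addresses invented): because the index uses address bits above the page offset, two virtual addresses that the OS maps to the same physical page can land in different cache sets, so the same data could be cached twice.

    #include <stdint.h>
    #include <stdio.h>

    #define BLOCK_SHIFT 5
    #define SET_BITS    9            /* 512 sets of 32 B blocks: 16 KB cache */
    #define SET_MASK    ((1u << SET_BITS) - 1)

    static uint32_t set_of(uint32_t vaddr)
    {
        return (vaddr >> BLOCK_SHIFT) & SET_MASK;
    }

    int main(void)
    {
        /* Hypothetical synonyms: two virtual pages mapped by the OS to the
         * same physical page, same offset within the page. */
        uint32_t va1 = 0x00004040;   /* virtual page 0x4, offset 0x040 */
        uint32_t va2 = 0x00009040;   /* virtual page 0x9, same offset  */
        printf("set(va1)=%u  set(va2)=%u\n",
               (unsigned)set_of(va1), (unsigned)set_of(va2));
        /* Different sets => the shared physical line can live in two places. */
        return 0;
    }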

Picture of Virtual Caches
[Diagram: the virtual address is split directly into cache tag, set index, and block offset for the lookup and tag comparison.]
• The TLB is accessed in parallel with the cache lookup.
• The physical address is used to access main memory in case of a cache miss.

OS Support for Virtual Memory
• The OS must be able to modify the page table register, update page table values, etc.
  – To enable the OS to do this, and not the user program, we have different execution modes for a process: one with executive (supervisor or kernel-level) permissions and one with user-level permissions.

Extended Example: Loading a Program into Memory
[Diagram: the program's pages (text 1, text 2, Istatic) sit on disk at blocks 1000-1003; physical memory has four page frames (0-3); translation uses a 2-entry TLB and an 8-entry page table. The virtual reference stream to be translated is 0000, 0004, 7FFC, 0008, 2134, and the resulting physical references are tracked alongside it.]

Additional Information
• Page size = 4 KB
• Page table entry size = 4 B
• Page table register points to physical address 0000
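
A quick C sketch of the address arithmetic these parameters imply for the reference stream used in the walkthrough below: split each virtual address into a virtual page number and a 12-bit offset, and compute where its 4-byte page table entry lives relative to the page table register. The variable names and output format are just for illustration.

    #include <stdint.h>
    #include <stdio.h>

    /* Parameters from the slide: 4 KB pages, 4 B PTEs, page table at 0000. */
    #define PAGE_SHIFT 12
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)
    #define PTBR       0x0000u
    #define PTE_SIZE   4u

    int main(void)
    {
        uint32_t refs[] = { 0x0000, 0x0004, 0x7FFC, 0x0008, 0x2134 };
        for (int i = 0; i < 5; i++) {
            unsigned vpn    = refs[i] >> PAGE_SHIFT;    /* virtual page number */
            unsigned offset = refs[i] & PAGE_MASK;      /* offset within page  */
            unsigned pte_pa = PTBR + PTE_SIZE * vpn;    /* address of its PTE  */
            printf("va=%04X  vpn=%X  offset=%03X  PTE at physical %04X\n",
                   (unsigned)refs[i], vpn, offset, pte_pa);
        }
        return 0;
    }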

Step 1: Read Executable Header and Initialize Page Table
[Diagram: physical frame 0 is reserved (it holds the page table, which the page table register points to at physical address 0000). Page table entries 0-2 point to the program's pages on disk (D 1000 and D 1001 read-only for the text pages, D 1002 for Istatic); entries 3-7 are "no map".]

Step 2: Load PC from Header and Start Execution
[Diagram: the first instruction fetch, at virtual address 0000, probes the 2-entry TLB and misses.]

Fetching instr 0000
[Diagram: the TLB miss forces a page table access (physical reference 0000); the entry for virtual page 0 says the page is on disk (D 1000), so a page fault is raised.]

Fetching instr 0000 (continued)
[Diagram: the OS loads text 1 from disk into physical frame 1; page table entry 0 is updated to M1, read-only, and the TLB is filled with the mapping page 0 → M1 (ro).]

Fetching instr 0000 (continued)
[Diagram: with the translation in place, the fetch completes with physical reference 1000.]

Fetching instr 0004
[Diagram: virtual page 0 is now in the TLB, so this fetch is a TLB hit; physical reference 1004.]

Reference 7FFC
[Diagram: virtual page 7 is not in the TLB (miss), so the page table must be consulted.]

Reference 7FFC (continued)
[Diagram: page table entry 7 is "no map", so a no-map page fault is raised; the page has never been allocated.]

Reference 7FFC (continued)
[Diagram: the OS allocates physical frame 2 and zero-fills it ("set to 0s"), maps page 7 → M2 read/write, and fills the second TLB entry with page 7 → M2 (rw); the access completes with physical reference 2FFC.]

Fetching instr 0008
[Diagram: TLB hit on virtual page 0; physical reference 1008.]

Reference 2134
[Diagram: virtual page 2 is not in the TLB (miss).]

Reference 2134 (continued)
[Diagram: page table entry 2 points to disk (D 1002, the Istatic page), so a page fault is raised.]

Reference 2134 (continued)
[Diagram: the OS loads Istatic from disk into physical frame 3 and updates page table entry 2 to M3; the least recently used TLB entry (page 7) is replaced with page 2 → M3 (rw); the access completes with physical reference 3134.]
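
To tie the walkthrough together, here is a small C sketch that replays the same reference stream through a 2-entry LRU TLB and an 8-entry page table. The state encoding, helper names, and printed messages are inventions for illustration, and disk transfers are only modeled as messages; run as written, it reproduces the same sequence of hits, misses, faults, and physical references (1000, 1004, 2FFC, 1008, 3134) as the steps above.

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define PAGE_MASK  0xFFFu
    #define TLB_SIZE   2

    enum state { NO_MAP, ON_DISK, IN_MEM };

    struct pte  { enum state st; uint32_t frame; };            /* simplified PTE      */
    struct tlbe { int valid; uint32_t vpn, frame; int lru; };  /* simplified TLB slot */

    static struct pte  pt[8];          /* 8-entry page table, as in the example       */
    static struct tlbe tlb[TLB_SIZE];
    static uint32_t next_frame = 1;    /* frame 0 is reserved (holds the page table)  */
    static int tick = 0;

    static int tlb_find(uint32_t vpn)
    {
        for (int i = 0; i < TLB_SIZE; i++)
            if (tlb[i].valid && tlb[i].vpn == vpn) { tlb[i].lru = ++tick; return i; }
        return -1;
    }

    static void tlb_fill(uint32_t vpn, uint32_t frame)
    {
        int victim = 0;                /* choose an invalid or least recently used slot */
        for (int i = 1; i < TLB_SIZE; i++)
            if (!tlb[i].valid || tlb[i].lru < tlb[victim].lru) victim = i;
        tlb[victim] = (struct tlbe){ 1, vpn, frame, ++tick };
    }

    static uint32_t access_va(uint32_t va)
    {
        uint32_t vpn = va >> PAGE_SHIFT;
        if (tlb_find(vpn) < 0) {       /* TLB miss: walk the page table */
            printf("%04X: TLB miss\n", (unsigned)va);
            if (pt[vpn].st == ON_DISK)
                printf("%04X: page fault, load page from disk into frame %u\n",
                       (unsigned)va, (unsigned)next_frame);
            else if (pt[vpn].st == NO_MAP)
                printf("%04X: no-map fault, zero-fill new frame %u\n",
                       (unsigned)va, (unsigned)next_frame);
            if (pt[vpn].st != IN_MEM)
                pt[vpn] = (struct pte){ IN_MEM, next_frame++ };
            tlb_fill(vpn, pt[vpn].frame);
        } else {
            printf("%04X: TLB hit\n", (unsigned)va);
        }
        return (pt[vpn].frame << PAGE_SHIFT) | (va & PAGE_MASK);
    }

    int main(void)
    {
        pt[0] = pt[1] = pt[2] = (struct pte){ ON_DISK, 0 };   /* text 1, text 2, Istatic */
        uint32_t refs[] = { 0x0000, 0x0004, 0x7FFC, 0x0008, 0x2134 };
        for (int i = 0; i < 5; i++)
            printf("  -> physical %04X\n", (unsigned)access_va(refs[i]));
        return 0;
    }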

Multiple Processes
• Virtual cache support for multiple processes:
  – Flush the cache on each context switch, or
  – Use the process ID (a unique number given to each process by the operating system) as part of the tag.
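
A sketch of the second option: extending the tag of a virtually-addressed cache with a process ID (often called an ASID) so that entries from different processes can coexist and the cache need not be flushed on a context switch. The field names and widths are assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative virtually-addressed cache tag extended with a process ID. */
    typedef struct {
        bool     valid;
        uint16_t asid;     /* process ID assigned by the OS           */
        uint32_t vtag;     /* tag bits taken from the virtual address */
    } vcache_tag_t;

    static bool tag_match(const vcache_tag_t *t, uint16_t cur_asid, uint32_t vtag)
    {
        /* A hit now requires both the virtual tag and the process ID to match. */
        return t->valid && t->asid == cur_asid && t->vtag == vtag;
    }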

Multiple Processors
• Can run two programs at the same time.
  – Each processor has its own cache. Why?
• May or may not share data.
  – Sharing code is not a problem (read only).
    Example: shared libraries, DLLs
  – Sharing data (read/write) is a problem.
    What if it is in one processor's cache?
  – Solution: snoopy caches