15-213 "The course that gives CMU its Zip!"

Virtual Memory
Oct. 29, 2002

Topics
- Motivations for VM
- Address translation
- Accelerating translation with TLBs

class19.ppt
Motivations for Virtual Memory

Use Physical DRAM as a Cache for the Disk
- Address space of a process can exceed physical memory size
- Sum of address spaces of multiple processes can exceed physical memory

Simplify Memory Management
- Multiple processes resident in main memory
  - Each process with its own address space
- Only "active" code and data is actually in memory
  - Allocate more memory to a process as needed

Provide Protection
- One process can't interfere with another
  - because they operate in different address spaces
- User process cannot access privileged information
  - different sections of address spaces have different permissions

15-213, F'02
Motivation #1: DRAM a "Cache" for Disk

Full address space is quite large:
- 32-bit addresses: ~4,000,000,000 (4 billion) bytes
- 64-bit addresses: ~16,000,000,000,000,000,000 (16 quintillion) bytes

Disk storage is ~300X cheaper than DRAM storage
- 80 GB of DRAM: ~$33,000
- 80 GB of disk: ~$110

To access large amounts of data in a cost-effective manner, the bulk of the data must be stored on disk.

SRAM (4 MB: ~$500) -> DRAM (1 GB: ~$200) -> Disk (80 GB: ~$110)
Levels in Memory Hierarchy

CPU regs -> cache -> virtual memory -> disk (larger, slower, cheaper)

            Register   Cache        Memory      Disk Memory
size:       32 B       32 KB-4 MB   1024 MB     100 GB
speed:      1 ns       2 ns         30 ns       8 ms
$/Mbyte:               $125/MB      $0.20/MB    $0.001/MB
line size:  8 B        32 B         4 KB
DRAM vs. SRAM as a "Cache"

DRAM vs. disk is more extreme than SRAM vs. DRAM
- Access latencies:
  - DRAM ~10X slower than SRAM
  - Disk ~100,000X slower than DRAM
- Importance of exploiting spatial locality:
  - First byte is ~100,000X slower than successive bytes on disk
    - vs. ~4X improvement for page-mode vs. regular accesses to DRAM
- Bottom line:
  - Design decisions made for DRAM caches are driven by the enormous cost of misses
Impact of Properties on Design

If DRAM were organized like an SRAM cache, how would we set the following design parameters?
- Line size?
  - Large, since disk is better at transferring large blocks
- Associativity?
  - High, to minimize miss rate
- Write through or write back?
  - Write back, since we can't afford to perform small writes to disk

What would the impact of these choices be on:
- miss rate
  - Extremely low: << 1%
- hit time
  - Must match cache/DRAM performance
- miss latency
  - Very high: ~20 ms
- tag storage overhead
  - Low, relative to block size
Locating an Object in a "Cache"

SRAM Cache
- Tag stored with cache line
- Maps from cache blocks to memory blocks
  - From cached to uncached form
  - Save a few bits by only storing tag
- No tag for a block not in cache
- Hardware retrieves information
  - can quickly match against multiple tags

[Figure: object name X is compared (= X?) against each line's tag; e.g., line 0 holds tag D with data 243, line 1 holds tag X with data 17, another line holds tag J with data 105.]
Locating an Object in "Cache" (cont.)

DRAM Cache
- Each allocated page of virtual memory has an entry in the page table
- Mapping from virtual pages to physical pages
  - From uncached form to cached form
- Page table entry exists even if the page is not in memory
  - Specifies disk address
  - Only way to indicate where to find the page
- OS retrieves information

[Figure: page table maps object D to location 0 (data 243), object J to "On Disk", object X to location 1 (data 17).]
A System with Physical Memory Only

Examples:
- most Cray machines, early PCs, nearly all embedded systems, etc.

The CPU issues physical addresses (0 .. N-1) directly into memory.
- Addresses generated by the CPU correspond directly to bytes in physical memory
A System with Virtual Memory

Examples:
- workstations, servers, modern PCs, etc.

The CPU issues virtual addresses (0 .. N-1); a page table maps them to physical addresses (0 .. P-1) in memory, or to locations on disk.

- Address Translation: hardware converts virtual addresses to physical addresses via an OS-managed lookup table (page table)
Page Faults (like "Cache Misses")

What if an object is on disk rather than in memory?
- Page table entry indicates the virtual address is not in memory
- OS exception handler invoked to move data from disk into memory
  - current process suspends, others can resume
  - OS has full control over placement, etc.

[Figure: before the fault, the page table maps the virtual address to disk; after the fault, the handler has loaded the page and the page table maps it to physical memory.]
Servicing a Page Fault

(1) Processor signals controller
- Read block of length P starting at disk address X and store starting at memory address Y

(2) Read occurs
- Direct Memory Access (DMA)
- Under control of I/O controller

(3) I/O controller signals completion
- Interrupt processor
- OS resumes suspended process
Motivation #2: Memory Management

Multiple processes can reside in physical memory. How do we resolve address conflicts?
- what if two processes access something at the same address?

Linux/x86 process memory image (high to low addresses):
- kernel virtual memory (memory invisible to user code)
- stack (%esp)
- memory-mapped region for shared libraries
- runtime heap (via malloc), bounded by the "brk" ptr
- uninitialized data (.bss)
- initialized data (.data)
- program text (.text)
- forbidden (address 0)
Solution: Separate Virtual Address Spaces

- Virtual and physical address spaces divided into equal-sized blocks
  - blocks are called "pages" (both virtual and physical)
- Each process has its own virtual address space
  - operating system controls how virtual pages are assigned to physical memory

[Figure: address translation maps process 1's virtual pages (VP 1, VP 2, ...) and process 2's virtual pages onto physical pages (PP 2, PP 7, PP 10, ...) in DRAM; different processes can map to the same physical page, e.g., read-only library code.]
Contrast: Macintosh Memory Model

MAC OS 1-9
- Does not use traditional virtual memory

- All program objects accessed through "handles"
- Indirect reference through pointer table
- Objects stored in shared global address space

[Figure: processes P1 and P2 each have a pointer table; handles index into the table, whose entries point at objects A-E in the shared address space.]
Macintosh Memory Management

Allocation / Deallocation
- Similar to free-list management of malloc/free

Compaction
- Can move any object and just update the (unique) pointer in the pointer table
Mac vs. VM-Based Memory Mgmt

Allocating, deallocating, and moving memory:
- can be accomplished by both techniques

Block sizes:
- Mac: variable-sized
  - may be very small or very large
- VM: fixed-size
  - size is equal to one page (4 KB on x86 Linux systems)

Allocating contiguous chunks of memory:
- Mac: contiguous allocation is required
- VM: can map contiguous range of virtual addresses to disjoint ranges of physical addresses

Protection
- Mac: "wild write" by one process can corrupt another's data
MAC OS X

"Modern" Operating System
- Virtual memory with protection
- Preemptive multitasking
  - Other versions of MAC OS require processes to voluntarily relinquish control

Based on MACH OS
- Developed at CMU in the late 1980s
Motivation #3: Protection

Page table entry contains access rights information
- hardware enforces this protection (trap into OS if violation occurs)

Process i's page table:   Read?  Write?  Physical Addr
  VP 0:                   Yes    No      PP 9
  VP 1:                   Yes    Yes     PP 4
  VP 2:                   No     No      XXXXXXX

Process j's page table:   Read?  Write?  Physical Addr
  VP 0:                   Yes    Yes     PP 6
  VP 1:                   Yes    No      PP 9
  VP 2:                   No     No      XXXXXXX
VM Address Translation

Virtual Address Space
- V = {0, 1, ..., N-1}

Physical Address Space
- P = {0, 1, ..., M-1}
- M < N

Address Translation
- MAP: V -> P ∪ {∅}
- For virtual address a:
  - MAP(a) = a' if data at virtual address a is at physical address a' in P
  - MAP(a) = ∅ if data at virtual address a is not in physical memory
    - Either invalid or stored on disk
VM Address Translation: Hit

[Figure: the processor issues virtual address a; the hardware address translation mechanism (the on-chip memory management unit, MMU) produces a', part of the physical address, and main memory is accessed directly.]
VM Address Translation: Miss

[Figure: on a miss, the MMU cannot translate a, so the page fault handler runs and the OS performs the transfer from secondary memory into main memory (only on a miss); translation then produces a' as before.]
VM Address Translation

Parameters
- P = 2^p = page size (bytes)
- N = 2^n = virtual address limit
- M = 2^m = physical address limit

Virtual address (n bits):  virtual page number (bits n-1 .. p) | page offset (bits p-1 .. 0)
Physical address (m bits): physical page number (bits m-1 .. p) | page offset (bits p-1 .. 0)

Page offset bits don't change as a result of translation.
Page Tables

[Figure: a memory-resident page table maps each virtual page number to a valid bit and a physical page or disk address; valid entries (valid = 1) point into physical memory, while invalid entries (valid = 0) point into disk storage (a swap file or regular file system file).]
Address Translation via Page Table

- The page table base register locates the table; the VPN acts as the table index
- Each entry holds a valid bit, access bits, and a physical page number (PPN)
- If valid = 0 then the page is not in memory
- The PPN from the entry is concatenated with the unchanged page offset to form the physical address:

virtual address:  virtual page number (VPN) | page offset  (bits n-1..p | p-1..0)
physical address: physical page number (PPN) | page offset (bits m-1..p | p-1..0)
Page Table Operation

Translation
- Separate (set of) page table(s) per process
- VPN forms index into page table (points to a page table entry)
Page Table Operation

Computing Physical Address
- Page Table Entry (PTE) provides information about the page
  - if (valid bit = 1) then the page is in memory
    - Use physical page number (PPN) to construct address
  - if (valid bit = 0) then the page is on disk
    - Page fault
Page Table Operation

Checking Protection
- Access rights field indicates allowable access
  - e.g., read-only, read-write, execute-only
  - typically supports multiple protection modes (e.g., kernel vs. user)
- Protection violation fault if the user doesn't have the necessary permission
Integrating VM and Cache

[Figure: the CPU issues a VA; translation produces a PA, which is used for the cache lookup; on a miss, main memory is accessed.]

Most Caches "Physically Addressed"
- Accessed by physical addresses
- Allows multiple processes to have blocks in cache at the same time
- Allows multiple processes to share pages
- Cache doesn't need to be concerned with protection issues
  - Access rights checked as part of address translation

Perform Address Translation Before Cache Lookup
- But this could involve a memory access itself (of the PTE)
- Of course, page table entries can also become cached
Speeding up Translation with a TLB

"Translation Lookaside Buffer" (TLB)
- Small hardware cache in MMU
- Maps virtual page numbers to physical page numbers
- Contains complete page table entries for a small number of pages

[Figure: on a TLB hit, the VA is translated to a PA without consulting the in-memory page table; only on a TLB miss does the full translation run before the cache lookup.]
Address Translation with a TLB

[Figure: the virtual page number is split into a TLB tag and TLB index; on a TLB hit (valid entry with matching tag), the TLB supplies the physical page number directly. The resulting physical address is then split into cache tag, index, and byte offset for the cache lookup, which on a hit returns the data.]
Simple Memory System Example

Addressing
- 14-bit virtual addresses
- 12-bit physical addresses
- Page size = 64 bytes

Virtual address:  bits 13-6 = VPN (Virtual Page Number), bits 5-0 = VPO (Virtual Page Offset)
Physical address: bits 11-6 = PPN (Physical Page Number), bits 5-0 = PPO (Physical Page Offset)
Simple Memory System Page Table

- Only the first 16 entries are shown

VPN  PPN  Valid     VPN  PPN  Valid
00   28   1         08   13   1
01   –    0         09   17   1
02   33   1         0A   09   1
03   02   1         0B   –    0
04   –    0         0C   –    0
05   16   1         0D   2D   1
06   –    0         0E   11   1
07   –    0         0F   0D   1
Simple Memory System TLB

- 16 entries
- 4-way associative
- VPN split: bits 13-8 = TLBT (tag), bits 7-6 = TLBI (set index); bits 5-0 remain the VPO

Set  Tag  PPN  Valid | Tag  PPN  Valid | Tag  PPN  Valid | Tag  PPN  Valid
0    03   –    0     | 09   0D   1     | 00   –    0     | 07   02   1
1    03   2D   1     | 02   –    0     | 04   –    0     | 0A   –    0
2    02   –    0     | 08   –    0     | 06   –    0     | 03   –    0
3    07   –    0     | 03   0D   1     | 0A   34   1     | 02   –    0
Simple Memory System Cache

- 16 lines
- 4-byte line size
- Direct mapped
- Physical address split: bits 11-6 = CT (tag), bits 5-2 = CI (index), bits 1-0 = CO (block offset)

Idx  Tag  Valid  B0  B1  B2  B3 | Idx  Tag  Valid  B0  B1  B2  B3
0    19   1      99  11  23  11 | 8    24   1      3A  00  51  89
1    15   0      –   –   –   –  | 9    2D   0      –   –   –   –
2    1B   1      00  02  04  08 | A    2D   1      93  15  DA  3B
3    36   0      –   –   –   –  | B    0B   0      –   –   –   –
4    32   1      43  6D  8F  09 | C    12   0      –   –   –   –
5    0D   1      36  72  F0  1D | D    16   1      04  96  34  15
6    31   0      –   –   –   –  | E    13   1      83  77  1B  D3
7    16   1      11  C2  DF  03 | F    14   0      –   –   –   –
Address Translation Example #1

Virtual Address 0x03D4
(bits 13-8 = TLBT, bits 7-6 = TLBI, bits 13-6 = VPN, bits 5-0 = VPO)

VPN ___   TLBI ___   TLBT ___   TLB Hit? __   Page Fault? __   PPN: ___

Physical Address
(bits 11-6 = CT, bits 5-2 = CI, bits 1-0 = CO)

Offset ___   CI ___   CT ___   Hit? __   Byte: ___
Address Translation Example #2

Virtual Address 0x0B8F
(bits 13-8 = TLBT, bits 7-6 = TLBI, bits 13-6 = VPN, bits 5-0 = VPO)

VPN ___   TLBI ___   TLBT ___   TLB Hit? __   Page Fault? __   PPN: ___

Physical Address
(bits 11-6 = CT, bits 5-2 = CI, bits 1-0 = CO)

Offset ___   CI ___   CT ___   Hit? __   Byte: ___
Address Translation Example #3

Virtual Address 0x0040
(bits 13-8 = TLBT, bits 7-6 = TLBI, bits 13-6 = VPN, bits 5-0 = VPO)

VPN ___   TLBI ___   TLBT ___   TLB Hit? __   Page Fault? __   PPN: ___

Physical Address
(bits 11-6 = CT, bits 5-2 = CI, bits 1-0 = CO)

Offset ___   CI ___   CT ___   Hit? __   Byte: ___
Multi-Level Page Tables

Given:
- 4 KB (2^12) page size
- 32-bit address space
- 4-byte PTE

Problem:
- Would need a 4 MB page table!
  - 2^20 PTEs * 4 bytes

Common solution
- multi-level page tables
- e.g., 2-level table (P6)
  - Level 1 table: 1024 entries, each of which points to a Level 2 page table
  - Level 2 table: 1024 entries, each of which points to a page
Main Themes

Programmer's View
- Large "flat" address space
  - Can allocate large blocks of contiguous addresses
- Processor "owns" machine
  - Has private address space
  - Unaffected by behavior of other processes

System View
- User virtual address space created by mapping to a set of pages
  - Need not be contiguous
  - Allocated dynamically
  - Enforce protection during address translation
- OS manages many processes simultaneously
  - Continually switching among processes
  - Especially when one must wait for a resource
    - E.g., disk I/O to handle page fault