- Slides: 55
Virtual Memory (video #1)
Alan L. Cox, alc@rice.edu
Some slides adapted from CMU 15.213 slides

Objectives
- Be able to explain the rationale for virtual memory (VM)
- Be able to describe the functionality provided by VM
- Be able to translate virtual addresses to physical addresses
- Be able to explain how VM benefits fork() and execve()

Virtual Memory
Programs refer to memory using virtual memory addresses
- Example: movl (%rcx), %eax
- Conceptually a very large array of bytes (addresses 00······0 through FF······F)
- Each byte has its own address
- The operating system provides an address space private to a particular "process"
Compiler and run-time system allocate VM
- Where different program objects should be stored
- All allocation within a single virtual address space

Problem 1: How Does Everything Fit?
- 64-bit addresses: 16 exabytes (16 billion GB!)
- Physical main memory: tens or hundreds of gigabytes
- And there are many processes…

Problem 2: Memory Management
[Figure: Process 1, Process 2, Process 3, …, Process n each have a stack, heap, .text, and .data that must be placed in physical main memory]
What goes where?

Problem 3: How To Protect?
[Figure: Process i and Process j occupy separate regions of physical main memory]

Problem 4: How To Share?
[Figure: Process i and Process j refer to a common region of physical main memory]

Solution: Indirection
"All problems in computer science can be solved by another level of indirection... Except for the problem of too many layers of indirection." – David Wheeler
[Figure: the virtual memory of Process 1 through Process n is mapped onto physical memory]
- Mapping solves the previous problems
- Each process gets its own private memory space

Address Spaces
- Virtual address space: set of N = 2^n virtual addresses {0, 1, 2, 3, …, N-1}
- Physical address space: set of M = 2^m physical addresses {0, 1, 2, 3, …, M-1}
- Clear distinction between data (bytes) and their attributes (addresses)
- Each data object can have multiple addresses
- Every byte in main memory: one physical address, one (or more) virtual addresses

A System Using Physical Addressing
[Figure: the CPU sends a physical address (PA) directly to main memory (addresses 0 through M-1), which returns the data word]
Used in "simple" systems like embedded microcontrollers in devices like cars, elevators, and digital picture frames

A System Using Virtual Addressing
[Figure: on the CPU chip, the CPU sends a virtual address (VA) to the MMU, which sends a physical address (PA) to main memory (addresses 0 through M-1); main memory returns the data word]
Used in all modern servers, desktops, laptops, tablets, and cell phones

Why Virtual Memory (VM)?
Efficient use of limited main memory (RAM)
- Use RAM as a cache for parts of a virtual address space
  • some non-cached parts stored on disk
  • some (unallocated) non-cached parts stored nowhere
- Keep only active areas of virtual address space in RAM
  • transfer data back and forth to disk as needed
- Share immutable code and data
Simplifies memory management for programmers
- Each process gets a full private address space
Isolates address spaces
- One process can't interfere with another's memory
  • because they operate in different address spaces
- User process cannot access privileged information
  • different sections of address spaces have different permissions

VM as a Tool for Caching
- Virtual memory: array of N = 2^n contiguous bytes
  • think of the array (allocated part) as being stored on disk
- Physical main memory (DRAM) = cache for allocated virtual memory
- Basic unit: page; size = 2^p bytes
[Figure: virtual pages (VPs) VP 0 … VP 2^(n-p)-1, stored on disk, are each unallocated, cached, or uncached; physical pages (PPs) PP 0 … PP 2^(m-p)-1, cached in DRAM, each hold a cached virtual page or are empty]

An Example Memory Hierarchy (not drawn to scale)
Level                      Size   Throughput      Latency
L1 I-cache / L1 D-cache    KBs    16 B/cycle      3 cycles
L2 unified cache           MBs    8 B/cycle       14 cycles
Main memory                GBs    2 B/cycle       100 cycles
Disk                       TBs    1 B/30 cycles   millions of cycles
- CPU registers sit above the L1 caches
- L1/L2 cache: 64 B blocks
- Miss penalty (latency): 30x when a cache miss goes to main memory
- Miss penalty (latency): 10,000x when a main memory miss goes to disk

DRAM Cache Organization
DRAM cache organization driven by the enormous miss penalty
- DRAM is about 10x slower than the SRAM used to implement CPU caches
- A solid-state disk is about 100x slower than DRAM and a mechanical disk is about 10,000x slower than DRAM
  • For first byte, faster for next byte
Consequences
- Large page size: at least 4-8 KB, sometimes 2-4 MB
- Fully associative
  • Any VP can be placed in any PP
- Highly sophisticated replacement algorithms
  • Too complicated and open-ended to be implemented in hardware
- Write-back rather than write-through

A System Using Virtual Addressing
[Figure: on the CPU chip, the CPU sends a virtual address (VA) to the MMU, which sends a physical address (PA) to main memory (addresses 0 through M-1); main memory returns the data word]
Used in all modern servers, laptops, and cell phones

Address Translation: Page Tables
A page table is an array of page table entries (PTEs) that maps virtual pages to physical pages. Here: 8 VPs
- Per-process kernel data structure in DRAM
[Figure: a memory-resident page table (DRAM) with entries PTE 0 … PTE 7, each holding a valid bit and a physical page number or disk address. Valid entries point to pages cached in physical memory (PP 0 … PP 3); the other entries are null or point to pages held only in virtual memory on disk]

Address Translation With a Page Table
- The page table base register (PTBR) holds the page table address for the process
- Virtual address = virtual page number (VPN) + virtual page offset (VPO)
- The VPN selects a page table entry, which holds a valid bit and a physical page number (PPN)
- Valid bit = 0: page not in memory (page fault)
- Physical address = physical page number (PPN) + physical page offset (PPO)
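
The translation steps on this slide can be sketched in C. This is a minimal sketch under stated assumptions, not the deck's code: the names `struct pte` and `translate`, the 4 KB page size, and the 8-entry table are chosen for illustration. The page offset is copied unchanged, so the PPO equals the VPO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12u                 /* assumed p = 12, i.e. 4 KB pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_VPAGES 8u                 /* tiny table, like the 8-VP figure */

/* One page table entry: a valid bit plus a physical page number. */
struct pte {
    bool valid;
    uint32_t ppn;
};

/*
 * Translate va using page_table.  On success, store the physical address
 * in *pa and return true.  Return false when the PTE is invalid, which is
 * the case where real hardware would raise a page fault.
 */
static bool
translate(const struct pte *page_table, uint32_t va, uint32_t *pa)
{
    uint32_t vpn = va >> PAGE_BITS;          /* virtual page number */
    uint32_t vpo = va & (PAGE_SIZE - 1);     /* virtual page offset */

    assert(vpn < NUM_VPAGES);
    if (!page_table[vpn].valid)
        return false;                        /* page fault */
    *pa = (page_table[vpn].ppn << PAGE_BITS) | vpo;   /* PPO == VPO */
    return true;
}
```

For example, with VP 2 mapped to PP 5, `translate` turns virtual address 0x2ABC into physical address 0x5ABC.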

Page Hit
Page hit: reference to VM word that is in physical memory
[Figure: the virtual address selects a valid PTE, whose page is cached in physical memory (DRAM)]

Page Miss
Page miss: reference to VM word that is not in physical memory
[Figure: the virtual address selects a PTE whose valid bit is 0; the page is held only in virtual memory on disk]

Handling Page Fault
- Page miss causes page fault (an exception)
- Page fault handler selects a victim to be evicted (here VP 4)
- The requested page (VP 3 in the figure) is paged in to the freed physical page, and the PTEs for VP 3 and VP 4 are updated
- Offending instruction is restarted: page hit!

Why does it work? Locality
Virtual memory works because of locality
At any point in time, programs tend to access a set of active virtual pages called the working set
- Programs with better temporal locality will have smaller working sets
If (working set size < main memory size)
- Good performance for one process after initial compulsory misses
If (SUM(working set sizes) > main memory size)
- Thrashing: performance meltdown where pages are swapped (copied) in and out continuously

VM as a Tool for Memory Management
Memory allocation
- Each virtual page can be mapped to any physical page
- A virtual page can be stored in different physical pages at different times
Sharing code and data among processes
- Map virtual pages to the same physical page (here: PP 6)
[Figure: address translation maps VP 1 and VP 2 in the virtual address spaces (0 to N-1) of Process 1 and Process 2 onto PP 2, PP 6, and PP 8 in the physical address space (DRAM, 0 to M-1); both processes map a page to PP 6 (e.g., read-only library code)]

Simplifying Linking and Loading
Linking
- Each program has similar virtual address space
- Code, shared libraries, and stack can start at the same address in each process
Loading
- execve() allocates virtual pages for .text and .data sections = creates PTEs marked as invalid
- Pages within the .text and .data sections are loaded on demand by the virtual memory system
[Figure: process address space from high to low addresses: kernel virtual memory (memory invisible to user code); user stack (created at runtime, top at %rsp, the stack pointer); memory-mapped region for shared libraries; run-time heap (created by malloc); read/write segment (.data, .bss) and read-only segment (.init, .text, .rodata), both loaded from the executable file; unused]

VM as a Tool for Memory Protection
- Extend PTEs with permission bits
- Page fault handler checks these before remapping
  • If violated, send process SIGSEGV (segmentation fault)
[Figure: each PTE carries SUP (kernel/supervisor mode needed), READ, and WRITE bits plus a physical page address. Process i maps VP 0, VP 1, VP 2 to PP 6, PP 4, PP 2 with SUP = No, No, Yes. Process j maps VP 0, VP 1, VP 2 to PP 9, PP 6, PP 11 with SUP = No, Yes, No. The physical address space holds PP 2, PP 4, PP 6, PP 8, PP 9, PP 11]
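
The permission check described above can be sketched as a small C function. The PTE layout and the names `struct prot_pte` and `access_allowed` are hypothetical, chosen only to mirror the SUP, READ, and WRITE columns of the figure; real hardware encodes these bits in architecture-specific ways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical PTE with the permission bits named on the slide. */
struct prot_pte {
    bool sup;       /* kernel/supervisor mode needed */
    bool read;      /* reads permitted */
    bool write;     /* writes permitted */
    unsigned ppn;   /* physical page number */
};

enum access { ACCESS_READ, ACCESS_WRITE };

/*
 * Return true if the access is permitted.  A false result models the
 * case in which the process would be sent SIGSEGV.
 */
static bool
access_allowed(const struct prot_pte *pte, enum access kind,
    bool in_kernel_mode)
{
    assert(pte != NULL);
    if (pte->sup && !in_kernel_mode)
        return false;               /* user code touching a kernel-only page */
    if (kind == ACCESS_READ)
        return pte->read;
    return pte->write;
}
```

A read-only page permits `ACCESS_READ` from user mode and rejects `ACCESS_WRITE`; a page with SUP set rejects every user-mode access.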

Virtual Memory (video #2A)
Alan L. Cox, alc@rice.edu
Some slides adapted from CMU 15.213 slides

Address Translation: Page Hit
1) Processor sends virtual address (VA) to MMU
2-3) MMU fetches PTE from page table in memory (sends the PTE address, PTEA; receives the PTE)
4) MMU sends physical address (PA) to cache/memory
5) Cache/memory sends data word to processor

Address Translation: Page Fault
1) Processor sends virtual address (VA) to MMU
2-3) MMU fetches PTE from page table in memory
4) Valid bit is zero, so MMU triggers page fault exception
5) Handler identifies victim (and, if dirty, pages it out to disk)
6) Handler pages in new page and updates PTE in memory
7) Handler returns to original process, restarting faulting instruction
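
Steps 4 through 7 can be illustrated with a small simulation. This is a sketch under loud assumptions: the victim is chosen by a FIFO hand (the slides do not name a replacement policy), disk transfers are reduced to a counter, and every name (`struct vm`, `vm_access`, `handle_page_fault`) is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_VPAGES 8
#define NUM_FRAMES 4

struct sim_pte {
    bool valid;
    bool dirty;
    int ppn;                        /* meaningful only when valid */
};

struct vm {
    struct sim_pte page_table[NUM_VPAGES];
    int frame_owner[NUM_FRAMES];    /* VPN cached in each frame, or -1 */
    int next_victim;                /* FIFO hand (assumed policy) */
    int writebacks;                 /* dirty pages paged out so far */
};

static void
vm_init(struct vm *vm)
{
    for (int i = 0; i < NUM_VPAGES; i++)
        vm->page_table[i] = (struct sim_pte){ false, false, -1 };
    for (int i = 0; i < NUM_FRAMES; i++)
        vm->frame_owner[i] = -1;
    vm->next_victim = 0;
    vm->writebacks = 0;
}

/* Steps 5 and 6: pick a victim, page it out if dirty, page in vpn. */
static void
handle_page_fault(struct vm *vm, int vpn)
{
    int frame = vm->next_victim;
    int victim = vm->frame_owner[frame];

    vm->next_victim = (frame + 1) % NUM_FRAMES;
    if (victim >= 0) {
        if (vm->page_table[victim].dirty)
            vm->writebacks++;       /* stands in for the write to disk */
        vm->page_table[victim] = (struct sim_pte){ false, false, -1 };
    }
    vm->frame_owner[frame] = vpn;   /* stands in for the read from disk */
    vm->page_table[vpn] = (struct sim_pte){ true, false, frame };
}

/* Access a page.  Returns true on a page hit, false if a fault occurred. */
static bool
vm_access(struct vm *vm, int vpn, bool is_write)
{
    bool hit;

    assert(vpn >= 0 && vpn < NUM_VPAGES);
    hit = vm->page_table[vpn].valid;
    if (!hit)
        handle_page_fault(vm, vpn); /* step 4: valid bit is zero */
    if (is_write)                   /* step 7: the restarted access now hits */
        vm->page_table[vpn].dirty = true;
    return hit;
}
```

With four frames, touching five distinct pages forces an eviction; if the evicted page was written earlier, `writebacks` increases by one.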

Page Tables are often Multi-level in reality; accessing them is slow
Given:
- 4 KB (2^12) page size
- 48-bit address space
- 8-byte PTE
Problem:
- Would need a 512 GB page table!
  • 2^48 * 2^-12 * 2^3 = 2^39 bytes
Hardware solution:
- Multi-level page tables
- Example: 2-level page table
  • Level 1 table: each PTE points to a level 2 page table
  • Level 2 table: each PTE points to a physical page
Complementary software:
- Level 1 table stays in memory
- Level 2 tables paged in and out like other data
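
The 512 GB figure follows directly from the three given parameters. The helper below is only an illustration of that arithmetic; the function name `flat_page_table_bytes` is made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a single-level page table that covers a va_bits-bit
 * address space with 2^page_bits-byte pages and pte_bytes per entry.
 */
static uint64_t
flat_page_table_bytes(unsigned va_bits, unsigned page_bits,
    unsigned pte_bytes)
{
    assert(va_bits >= page_bits && va_bits - page_bits < 64);
    uint64_t num_ptes = UINT64_C(1) << (va_bits - page_bits);
    return num_ptes * pte_bytes;
}
```

Calling `flat_page_table_bytes(48, 12, 8)` gives 2^36 entries times 8 bytes, which is 2^39 bytes, or 512 GB.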

A Two-Level Page Table Hierarchy
[Figure: a level 1 page table whose entries point to level 2 page tables, each with entries PTE 0 … PTE 1023]
- Level 1 PTE 0 and PTE 1 point to level 2 tables that map VP 0 … VP 1023 and VP 1024 … VP 2047: 2K allocated VM pages for code and data
- Level 1 PTE 2 through PTE 7 are null: the gap of 6K unallocated VM pages
- Level 1 PTE 8 points to a level 2 table with 1023 null PTEs (1023 unallocated pages) followed by PTE 1023, which maps VP 9215: 1 allocated VM page for the stack
- The remaining (1K - 9) level 1 PTEs are null
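
Translating with a k-level Page Table
- Virtual address: bits n-1 … p hold VPN 1, VPN 2, …, VPN k; bits p-1 … 0 hold the VPO
- VPN 1 indexes the level 1 page table, VPN 2 the level 2 page table, …, VPN k the level k page table
- The level k page table entry supplies the PPN
- Physical address: bits m-1 … p hold the PPN; bits p-1 … 0 hold the PPO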
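
A k-level walk can be sketched as a loop that consumes one VPN chunk per level. The sketch below assumes k = 2, 10 index bits per level, and 4 KB pages (matching a 32-bit address space with 1024-entry tables); the struct layout and the name `walk` are illustrative, not taken from any real MMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LEVELS 2u               /* k (assumed) */
#define INDEX_BITS 10u          /* bits per VPN chunk (assumed) */
#define PAGE_BITS 12u           /* p (assumed) */
#define ENTRIES (1u << INDEX_BITS)

struct table;

struct entry {
    bool valid;
    struct table *next;         /* used at levels 1 .. k-1 */
    uint32_t ppn;               /* used at level k */
};

struct table {
    struct entry e[ENTRIES];
};

/*
 * Walk the page table hierarchy for va.  Returns true and stores the
 * physical address in *pa on success; returns false where hardware
 * would raise a page fault.
 */
static bool
walk(const struct table *level1, uint32_t va, uint32_t *pa)
{
    const struct table *t = level1;

    assert(level1 != NULL && pa != NULL);
    for (unsigned level = 1; level <= LEVELS; level++) {
        /* VPN 1 occupies the highest bits, VPN k the lowest. */
        unsigned shift = PAGE_BITS + INDEX_BITS * (LEVELS - level);
        uint32_t idx = (va >> shift) & (ENTRIES - 1);
        const struct entry *e = &t->e[idx];

        if (!e->valid)
            return false;       /* null PTE at some level */
        if (level == LEVELS) {
            uint32_t vpo = va & ((1u << PAGE_BITS) - 1);
            *pa = (e->ppn << PAGE_BITS) | vpo;
            return true;
        }
        t = e->next;            /* descend to the next level */
    }
    return false;               /* not reached */
}
```

With level 1 entry 8 pointing to a level 2 table whose entry 1023 is valid, the walk resolves virtual page 8 * 1024 + 1023 = 9215, the stack page in the two-level example.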

Speeding up Translation with a TLB
Page table entries (PTEs) are cached in L1 like any other memory word
- PTEs may be evicted by other data references
- PTE hit still incurs memory access delay
Solution: Translation Lookaside Buffer (TLB)
- Small hardware cache in MMU
- Maps virtual page numbers to physical page numbers
- Contains complete page table entries for small number of pages

Without TLB
1) Processor sends virtual address (VA) to MMU
2-3) MMU fetches PTE from page table in memory
4) MMU sends physical address (PA) to cache/memory
5) Cache/memory sends data word to processor

TLB Hit
[Figure: (1) CPU sends the VA to the MMU; (2) MMU sends the VPN to the TLB; (3) TLB returns the PTE; (4) MMU sends the PA to cache/memory; (5) cache/memory returns the data to the CPU]
A TLB hit eliminates a memory access
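
TLB Miss
[Figure: (1) CPU sends the VA to the MMU; (2) MMU sends the VPN to the TLB, which misses; (3) MMU sends the PTE address (PTEA) to cache/memory; (4) the PTE is returned and loaded into the TLB; (5) MMU sends the PA to cache/memory; (6) cache/memory returns the data to the CPU]
- A TLB miss incurs an additional memory access (the PTE)
- Might need multiple iterations if a multi-level page table is used
- Fortunately, TLB misses are rare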

Virtual Memory (video #2B)
Alan L. Cox, alc@rice.edu
Some slides adapted from CMU 15.213 slides

Simple Memory System Example
Addressing
- 14-bit virtual addresses
- 12-bit physical addresses
- Page size = 64 bytes
Virtual address: bits 13-6 = VPN (virtual page number), bits 5-0 = VPO (virtual page offset)
Physical address: bits 11-6 = PPN (physical page number), bits 5-0 = PPO (physical page offset)

Simple Memory System Page Table
Only the first 16 entries (out of 256) are shown; assume all other entries are invalid
VPN  PPN  Valid      VPN  PPN  Valid
00   28   1          08   13   1
01   –    0          09   17   1
02   33   1          0A   09   1
03   02   1          0B   –    0
04   –    0          0C   –    0
05   16   1          0D   2D   1
06   –    0          0E   11   1
07   –    0          0F   0D   1

Simple Memory System TLB
- 16 entries
- 4-way associative
Virtual address: bits 13-8 = TLBT (TLB tag), bits 7-6 = TLBI (TLB index); together they form the VPN (bits 13-6); bits 5-0 = VPO
Set  Tag PPN Valid   Tag PPN Valid   Tag PPN Valid   Tag PPN Valid
0    03  –   0       09  0D  1       00  –   0       07  02  1
1    03  2D  1       02  –   0       04  –   0       0A  –   0
2    02  –   0       08  –   0       06  –   0       03  –   0
3    07  –   0       03  0D  1       0A  34  1       02  –   0
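
A lookup in this 4-set, 4-way TLB can be sketched as follows. The table contents are copied from the slide; the names `tlb_entry` and `tlb_lookup` and the convention of storing 0 in the PPN of an invalid entry are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define TLB_SETS 4
#define TLB_WAYS 4

struct tlb_entry {
    unsigned tag;
    unsigned ppn;       /* 0 placeholder when the entry is invalid */
    bool valid;
};

/* Contents of the simple memory system TLB, one row per set. */
static const struct tlb_entry tlb[TLB_SETS][TLB_WAYS] = {
    { {0x03, 0x00, false}, {0x09, 0x0D, true},  {0x00, 0x00, false}, {0x07, 0x02, true}  },
    { {0x03, 0x2D, true},  {0x02, 0x00, false}, {0x04, 0x00, false}, {0x0A, 0x00, false} },
    { {0x02, 0x00, false}, {0x08, 0x00, false}, {0x06, 0x00, false}, {0x03, 0x00, false} },
    { {0x07, 0x00, false}, {0x03, 0x0D, true},  {0x0A, 0x34, true},  {0x02, 0x00, false} },
};

/*
 * Look up an 8-bit VPN.  On a hit, store the PPN in *ppn and return
 * true; on a miss, return false and leave *ppn untouched.
 */
static bool
tlb_lookup(unsigned vpn, unsigned *ppn)
{
    assert(vpn < 256);
    unsigned set = vpn % TLB_SETS;      /* TLBI: low 2 bits of the VPN */
    unsigned tag = vpn / TLB_SETS;      /* TLBT: remaining 6 bits */

    for (int way = 0; way < TLB_WAYS; way++) {
        const struct tlb_entry *e = &tlb[set][way];
        if (e->valid && e->tag == tag) {
            *ppn = e->ppn;
            return true;
        }
    }
    return false;
}
```

VPN 0x0F selects set 3 with tag 0x03 and hits with PPN 0x0D, whereas VPN 0x00 selects set 0 with tag 0x00, whose matching entry is invalid, so the lookup misses.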

Simple Memory System Cache
- 16 lines, 4-byte block size
- Physically addressed
- Direct mapped
Physical address: bits 11-6 = CT (cache tag), bits 5-2 = CI (cache index), bits 1-0 = CO (cache offset); the PPN is bits 11-6 and the PPO is bits 5-0
Idx  Tag  Valid  B0  B1  B2  B3      Idx  Tag  Valid  B0  B1  B2  B3
0    19   1      99  11  23  11      8    24   1      3A  00  51  89
1    15   0      –   –   –   –       9    2D   0      –   –   –   –
2    1B   1      00  02  04  08      A    2D   1      93  15  DA  3B
3    36   0      –   –   –   –       B    0B   0      –   –   –   –
4    32   1      43  6D  8F  09      C    12   0      –   –   –   –
5    0D   1      36  72  F0  1D      D    16   1      04  96  34  15
6    31   0      –   –   –   –       E    13   1      83  77  1B  D3
7    16   1      11  C2  DF  03      F    14   0      –   –   –   –

Address Translation Example #1
Virtual address: 0x03D4 = 00 0011 1101 0100 (bits 13-0)
- VPN: 0x0F   TLBI: 3   TLBT: 0x03   TLB hit? Y   Page fault? N   PPN: 0x0D
Physical address: 0011 0101 0100 (bits 11-0)
- CO: 0   CI: 0x5   CT: 0x0D   Hit? Y   Byte: 0x36
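
The field extraction used in these worked examples can be written as a few shifts and masks. The sketch below hard-codes the simple memory system's layout (14-bit VA, 12-bit PA, 64-byte pages, 4 TLB sets, 16 cache lines of 4 bytes); the struct and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

struct va_fields { unsigned vpn, vpo, tlbi, tlbt; };
struct pa_fields { unsigned ct, ci, co; };

/* Split a 14-bit virtual address into VPN/VPO and TLB index/tag. */
static struct va_fields
split_va(uint16_t va)
{
    struct va_fields f;

    assert(va < (1u << 14));
    f.vpo = va & 0x3F;          /* bits 5..0 */
    f.vpn = va >> 6;            /* bits 13..6 */
    f.tlbi = f.vpn & 0x3;       /* bits 7..6: selects one of 4 sets */
    f.tlbt = f.vpn >> 2;        /* bits 13..8 */
    return f;
}

/* Concatenate a PPN with the page offset (PPO == VPO). */
static uint16_t
make_pa(unsigned ppn, unsigned vpo)
{
    assert(ppn < 64 && vpo < 64);
    return (uint16_t)((ppn << 6) | vpo);
}

/* Split a 12-bit physical address into cache tag, index, and offset. */
static struct pa_fields
split_pa(uint16_t pa)
{
    struct pa_fields f;

    assert(pa < (1u << 12));
    f.co = pa & 0x3;            /* bits 1..0 */
    f.ci = (pa >> 2) & 0xF;     /* bits 5..2 */
    f.ct = pa >> 6;             /* bits 11..6 */
    return f;
}
```

For virtual address 0x03D4, `split_va` yields VPN 0x0F, TLBI 3, and TLBT 0x03; combining PPN 0x0D with the offset 0x14 gives physical address 0x354, which `split_pa` breaks into CT 0x0D, CI 0x5, and CO 0.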

Address Translation Example #2
Virtual address: 0x0B8F = 00 1011 1000 1111 (bits 13-0)
- VPN: 0x2E   TLBI: 2   TLBT: 0x0B   TLB hit? N   Page fault? Y   PPN: TBD
Physical address: not available until the page fault is serviced
- CO, CI, CT, Hit?, and Byte are left blank

Address Translation Example #3
Virtual address: 0x0020 = 00 0000 0010 0000 (bits 13-0)
- VPN: 0x00   TLBI: 0   TLBT: 0x00   TLB hit? N   Page fault? N   PPN: 0x28
Physical address: 1010 0010 0000 (bits 11-0)
- CO: 0   CI: 0x8   CT: 0x28   Hit? N   Byte: in DRAM, don't know value

Virtual Memory (video #2C)
Alan L. Cox, alc@rice.edu
Some slides adapted from CMU 15.213 slides

Memory Mapping
Creation of new VM area done via "memory mapping"
- Create new vm_area_struct and page tables for area
Area can be backed by (i.e., get its initial values from):
- Regular file on disk (e.g., an executable object file)
  • Initial page bytes come from a section of a file
- Nothing (e.g., .bss)
  • First fault will allocate a physical page full of 0's (demand-zero)
  • Once the page is written to (dirtied), it is like any other page
Dirty pages are swapped back and forth between memory and a special swap area of the disk.
Key point: no virtual pages are copied into physical memory until they are referenced!
- Known as "demand paging"
- Crucial for time and space efficiency
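
An area backed by nothing can be requested from user code with an anonymous mapping. The sketch below assumes a Linux or BSD style system where `MAP_ANONYMOUS` is available (it is a widespread extension rather than part of older POSIX editions); the helper name `map_demand_zero` is made up for this example.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS with glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of private, anonymous memory.  The kernel only records
 * the area here; physical pages are supplied, zero-filled, when they
 * are first touched.  Returns NULL on failure.
 */
static unsigned char *
map_demand_zero(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Every byte of the returned area reads as 0 until it is written; the area should be released with `munmap()` when it is no longer needed.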

User-Level Memory Mapping
void *mmap(void *start, size_t len, int prot, int flags, int fd, off_t offset)
[Figure: len bytes beginning at offset (bytes) in the disk file specified by file descriptor fd are mapped to len bytes of process virtual memory beginning at start (or an address chosen by the kernel)]

User-Level Memory Mapping
void *mmap(void *start, size_t len, int prot, int flags, int fd, off_t offset)
Map len bytes starting at offset of the file specified by file descriptor fd, preferably at address start
- start: may be 0 for "pick an address"
- prot: PROT_READ, PROT_WRITE, ...
- flags: MAP_PRIVATE, MAP_SHARED, ...
Return a pointer to start of mapped area (may not be start)
Example: fast file-copy
- Useful for applications like Web servers that need to quickly copy files.
- mmap() allows file transfers without copying into user space.

Servicing a Page Fault
(1) Processor signals disk controller (initiate block read)
- Read block of length P starting at disk address X and store starting at memory address Y
(2) Read occurs (DMA transfer)
- Direct Memory Access (DMA)
- Under control of I/O controller
(3) Controller signals completion (read done)
- Interrupts processor
- OS resumes suspended process
[Figure: processor (registers, cache), memory, and the I/O controller with its disk, all attached to the memory-I/O bus]

mmap() Example: Fast File Copy

    /* mmap.c - a program that uses mmap to copy itself to stdout */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        struct stat stat;
        int fd;
        char *bufp;

        /* Open the file & get its size. */
        fd = open("./mmap.c", O_RDONLY);
        if (fd < 0 || fstat(fd, &stat) < 0)
            return (1);

        /* Map the file to a new VM area. */
        bufp = mmap(NULL, stat.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (bufp == MAP_FAILED)
            return (1);

        /* Write the VM area to stdout. */
        write(STDOUT_FILENO, bufp, stat.st_size);
        return (0);
    }

fork() Revisited
To create a new process using fork():
- Make copies of the old process' page table, etc.
  • The two processes now share all of their pages
- Copy-on-write
  • Allows each process to have a separate address space without copying all of the virtual pages
  • Make pages of writeable areas read-only
  • Flag these areas as private "copy-on-write" in OS
  • Writes to these pages will cause protection faults
    – Fault handler recognizes copy-on-write, makes a copy of the page, and restores write permissions
Net result:
- Processes have identical address spaces
- Copies are deferred until absolutely necessary
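
The visible effect of private copy-on-write pages can be demonstrated from user code: a write made by the child never shows up in the parent. The sketch below shows only these semantics; whether the kernel really defers the copy cannot be observed this way. The helper name `child_write_exit_status` is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that overwrites *value and then exits with the value it
 * sees.  Returns the child's exit status, or -1 on error.  The write
 * happens in the child's private copy of the page, so the parent's
 * *value is left unchanged.
 */
static int
child_write_exit_status(int *value, int new_value)
{
    pid_t pid;
    int status;

    assert(value != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *value = new_value;     /* write to a copy-on-write page */
        _exit(*value & 0xFF);   /* report what the child observed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If the parent holds `x == 1` and calls `child_write_exit_status(&x, 7)`, the child exits with status 7 while the parent still sees `x == 1`.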

exec() Revisited
To load p using exec:
- Delete existing page tables, etc.
- Create new page tables, etc.:
  • Stack/heap/.bss are anonymous, demand-zero
  • Code and data are mapped to ELF executable file p
- Shared libraries are dynamically linked and mapped
- Set program counter to entry point in .text
  • OS will swap in pages from disk as they are used
[Figure: the new address space from high to low addresses: user stack (top at %rsp, demand-zero); shared libraries (.data and .text of libc.so); heap (top at brk, demand-zero); read/write data (.data from p, plus demand-zero .bss); read-only code and data (.rodata and .text from p); unused]

Virtual Memory
Supports many OS-related functions
- Process creation
  • Initial
  • Forking children
- Task switching
- Protection/sharing
Combination of hardware & software implementation
- Software manages page tables, page allocations
- Hardware reads page tables
  • Page fault when no entry
- Hardware enforcement of protection
  • Protection fault when invalid access

Next Time
System-Level I/O