Virtual Memory & Address Translation
Vivek Pai, Princeton University
Oct 9, 2001

General Memory Problem
• We have a limited (expensive) physical resource: main memory
• We want to use it as efficiently as possible
• We have an abundant, slower resource: disk

Lots of Variants
• Many programs, total size less than memory
  – Technically possible to pack them together
  – Will programs know about each other’s existence?
• One program, using lots of memory
  – Can you only keep part of the program in memory?
• Lots of programs, total size exceeds memory

History Versus Present
• History
  – Each variant had its own solution
  – Solutions have different hardware requirements
  – Some solutions software/programmer visible
• Present – general-purpose microprocessors
  – One mechanism used for all of these cases
• Present – less capable microprocessors

Many Programs, Small Total Size
• Observation: we can pack them into memory
• Requirements by segments
  – Text: maybe contiguous
  – Data: keep contiguous, “relocate” at start
  – Stack: assume contiguous, fixed size
    • Just set pointer at start, reserve space
  – Heap: no need to make it contiguous

Many Programs, Small Total Size
• Software approach
  – Just find appropriate space for data & code segments
  – Adjust any pointers to globals/functions in the code
  – Heap, stack “automatically” adjustable
• Hardware approach
  – Pointer to data segment
  – All accesses to globals indirected

One Program, Lots of Memory
• Observations: locality
  – Instructions in a function generally related
  – Stack accesses generally in current stack frame
  – Not all globals used all the time
• Goal: keep recently-used portions in memory
  – Explicit: programmer/compiler reserves, controls part of memory space – “overlays”
• Note: the limited resource may be the address space

Many Programs, Lots of Memory
• Software approach
  – Keep only subset of programs in memory
  – When loading a program, evict any programs that use the same memory regions
  – “Swap” programs in/out as needed
• Hardware approach
  – Don’t permanently associate any address of any program to any part of physical memory
• Note: doesn’t address problem of too few address bits

Why Virtual Memory?
• Use secondary storage ($)
  – Extend DRAM ($$$) with reasonable performance
• Protection
  – Programs do not step over each other
  – Communication requires explicit IPC operations
• Convenience
  – Flat address space
  – Programs have the same view of the world

How To Translate
• Must have some “mapping” mechanism
• Mapping must have some granularity
  – Granularity determines flexibility
  – Finer granularity requires more mapping info (see the worked example below)
• Extremes:
  – Any byte to any byte: mapping info equals program size
  – Map whole segments: larger segments, little mapping info
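A rough worked example of the granularity trade-off (the numbers are assumed for illustration, not from the slides): in a 32-bit, 4 GB address space, byte-to-byte mapping needs one entry per byte, roughly 2^32 entries, i.e. mapping information as large as the program itself; 4 KB pages need 2^32 / 2^12 = 2^20 entries (about a million); a handful of whole segments needs only a few entries, but each segment must then be kept contiguous and relocated as a unit.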

Translation Options
• Granularity
  – Small # of big fixed/flexible regions – segments
  – Large # of fixed regions – pages
• Visibility
  – Translation mechanism integral to instruction set – segments
  – Mechanism partly visible, external to processor – obsolete
  – Mechanism part of processor, visible to OS – pages

Translation Overview
[Diagram: CPU issues a virtual address → Translation (MMU) → physical address → physical memory and I/O devices]
• Actual translation is in hardware (MMU)
• Controlled in software
• CPU view – what the program sees: virtual memory
• Memory view – physical memory

Goals of Translation
• Implicit translation for each memory reference
• A hit should be very fast
• Trigger an exception on a miss
• Protected from user’s faults
[Diagram: memory hierarchy – registers, cache(s), DRAM, disk; relative costs roughly 10x, 100x, 10Mx; the disk level is used for paging]

Base and Bound
[Diagram: the virtual address is checked against the bound register (error if larger), then added to the base register to form the physical address]
• Built in Cray-1
• A program can only access physical memory in [base, base+bound] (see the sketch below)
• On a context switch: save/restore base, bound registers
• Pros: simple
• Cons: fragmentation, hard to share, and difficult to use disks
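A minimal sketch of the check-and-add step in C, assuming 32-bit addresses; the struct, names, and example values are illustrative, not a real machine interface:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical per-process relocation registers. */
    typedef struct {
        uint32_t base;    /* start of the program's physical region */
        uint32_t bound;   /* size of the region in bytes */
    } base_bound_t;

    /* What the hardware does on every reference: compare, then add. */
    uint32_t bb_translate(const base_bound_t *bb, uint32_t vaddr) {
        if (vaddr >= bb->bound) {              /* outside [0, bound) */
            fprintf(stderr, "addressing error: %#x\n", (unsigned)vaddr);
            exit(1);
        }
        return bb->base + vaddr;               /* lands in [base, base+bound) */
    }

    int main(void) {
        base_bound_t bb = { 0x40000, 0x10000 };               /* example values */
        printf("%#x\n", (unsigned)bb_translate(&bb, 0x1234)); /* prints 0x41234 */
        return 0;
    }

The two registers are also all the OS has to save and restore on a context switch.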

Segmentation
[Diagram: virtual address = (segment #, offset); the segment table supplies (base, size); the offset is checked against the size (error if larger) and added to the base to form the physical address]
• Have a table of (seg, size) (see the sketch below)
• Protection: each entry has (nil, read, write, exec)
• On a context switch: save/restore the table, or a pointer to the table in kernel memory
• Pros: efficient, easy to share
• Cons: complex
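A sketch of the per-reference segment lookup in C, assuming a simple (base, size, protection-bits) entry; the layout and names are illustrative:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define P_READ  0x1
    #define P_WRITE 0x2
    #define P_EXEC  0x4    /* 0 means nil: no access at all */

    typedef struct {
        uint32_t base;     /* physical base of the segment */
        uint32_t size;     /* segment length in bytes */
        uint32_t prot;     /* OR of P_READ / P_WRITE / P_EXEC */
    } seg_entry_t;

    /* Size check and protection check, then base + offset. */
    uint32_t seg_translate(const seg_entry_t *tbl, unsigned nsegs,
                           unsigned seg, uint32_t off, uint32_t need) {
        if (seg >= nsegs || off >= tbl[seg].size ||
            (tbl[seg].prot & need) != need) {
            fprintf(stderr, "segmentation error\n");
            exit(1);
        }
        return tbl[seg].base + off;
    }

    int main(void) {
        seg_entry_t table[] = { { 0x80000, 0x2000, P_READ | P_EXEC } };
        printf("%#x\n", (unsigned)seg_translate(table, 1, 0, 0x100, P_READ));
        return 0;   /* prints 0x80100 */
    }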

Paging
[Diagram: virtual address = (VPage #, offset); VPage # is checked against the page table size (error if larger) and indexes the page table, whose entry supplies PPage #; physical address = (PPage #, offset)]
• Use a page table to translate (see the sketch below)
• Various bits in each entry
• Context switch: similar to the segmentation scheme
• What should be the page size?
• Pros: simple allocation, easy to share
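A sketch of the single-level lookup in C, assuming 4 KB pages and a 20-bit physical page number; the PTE layout is an assumption (the slide leaves the “various bits” and the page size open):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define PAGE_SHIFT 12                        /* assume 4 KB pages */
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

    typedef struct {
        uint32_t ppage : 20;   /* physical page number */
        uint32_t valid : 1;    /* one of the "various bits" per entry */
        uint32_t write : 1;
    } pte_t;

    uint32_t page_translate(const pte_t *page_table, uint32_t table_size,
                            uint32_t vaddr) {
        uint32_t vpage  = vaddr >> PAGE_SHIFT;
        uint32_t offset = vaddr & PAGE_MASK;
        if (vpage >= table_size || !page_table[vpage].valid) {
            fprintf(stderr, "page fault at %#x\n", (unsigned)vaddr);
            exit(1);
        }
        return ((uint32_t)page_table[vpage].ppage << PAGE_SHIFT) | offset;
    }

    int main(void) {
        static pte_t table[16];
        table[3] = (pte_t){ .ppage = 0x42, .valid = 1, .write = 1 };
        printf("%#x\n", (unsigned)page_translate(table, 16, 0x3abc));
        return 0;   /* prints 0x42abc */
    }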

How Many PTEs Do We Need?
• Assume 4 KB pages
  – The page offset equals the “low order” 12 bits
• Worst case for a 32-bit address machine
  – # of processes × 2^20 PTEs
• What about a 64-bit address machine?
  – # of processes × 2^52 PTEs (worked out below)
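Worked out: a 32-bit virtual address with a 12-bit page offset leaves 32 − 12 = 20 bits of virtual page number, so a full flat table holds 2^20 (about one million) PTEs per process; at an assumed 4 bytes per PTE that is 4 MB per process, multiplied across all processes. A 64-bit address leaves 64 − 12 = 52 bits, i.e. 2^52 PTEs per process, which is why a flat table is out of the question there.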

Segmentation with Paging
[Diagram: virtual address = (Vseg #, VPage #, offset); the segment table supplies the page table and its size (error if VPage # is too large); the page table entry supplies PPage #; physical address = (PPage #, offset)]

Multiple-Level Page Tables
[Diagram: virtual address = (dir, table, offset); the directory entry selects a page table; that table’s PTE supplies the physical page]
• What does this buy us? Sparse address spaces and easier paging (the walk is sketched below)
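A sketch of a two-level walk in C, assuming an x86-style 10/10/12 split of a 32-bit address; the split and entry formats are illustrative:

    #include <stdint.h>
    #include <stdio.h>

    #define LVL_BITS   10     /* 10-bit directory index, 10-bit table index */
    #define LVL_MASK   ((1u << LVL_BITS) - 1)
    #define PAGE_SHIFT 12     /* 12-bit offset: 4 KB pages */
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

    typedef struct { uint32_t ppage; int valid; } pte_t;
    typedef struct { pte_t *table;  int valid; } pde_t;   /* directory entry */

    /* Returns 0 on a fault; a real handler could allocate the missing
       level on demand, which is what keeps sparse address spaces cheap. */
    int walk(const pde_t *dir, uint32_t vaddr, uint32_t *paddr) {
        uint32_t d = (vaddr >> (PAGE_SHIFT + LVL_BITS)) & LVL_MASK;
        uint32_t t = (vaddr >> PAGE_SHIFT) & LVL_MASK;
        if (!dir[d].valid || !dir[d].table[t].valid)
            return 0;                                     /* page fault */
        *paddr = (dir[d].table[t].ppage << PAGE_SHIFT) | (vaddr & PAGE_MASK);
        return 1;
    }

    int main(void) {
        static pte_t leaf[1u << LVL_BITS];
        static pde_t dir[1u << LVL_BITS];
        leaf[5] = (pte_t){ .ppage = 0x99, .valid = 1 };
        dir[0]  = (pde_t){ .table = leaf, .valid = 1 };
        uint32_t pa;
        if (walk(dir, 0x5123, &pa))
            printf("%#x\n", (unsigned)pa);   /* prints 0x99123 */
        return 0;
    }

Only the directories and page tables for regions actually in use need to exist, at the cost of one extra memory access per level on a walk.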

Inverted Page Tables
[Diagram: virtual address = (pid, vpage, offset); hashing (pid, vpage) selects entry k of the inverted page table (entries 0 … n−1); physical address = (k, offset)]
• Main idea
  – One PTE for each physical page frame
  – Hash (Vpage, pid) to Ppage # (see the sketch below)
• Pros
  – Small page table for large address space
• Cons
  – Lookup is difficult
  – Overhead of managing hash chains, etc.
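A sketch of the lookup direction in C; the hash function, chaining scheme, and sizes are assumptions:

    #include <stdint.h>
    #include <stdio.h>

    #define NFRAMES 256      /* one entry per physical page frame */

    typedef struct ipte {
        int pid;
        uint32_t vpage;
        int used;
        struct ipte *next;   /* collision chain: the management overhead
                                mentioned under "cons" */
    } ipte_t;

    static ipte_t frames[NFRAMES];     /* entry k describes physical frame k */
    static ipte_t *bucket[NFRAMES];    /* hash(pid, vpage) -> chain of entries */

    static unsigned hash(int pid, uint32_t vpage) {
        return ((unsigned)pid * 31u + vpage) % NFRAMES;   /* assumed hash */
    }

    /* Returns the physical frame number, or -1 if (pid, vpage) is not
       resident and the fault handler must take over. */
    int ipt_lookup(int pid, uint32_t vpage) {
        for (ipte_t *e = bucket[hash(pid, vpage)]; e != NULL; e = e->next)
            if (e->used && e->pid == pid && e->vpage == vpage)
                return (int)(e - frames);   /* table index == frame number */
        return -1;
    }

    int main(void) {
        frames[7] = (ipte_t){ .pid = 42, .vpage = 0x123, .used = 1 };
        bucket[hash(42, 0x123)] = &frames[7];
        printf("%d\n", ipt_lookup(42, 0x123));   /* prints 7 */
        return 0;
    }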

Virtual-To-Physical Lookups
• Programs only know virtual addresses
• Each virtual address must be translated
  – May involve walking hierarchical page table
  – Page table stored in memory
  – So, each program memory access requires several actual memory accesses
• Solution: cache “active” part of page table

Translation Look-aside Buffer (TLB)
[Diagram: virtual address = (VPage #, offset); the TLB caches (VPage #, PPage #) pairs; on a hit, PPage # plus the offset forms the physical address; on a miss, the real page table is consulted]
(The hit/miss path is sketched below.)
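A sketch of the hit/miss path in C with a tiny fully associative TLB; the sizes, the flat “real page table”, and the refill-into-slot-0 policy are simplifications for illustration:

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)
    #define TLB_SIZE   8     /* tiny and fully associative, for illustration */

    typedef struct { uint32_t vpage, ppage; int valid; } tlb_entry_t;
    static tlb_entry_t tlb[TLB_SIZE];
    static uint32_t page_table[1024];    /* the "real page table": vpage -> ppage */

    uint32_t translate(uint32_t vaddr) {
        uint32_t vpage  = vaddr >> PAGE_SHIFT;
        uint32_t offset = vaddr & PAGE_MASK;
        for (int i = 0; i < TLB_SIZE; i++)        /* hit: no page-table access */
            if (tlb[i].valid && tlb[i].vpage == vpage)
                return (tlb[i].ppage << PAGE_SHIFT) | offset;
        /* Miss: consult the real page table, then refill slot 0.
           (A real TLB picks a victim at random or by pseudo-LRU.) */
        uint32_t ppage = page_table[vpage];
        tlb[0] = (tlb_entry_t){ vpage, ppage, 1 };
        return (ppage << PAGE_SHIFT) | offset;
    }

    int main(void) {
        page_table[2] = 0x55;
        printf("%#x\n", (unsigned)translate(0x2abc));  /* miss, prints 0x55abc */
        printf("%#x\n", (unsigned)translate(0x2def));  /* hit,  prints 0x55def */
        return 0;
    }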

Bits in A TLB Entry
• Common (necessary) bits (one possible layout sketched below)
  – Virtual page number: match with the virtual address
  – Physical page number: translated address
  – Valid
  – Access bits: kernel and user (nil, read, write)
• Optional (useful) bits
  – Process tag
  – Reference
  – Modify
  – Cacheable
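One way to picture such an entry is as a C bit-field struct; the field widths below are assumptions (a 20-bit page number matches 32-bit addresses with 4 KB pages), not any real machine’s layout:

    typedef struct {
        /* Common (necessary) bits */
        unsigned vpage     : 20;  /* virtual page number to match against */
        unsigned ppage     : 20;  /* translated physical page number */
        unsigned valid     : 1;
        unsigned kern_acc  : 2;   /* kernel access: nil / read / write */
        unsigned user_acc  : 2;   /* user access: nil / read / write */
        /* Optional (useful) bits */
        unsigned pid_tag   : 8;   /* process tag: avoids flushing the TLB
                                     on every context switch */
        unsigned reference : 1;   /* set on use, helps replacement */
        unsigned modify    : 1;   /* set on write: the page is dirty */
        unsigned cacheable : 1;   /* may the mapped page be cached? */
    } tlb_entry_bits_t;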

Hardware-Controlled TLB
• On a TLB miss
  – Hardware loads the PTE into the TLB
    • Need to write back if there is no free entry
  – Generate a fault if the page containing the PTE is invalid
  – VM software performs fault handling
  – Restart the CPU
• On a TLB hit, hardware checks the valid bit
  – If valid, pointer to page frame in memory
  – If invalid, the hardware generates a page fault
    • Perform page fault handling
    • Restart the faulting instruction

Software-Controlled TLB
• On a miss in TLB (a toy handler is sketched below)
  – Write back if there is no free entry
  – Check if the page containing the PTE is in memory
  – If not, perform page fault handling
  – Load the PTE into the TLB
  – Restart the faulting instruction
• On a hit in TLB, the hardware checks the valid bit
  – If valid, pointer to page frame in memory
  – If invalid, the hardware generates a page fault
    • Perform page fault handling
    • Restart the faulting instruction
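A toy, self-contained model of the software miss path in C; every name and data structure is illustrative, standing in for machine- and OS-specific code:

    #include <stdio.h>
    #include <stdlib.h>

    #define NTLB 4
    typedef struct { unsigned vpage, ppage; int valid; } tlb_e;
    static tlb_e    tlb[NTLB];
    static int      pte_resident[64];   /* is the page holding this PTE in memory? */
    static unsigned page_table[64];     /* toy flat table: vpage -> ppage */

    static void software_tlb_miss(unsigned vpage) {
        int slot = rand() % NTLB;             /* write back / reuse an entry */
        if (!pte_resident[vpage]) {           /* PTE's page not in memory */
            printf("page fault handling for vpage %u\n", vpage);
            pte_resident[vpage] = 1;          /* ...bring it in */
        }
        tlb[slot] = (tlb_e){ vpage, page_table[vpage], 1 };  /* load the PTE */
        /* ...then restart the faulting instruction */
    }

    int main(void) {
        page_table[9] = 0x77;
        software_tlb_miss(9);                 /* first touch: simulated fault */
        software_tlb_miss(9);                 /* PTE now resident: no fault */
        return 0;
    }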

Hardware vs. Software Controlled
• Hardware approach
  – Efficient
  – Inflexible
  – Need more space for page table
• Software approach
  – Flexible
  – Software can do mappings by hashing
    • PP# → (Pid, VP#)
    • (Pid, VP#) → PP#
  – Can deal with large virtual address space

Cache vs. TLBs
• Similarities
  – Both cache a portion of memory
  – Both write back on a miss
• Differences
  – Associativity
    • TLB is usually fully set-associative
    • Cache can be direct-mapped
  – Consistency
    • TLB does not deal with consistency with memory
    • TLB can be controlled by software
• Combine L1 cache with TLB
  – Virtually addressed cache
  – Why wouldn’t everyone use virtually addressed caches?

Caches vs. TLBs
• Similarities
  – Both cache a portion of memory
  – Both read from memory on misses
• Differences
  – Associativity
    • TLBs generally fully associative
    • Caches can be direct-mapped
  – Consistency
    • No TLB/memory consistency
    • Some TLBs software controlled
• Combining L1 caches with TLBs
  – Virtually addressed caches
  – Not always used – what are their drawbacks?

Issues
• Which TLB entry should be replaced?
  – Random
  – Pseudo-LRU
• What happens on a context switch? (see the toy sketch below)
  – Process tag: change TLB registers and process register
  – No process tag: invalidate the entire TLB contents
• What happens when changing a page table entry?
  – Change the entry in memory
  – Invalidate the TLB entry
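A toy C sketch of the last two rules with a tag-less TLB; the structures and names are illustrative:

    #include <stdio.h>

    #define NTLB 4
    typedef struct { unsigned vpage, ppage; int valid; } tlb_e;
    static tlb_e    tlb[NTLB];
    static unsigned page_table[64];

    /* No process tag: a context switch must invalidate the whole TLB. */
    static void context_switch_untagged(void) {
        for (int i = 0; i < NTLB; i++)
            tlb[i].valid = 0;
    }

    /* Changing a PTE: update memory, then invalidate the matching TLB entry. */
    static void set_pte(unsigned vpage, unsigned ppage) {
        page_table[vpage] = ppage;
        for (int i = 0; i < NTLB; i++)
            if (tlb[i].valid && tlb[i].vpage == vpage)
                tlb[i].valid = 0;
    }

    int main(void) {
        tlb[0] = (tlb_e){ 3, 0x10, 1 };
        set_pte(3, 0x20);                     /* stale entry 0 is dropped */
        context_switch_untagged();            /* everything else goes too */
        printf("entry 0 valid: %d\n", tlb[0].valid);   /* prints 0 */
        return 0;
    }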

Consistency Issues
• Snoopy cache protocols can maintain consistency with DRAM, even when DMA happens
• No hardware maintains consistency between DRAM and TLBs: you need to flush related TLBs whenever changing a page table entry in memory
• On multiprocessors, when you modify a page table entry, you need to do a “TLB shoot-down” to flush all related TLB entries on all processors (a toy sketch follows)
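A toy, sequential C sketch of the shoot-down idea: each “CPU” keeps its own TLB, and a page-table change must flush the stale entry everywhere. The real mechanism uses inter-processor interrupts and waits for acknowledgements, which this sketch omits; all names are illustrative:

    #include <stdio.h>

    #define NCPU 4
    #define NTLB 4
    typedef struct { unsigned vpage, ppage; int valid; } tlb_e;
    static tlb_e    tlb[NCPU][NTLB];     /* one private TLB per processor */
    static unsigned page_table[64];

    /* Every processor may hold a stale copy, so all of them must flush. */
    static void tlb_shootdown(unsigned vpage) {
        for (int cpu = 0; cpu < NCPU; cpu++)
            for (int i = 0; i < NTLB; i++)
                if (tlb[cpu][i].valid && tlb[cpu][i].vpage == vpage)
                    tlb[cpu][i].valid = 0;
    }

    static void set_pte(unsigned vpage, unsigned ppage) {
        page_table[vpage] = ppage;
        tlb_shootdown(vpage);
    }

    int main(void) {
        tlb[2][1] = (tlb_e){ 5, 0x30, 1 };
        set_pte(5, 0x31);
        printf("cpu 2, entry 1 valid: %d\n", tlb[2][1].valid);   /* prints 0 */
        return 0;
    }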

Issues to Ponder
• Everyone’s moving to hardware TLB management – why?
• Segmentation was/is a way of maintaining backward compatibility – how?
• For the hardware-inclined – what kind of hardware support is needed for everything we discussed today?