Operating Systems ECE 344 Translation Lookaside Buffer Ashvin Goel ECE University of Toronto

Outline
- Translation lookaside buffer (TLB)
- Memory protection

Recall: Paging MMU
- Partition virtual memory into pages and physical memory into frames
- Translate virtual page to physical frame on each memory access
- Page and frame size is always a power of 2, say 2^n bytes
- Then the bottom n bits identify the offset within the page, and the remaining bits are the page/frame number
  - E.g., page size = 8 bytes = 2^3: offset is 3 bits
    (pages cover 00 000 .. 00 111, 01 000 .. 01 111, 10 000 .. 10 111, 11 000 .. 11 111)
  - E.g., page size = 4 KB = 2^12: offset is 12 bits
    (pages cover 00 000 .. 00 FFF, 01 000 .. 01 FFF, 02 000 .. 02 FFF, 03 000 .. 03 FFF)

Recall: Linear Page Table
- Example architecture
  - Virtual address size: 32 bits
  - Page size: 2^12 B (4 KB)
  - 32-bit virtual address = 20-bit page number + 12-bit offset
  - Single-level page table, located through page_table (PTR)
  - PTE (bits 31..0): frame number | unused | C D R W V
- Example translation
    vaddr  = 0x01005a6f
    offset = vaddr & 0xfff = 0xa6f
    page   = vaddr >> 12 = 0x1005
    fr     = page_table[page].frame = 0xe
    paddr  = (fr << 12) | offset = 0xea6f
- [Figure: entry 0x1005 of the page table is valid and holds frame number 0xe, so the page maps to the frame at 0xe000 in memory]
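
Recall: Multi-Level Page Table
- 32-bit virtual address = 10-bit PT1 index + 10-bit PT2 index + 12-bit offset
- [Figure: PTR points to the top-level page table. Its entry PT1 = 4 holds frame 0xa, which locates a second-level page table at 0xa000. Entry PT2 = 5 of that table is valid and holds frame 0xe, so the page maps to the frame at 0xe000 in memory; the offset is 0xa6f.]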

Speeding up Address Translation
- Paging MMU maps virtual to physical address on each memory access using page tables in memory
  - Every memory access requires additional memory accesses to read the page table: one for a linear page table, two for a two-level page table
- Linear Page Table
    vaddr  = 0x01005a6f
    offset = vaddr & 0xfff = 0xa6f
    pg     = vaddr >> 12 = 0x1005
    fr     = ptr[pg].frame = 0xe
    paddr  = (fr << 12) | offset = 0xea6f
- Two-Level Page Table
    vaddr  = 0x01005a6f
    offset = vaddr & 0xfff = 0xa6f
    pg2    = (vaddr >> 12) & 0x3ff = 0x5
    pg1    = vaddr >> (12+10) = 0x4
    pt_2   = ptr[pg1].frame << 12 = 0xa000
    fr     = pt_2[pg2].frame = 0xe
    paddr  = (fr << 12) | offset = 0xea6f
- This address translation can be sped up by using a cache of page mappings
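
The two translations can be checked with a small simulation. This is only a sketch: the PTE class, the use of dicts in place of arrays in physical memory, and the top-level table address 0x3000 are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12      # 4 KB pages: the low 12 bits are the offset
OFFSET_MASK = 0xFFF
LEVEL_BITS = 10      # each level of the two-level table is indexed by 10 bits
LEVEL_MASK = 0x3FF


@dataclass
class PTE:
    frame: int
    valid: bool = True


def translate_linear(page_table, vaddr):
    """Linear page table: one page-table read per translation."""
    offset = vaddr & OFFSET_MASK
    page = vaddr >> PAGE_SHIFT
    fr = page_table[page].frame
    return (fr << PAGE_SHIFT) | offset


def translate_two_level(memory, ptr, vaddr):
    """Two-level page table: two page-table reads per translation.

    `memory` maps the physical base address of each page table to that
    table; `ptr` is the base address of the top-level table.
    """
    offset = vaddr & OFFSET_MASK
    pg2 = (vaddr >> PAGE_SHIFT) & LEVEL_MASK
    pg1 = vaddr >> (PAGE_SHIFT + LEVEL_BITS)
    pt_2 = memory[ptr][pg1].frame << PAGE_SHIFT   # first extra memory access
    fr = memory[pt_2][pg2].frame                  # second extra memory access
    return (fr << PAGE_SHIFT) | offset


# The slide's example: page 0x1005 maps to frame 0xe.
linear_table = {0x1005: PTE(frame=0xE)}
memory = {
    0x3000: {0x4: PTE(frame=0xA)},   # top-level table (address is made up)
    0xA000: {0x5: PTE(frame=0xE)},   # second-level table in frame 0xa
}

print(hex(translate_linear(linear_table, 0x01005A6F)))         # 0xea6f
print(hex(translate_two_level(memory, 0x3000, 0x01005A6F)))    # 0xea6f
```

Both functions return the same physical address; the difference is how many page-table reads precede the actual access.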

Translation Lookaside Buffer (TLB)
- TLB is a h/w cache of page table entries (PTEs)
  - TLB has a small number of entries (e.g., 64)
  - TLB exploits locality: programs need few pages at a time
- Each TLB entry contains
  - Key: page number
  - Data: page table entry
- TLB entry: virtual page number (VPN) | page table entry
- Page table entry (bits 31..0): frame number (PFN) | unused | R D C W V
  - V: Valid
  - W: Writeable
  - C: Cacheable
  - D: Dirty (set by hardware)
  - R: Referenced (set by hardware)
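
As a rough illustration of the entry format, the sketch below packs a frame number and flag bits into one 32-bit page table entry and pairs it with a virtual page number. The exact bit positions (V in bit 0 up to R in bit 4, PFN starting at bit 12) and all helper names are assumptions for the example; the slides give the fields but the code does not claim to match any real architecture.

```python
PFN_SHIFT = 12       # assumed: frame number occupies the bits above bit 11

# Assumed flag positions, lowest bit first.
PTE_V = 1 << 0       # valid
PTE_W = 1 << 1       # writeable
PTE_C = 1 << 2       # cacheable
PTE_D = 1 << 3       # dirty (set by hardware)
PTE_R = 1 << 4       # referenced (set by hardware)

FLAG_NAMES = [(PTE_R, "R"), (PTE_D, "D"), (PTE_C, "C"), (PTE_W, "W"), (PTE_V, "V")]


def make_pte(pfn, flags):
    """Pack a frame number and flag bits into one integer."""
    return (pfn << PFN_SHIFT) | flags


def pte_pfn(pte):
    """Extract the frame number from a packed entry."""
    return pte >> PFN_SHIFT


def pte_flags(pte):
    """Return the set flags as a string, highest bit first."""
    return "".join(name for bit, name in FLAG_NAMES if pte & bit)


# A TLB entry is the pair (key, data) = (virtual page number, page table entry).
pte = make_pte(0xE, PTE_V | PTE_W)
tlb_entry = (0x1005, pte)

print(hex(pte), hex(pte_pfn(pte)), pte_flags(pte))   # 0xe003 0xe WV

# A write through this entry would make the hardware set R and D.
pte |= PTE_R | PTE_D
print(pte_flags(pte))                                # RDWV
```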

TLB Operations
- Similar to operations on any cache
  - TLB lookup
  - TLB cache miss handling
  - TLB cache invalidate
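
TLB Lookup
- [Figure: the CPU issues page number p and the TLB is searched for p. On a TLB hit, the TLB supplies frame number f directly. On a TLB miss, the page table is looked up with p to obtain f. Frame f is then used to access physical memory.]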

TLB Cache Miss Handling
- If TLB lookup fails, then the page table is looked up for the correct entry, and the TLB cache is filled
- The choice of which TLB entry is replaced is called the TLB replacement policy
- TLB cache misses can be handled by hardware or OS
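
Lookup, miss handling, and replacement can be sketched together as follows. The class name, the hit/miss counters, and the choice of least-recently-used replacement are assumptions for illustration; the slides only say that some replacement policy exists.

```python
from collections import OrderedDict


class TLB:
    """Small cache of page -> frame mappings in front of a page table."""

    def __init__(self, page_table, capacity=64):
        self.page_table = page_table     # dict: page number -> frame number
        self.capacity = capacity
        self.entries = OrderedDict()     # least recently used entry first
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:         # TLB hit: no page-table access
            self.hits += 1
            self.entries.move_to_end(page)
            return self.entries[page]
        self.misses += 1                 # TLB miss: consult the page table
        frame = self.page_table[page]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        self.entries[page] = frame       # fill the TLB
        return frame


page_table = {p: p + 100 for p in range(8)}
tlb = TLB(page_table, capacity=2)
for p in [0, 1, 0, 2, 1]:
    tlb.lookup(p)
print(tlb.hits, tlb.misses)   # 1 4
```

With only two entries, the access to page 2 evicts page 1, so the final access to page 1 misses again. A larger TLB or a reference pattern with more locality raises the hit count.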

TLB Miss in Hardware or Software
- Hardware-managed TLB (x86)
  - TLB handles misses in hardware
  - Hardware defines the page table format and uses the page table register to locate the page table in physical memory
  - TLB replacement policy is fixed by hardware
- Software-managed TLB (MIPS, SPARC, HP PA)
  - Hardware generates a trap called a TLB miss fault
  - OS handles the TLB miss, similar to exception handling
    - OS figures out the correct page table entry and adds it to the TLB
    - CPU has instructions for modifying the TLB
  - Page tables become entirely an OS data structure
    - H/w doesn't have a page table register
  - TLB replacement policy is managed in software
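
The software-managed case can be sketched as a trap plus a handler. Everything named here (Hardware, OS, TLBMissFault, tlb_write, access) is hypothetical, an exception stands in for the hardware trap, and random replacement is just one policy an OS could choose.

```python
import random


class TLBMissFault(Exception):
    """Stand-in for the trap raised by hardware on a TLB miss."""

    def __init__(self, page):
        super().__init__(f"TLB miss on page {page:#x}")
        self.page = page


class Hardware:
    """Knows only about the TLB; it has no page table register."""

    def __init__(self, nr_entries=4):
        self.tlb = [None] * nr_entries   # each slot: (page, frame) or None

    def translate(self, vaddr):
        page, offset = vaddr >> 12, vaddr & 0xFFF
        for slot in self.tlb:
            if slot is not None and slot[0] == page:
                return (slot[1] << 12) | offset
        raise TLBMissFault(page)

    def tlb_write(self, index, page, frame):
        """Stand-in for a privileged instruction that modifies the TLB."""
        self.tlb[index] = (page, frame)


class OS:
    def __init__(self, hw, page_table):
        self.hw = hw
        self.page_table = page_table     # any format the OS likes; a dict here

    def handle_tlb_miss(self, fault):
        frame = self.page_table[fault.page]            # find the correct entry
        victim = random.randrange(len(self.hw.tlb))    # policy chosen by the OS
        self.hw.tlb_write(victim, fault.page, frame)


def access(hw, os_, vaddr):
    try:
        return hw.translate(vaddr)
    except TLBMissFault as fault:
        os_.handle_tlb_miss(fault)
        return hw.translate(vaddr)       # retry the faulting access


hw = Hardware()
os_ = OS(hw, {0x1005: 0xE})
print(hex(access(hw, os_, 0x01005A6F)))   # 0xea6f
```

Because the hardware never reads the page table, the OS is free to replace the dict with any structure without changing the Hardware class.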

TLB Cache Invalidate
- TLB is a cache of page table entries, so it needs to be kept consistent with the page table
- Whenever the OS modifies a page table entry, it needs to invalidate the corresponding TLB entry
- On a context switch to another address space:
  - TLB entries must be invalidated to prevent use of mappings of the last address space
  - This is what really adds to the cost of context switching
    - Why?
    - We need a way to reduce this cost

TLB Cache Invalidate Options
- Clear TLB
  - Empty the TLB by clearing the valid bit of all entries
  - Next thread will generate misses initially
  - Eventually, it caches enough of its own entries in the TLB
- Tagged TLB
  - Hardware maintains the current process id in a specific register
    - OS updates the register on context switch
  - Hardware maintains an id tag with each TLB entry
  - On TLB fill, the TLB tag is assigned the current process id
  - On TLB lookup, a TLB hit occurs only when the TLB tag matches the current process id
    - Enables space multiplexing of cache entries
    - Reduces the need for invalidation significantly
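
The two options can be compared by counting misses over a schedule that alternates between two processes. This is a simplified sketch: the class and function names are invented for the example, and both TLBs are given unlimited capacity so that only the effect of context switches is visible.

```python
class FlushingTLB:
    """Option 1: clear every entry on a context switch."""

    def __init__(self):
        self.entries = {}                 # page -> frame

    def context_switch(self, pid):
        self.entries.clear()

    def lookup(self, page):
        return self.entries.get(page)

    def fill(self, page, frame):
        self.entries[page] = frame


class TaggedTLB:
    """Option 2: tag each entry with the id of the process that filled it."""

    def __init__(self):
        self.current_pid = None           # the "current process id" register
        self.entries = {}                 # (pid tag, page) -> frame

    def context_switch(self, pid):
        self.current_pid = pid            # OS updates the register, no flush

    def lookup(self, page):
        return self.entries.get((self.current_pid, page))

    def fill(self, page, frame):
        self.entries[(self.current_pid, page)] = frame


def count_misses(tlb, page_tables, schedule):
    """Run (pid, page) accesses and return the number of TLB misses."""
    misses, running = 0, None
    for pid, page in schedule:
        if pid != running:                # context switch to another process
            tlb.context_switch(pid)
            running = pid
        frame = tlb.lookup(page)
        if frame is None:
            misses += 1
            frame = page_tables[pid][page]
            tlb.fill(page, frame)
        assert frame == page_tables[pid][page]   # never use a stale mapping
    return misses


page_tables = {1: {0: 10}, 2: {0: 20}}    # same page, different frames
schedule = [(1, 0), (1, 0), (2, 0), (2, 0), (1, 0), (2, 0)]
print(count_misses(FlushingTLB(), page_tables, schedule))   # 4
print(count_misses(TaggedTLB(), page_tables, schedule))     # 2
```

The flushing TLB misses after every switch, while the tagged TLB only misses the first time each process touches the page. In both cases no process ever sees the other's frame.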

Memory Protection
- Each memory region may need different protection
  - Text region: read, execute, no write
  - Stack and data regions: read, write, no execute
- We can use the MMU to implement page-level protection
  - Page table entry has protection bits (e.g., page is writable)
- Page protection is enforced during address translation
  - E.g., generate a protection fault when a read-only page is written
- [Figure: page table entry layout (bits 31..0): PFN | unused | R D C W V, with the protection bits marked]

Speeding Up Protection Enforcement
- Checking the page table on each memory access is slow
  - TLB caches page-level protection bits
- TLB checks whether memory accesses are valid on each memory access
  - If a memory access is inconsistent with the protection bits, TLB generates a protection fault
  - OS needs to invalidate the TLB entry when page protection is changed

Software Managed TLB Faults
- When a TLB fault occurs, it can be:
  - TLB miss fault: no matching page number was found
    - Read fault: a read was attempted but no entry matched
    - Write fault: a write was attempted but no entry matched
  - TLB protection fault: a matching page number was found, but protection bits are inconsistent with the operation
    - Read-only fault: a write to a word in a page was attempted but the write-bit was not set (page is marked read-only)
    - No-execute fault: the execution of an instruction in a page was attempted but the execute-bit was not set (page is marked non-executable)
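
The four fault types can be told apart with a short check at lookup time. The sketch below is illustrative only: the names (Access, TLBEntry, TLBFault, tlb_access, fault_kind) are hypothetical, the TLB is a plain dict, and an instruction fetch that misses is reported as a read miss, which is an assumption since the slide lists only read and write miss faults.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"
    EXECUTE = "execute"


class TLBFault(Exception):
    def __init__(self, kind, page):
        super().__init__(f"{kind} fault on page {page:#x}")
        self.kind = kind
        self.page = page


@dataclass
class TLBEntry:
    frame: int
    writable: bool
    executable: bool


def tlb_access(tlb, vaddr, access):
    """Translate vaddr, raising the matching TLB fault if the access is not allowed."""
    page, offset = vaddr >> 12, vaddr & 0xFFF
    entry = tlb.get(page)
    if entry is None:                                    # TLB miss fault
        kind = "write miss" if access is Access.WRITE else "read miss"
        raise TLBFault(kind, page)
    if access is Access.WRITE and not entry.writable:    # TLB protection fault
        raise TLBFault("read-only", page)
    if access is Access.EXECUTE and not entry.executable:
        raise TLBFault("no-execute", page)
    return (entry.frame << 12) | offset


def fault_kind(tlb, vaddr, access):
    """Helper for the demo: name of the fault raised, or None on success."""
    try:
        tlb_access(tlb, vaddr, access)
    except TLBFault as fault:
        return fault.kind
    return None


tlb = {
    0x1: TLBEntry(frame=0x7, writable=False, executable=True),   # text page
    0x2: TLBEntry(frame=0x8, writable=True, executable=False),   # data page
}

print(hex(tlb_access(tlb, 0x1ABC, Access.READ)))   # 0x7abc
print(fault_kind(tlb, 0x1ABC, Access.WRITE))       # read-only
print(fault_kind(tlb, 0x2000, Access.EXECUTE))     # no-execute
print(fault_kind(tlb, 0x3000, Access.READ))        # read miss
print(fault_kind(tlb, 0x3000, Access.WRITE))       # write miss
```

An OS with a software-managed TLB would dispatch on the fault kind: refill the TLB on a miss, and check its own record of the page's permissions on a protection fault.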

Summary
- Paging MMU adds significant performance overhead because each memory access requires additional memory accesses to the page table
- TLB is a cache of page table entries
  - Indexed by page number, returns frame number
- TLB miss handling can be performed in h/w or s/w
  - When performed in software, hardware only knows about the TLB, not the page table
- OS needs to invalidate a TLB entry when it modifies the corresponding page table entry
- Paging MMU can be used to provide page-level memory protection

Think Time
- What is the purpose of a TLB?
- What is a TLB-miss fault?
- What are the benefits of using a hardware managed TLB?
- What are the benefits of a software managed TLB?
- What is a protection fault?