CS 162 Operating Systems and Systems Programming Midterm


CS 162 Operating Systems and Systems Programming
Midterm Review
March 7, 2011
Ion Stoica
http://inst.eecs.berkeley.edu/~cs162

Synchronization, Critical section
3/7 Ion Stoica CS 162 ©UCB Spring 2011 Midterm Review

Definitions
• Synchronization: using atomic operations to ensure cooperation between threads
• Mutual Exclusion: ensuring that only one thread does a particular thing at a time
  – One thread excludes the other while doing its task
• Critical Section: piece of code that only one thread can execute at once
  – Critical section is the result of mutual exclusion
  – Critical section and mutual exclusion are two ways of describing the same thing

Locks: using interrupts
• Key idea: maintain a lock variable and impose mutual exclusion only during operations on that variable

    int value = FREE;

    Acquire() {
      disable interrupts;
      if (value == BUSY) {
        put thread on wait queue;
        Go to sleep();
        // Enable interrupts?
      } else {
        value = BUSY;
      }
      enable interrupts;
    }

    Release() {
      disable interrupts;
      if (anyone on wait queue) {
        take thread off wait queue
        Place on ready queue;
      } else {
        value = FREE;
      }
      enable interrupts;
    }

Better locks: using test&set

    test&set (&address) {       /* most architectures */
      result = M[address];
      M[address] = 1;
      return result;
    }

    int guard = 0;
    int value = FREE;

    Acquire() {
      // Short busy-wait time
      while (test&set(guard));
      if (value == BUSY) {
        put thread on wait queue;
        go to sleep() & guard = 0;
      } else {
        value = BUSY;
        guard = 0;
      }
    }

    Release() {
      // Short busy-wait time
      while (test&set(guard));
      if anyone on wait queue {
        take thread off wait queue
        Place on ready queue;
      } else {
        value = FREE;
      }
      guard = 0;
    }

Semaphores
• Semaphores are a kind of generalized lock
  – First defined by Dijkstra in the late 60s
  – Main synchronization primitive used in original UNIX
• Definition: a Semaphore has a non-negative integer value and supports the following two operations:
  – P(): an atomic operation that waits for semaphore to become positive, then decrements it by 1
    » Think of this as the wait() operation
  – V(): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any
    » Think of this as the signal() operation
  – Note that P() stands for “proberen” (to test) and V() stands for “verhogen” (to increment) in Dutch

Semaphores Like Integers Except
• Semaphores are like integers, except
  – No negative values
  – Only operations allowed are P and V: can’t read or write value, except to set it initially
  – Operations must be atomic
    » Two P’s together can’t decrement value below zero
    » Similarly, thread going to sleep in P won’t miss wakeup from V, even if they both happen at same time
• Semaphore from railway analogy
  – Here is a semaphore initialized to 2 for resource control:
[Figure: railway semaphore shown in three states, labeled Value=2, Value=1, and Value=0]
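The resource-control use of a semaphore can be tried out with Python's standard `threading.Semaphore`, whose `acquire()` and `release()` play the roles of P and V. This is only a sketch: the wrapper names `P` and `V` and the variable `slots` are introduced here for illustration, and the non-blocking `acquire` is used purely so the example can observe that the value has reached zero without putting the thread to sleep.

```python
import threading

# A semaphore initialized to 2: at most two holders of the resource at a time.
slots = threading.Semaphore(2)

def P(sem, blocking=True):
    """P / wait: decrement the semaphore, sleeping while its value is zero."""
    return sem.acquire(blocking=blocking)

def V(sem):
    """V / signal: increment the semaphore, waking one waiting P if any."""
    sem.release()

first = P(slots)                    # value 2 -> 1
second = P(slots)                   # value 1 -> 0
third = P(slots, blocking=False)    # value is 0, so a blocking P would sleep here
V(slots)                            # value 0 -> 1
fourth = P(slots, blocking=False)   # succeeds again, value 1 -> 0
```

The value itself is never read directly; the only way to learn anything about it is through the outcome of P, which matches the "only operations allowed are P and V" rule.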

Condition Variables
• Condition Variable: a queue of threads waiting for something inside a critical section
  – Key idea: allow sleeping inside critical section by atomically releasing lock at time we go to sleep
  – Contrast to semaphores: Can’t wait inside critical section
• Operations:
  – Wait(&lock): Atomically release lock and go to sleep. Reacquire lock later, before returning.
  – Signal(): Wake up one waiter, if any
  – Broadcast(): Wake up all waiters
• Rule: Must hold lock when doing condition variable ops!

Complete Monitor Example (with condition variable)
• Here is an (infinite) synchronized queue

    Lock lock;
    Condition dataready;
    Queue queue;

    AddToQueue(item) {
      lock.Acquire();               // Get Lock
      queue.enqueue(item);          // Add item
      dataready.signal();           // Signal any waiters
      lock.Release();               // Release Lock
    }

    RemoveFromQueue() {
      lock.Acquire();               // Get Lock
      while (queue.isEmpty()) {
        dataready.wait(&lock);      // If nothing, sleep
      }
      item = queue.dequeue();       // Get next item
      lock.Release();               // Release Lock
      return(item);
    }
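For readers who want to run the monitor, here is a minimal sketch of the same synchronized queue using Python's `threading.Condition`, which bundles the lock and the condition variable. The class and method names are illustrative choices, not part of the slides. Python's condition variables have Mesa-style semantics, so the wait sits inside a `while` loop.

```python
import threading
from collections import deque

class SynchronizedQueue:
    """Unbounded queue guarded by one lock and one condition variable."""

    def __init__(self):
        self.dataready = threading.Condition()   # owns the monitor lock
        self.queue = deque()

    def add_to_queue(self, item):
        with self.dataready:             # get lock
            self.queue.append(item)      # add item
            self.dataready.notify()      # signal one waiter, if any
        # lock released on leaving the with-block

    def remove_from_queue(self):
        with self.dataready:             # get lock
            while not self.queue:        # re-check the condition after every wakeup
                self.dataready.wait()    # atomically release lock and sleep
            return self.queue.popleft()  # get next item
```

A consumer that calls `remove_from_queue()` on an empty queue sleeps until some producer calls `add_to_queue()`.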

Mesa vs. Hoare monitors
• Need to be careful about precise definition of signal and wait. Consider a piece of our dequeue code:

    while (queue.isEmpty()) {
      dataready.wait(&lock);    // If nothing, sleep
    }
    item = queue.dequeue();     // Get next item

  – Why didn’t we do this?

    if (queue.isEmpty()) {
      dataready.wait(&lock);    // If nothing, sleep
    }
    item = queue.dequeue();     // Get next item

• Answer: depends on the type of scheduling
  – Hoare-style (most textbooks):
    » Signaler gives lock, CPU to waiter; waiter runs immediately
    » Waiter gives up lock, processor back to signaler when it exits critical section or if it waits again
  – Mesa-style (most real operating systems):
    » Signaler keeps lock and processor
    » Waiter placed on ready queue with no special priority
    » Practically, need to check condition again after wait

Read/Writer Revisited

    Reader() {
      // check into system
      lock.Acquire();
      while ((AW + WW) > 0) {
        WR++;
        okToRead.wait(&lock);
        WR--;
      }
      AR++;
      lock.release();

      // read-only access
      AccessDbase(ReadOnly);

      // check out of system
      lock.Acquire();
      AR--;
      if (AR == 0 && WW > 0)
        okToWrite.signal();
      lock.Release();
    }

    Writer() {
      // check into system
      lock.Acquire();
      while ((AW + AR) > 0) {
        WW++;
        okToWrite.wait(&lock);
        WW--;
      }
      AW++;
      lock.release();

      // read/write access
      AccessDbase(ReadWrite);

      // check out of system
      lock.Acquire();
      AW--;
      if (WW > 0) {
        okToWrite.signal();
      } else if (WR > 0) {
        okToRead.broadcast();
      }
      lock.Release();
    }

• What if we remove this line? [callout pointing at one line of the code above]
• What if we turn signal to broadcast? In the Reader's check-out:

    AR--;
    if (AR == 0 && WW > 0)
      okToWrite.broadcast();

• What if we turn okToWrite and okToRead into okContinue? Readers and writers then wait on the same condition variable:

    // Reader check-in                  // Writer check-in
    while ((AW + WW) > 0) {             while ((AW + AR) > 0) {
      WR++;                               WW++;
      okContinue.wait(&lock);             okContinue.wait(&lock);
      WR--;                               WW--;
    }                                   }

    // Reader check-out                 // Writer check-out
    AR--;                               AW--;
    if (AR == 0 && WW > 0)              if (WW > 0) {
      okContinue.signal();                okContinue.signal();
                                        } else if (WR > 0) {
                                          okContinue.broadcast();
                                        }

  – Consider this sequence:
    » R1 arrives
    » W1, R2 arrive while R1 reads
    » R1 signals R2
• Need to change to broadcast! Why?

    // Reader check-out
    AR--;
    if (AR == 0 && WW > 0)
      okContinue.broadcast();

Deadlock

Four requirements for Deadlock
• Mutual exclusion
  – Only one thread at a time can use a resource.
• Hold and wait
  – Thread holding at least one resource is waiting to acquire additional resources held by other threads
• No preemption
  – Resources are released only voluntarily by the thread holding the resource, after thread is finished with it
• Circular wait
  – There exists a set {T1, …, Tn} of waiting threads
    » T1 is waiting for a resource that is held by T2
    » T2 is waiting for a resource that is held by T3
    » …
    » Tn is waiting for a resource that is held by T1

Resource Allocation Graph Examples
• Recall:
  – request edge: directed edge Ti → Rj
  – assignment edge: directed edge Rj → Ti
[Figure: three example graphs over threads T1–T4 and resources R1–R4, titled "Simple Resource Allocation Graph", "Allocation Graph With Deadlock", and "Allocation Graph With Cycle, but No Deadlock"]

Deadlock Detection Algorithm
• Only one of each type of resource ⇒ look for loops
• More General Deadlock Detection Algorithm
  – Let [X] represent an m-ary vector of non-negative integers (quantities of resources of each type):

    [FreeResources]:  Current free resources each type
    [RequestX]:       Current requests from thread X
    [AllocX]:         Current resources held by thread X

  – See if tasks can eventually terminate on their own

    [Avail] = [FreeResources]
    Add all nodes to UNFINISHED
    do {
      done = true
      Foreach node in UNFINISHED {
        if ([Request_node] <= [Avail]) {
          remove node from UNFINISHED
          [Avail] = [Avail] + [Alloc_node]
          done = false
        }
      }
    } until(done)

  – Nodes left in UNFINISHED ⇒ deadlocked
[Figure: example resource allocation graph with resources R1, R2 and threads including T2, T3, T4]
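The detection loop above can be sketched directly in Python. The function name `find_deadlocked` and the dictionary-of-lists representation of the [Request] and [Alloc] vectors are assumptions made for this example; the two scenarios at the end are made-up data, not taken from the slides.

```python
def find_deadlocked(free, request, alloc):
    """Return the set of threads that can never finish.

    free    -- list: free units of each resource type
    request -- dict: thread -> list of units it is currently requesting
    alloc   -- dict: thread -> list of units it currently holds
    """
    avail = list(free)
    unfinished = set(request)
    done = False
    while not done:
        done = True
        for node in sorted(unfinished):          # sorted() copies, so removal is safe
            if all(r <= a for r, a in zip(request[node], avail)):
                unfinished.remove(node)           # node can run to completion...
                avail = [a + h for a, h in zip(avail, alloc[node])]  # ...and frees what it holds
                done = False
    return unfinished

# Two resource types, one unit each: T1 holds R1 and wants R2, T2 holds R2 and wants R1.
cycle = find_deadlocked(
    free=[0, 0],
    request={"T1": [0, 1], "T2": [1, 0]},
    alloc={"T1": [1, 0], "T2": [0, 1]},
)

# Same cycle, but R2 has a second unit held by T3, which requests nothing more.
no_deadlock = find_deadlocked(
    free=[0, 0],
    request={"T1": [0, 1], "T2": [1, 0], "T3": [0, 0]},
    alloc={"T1": [1, 0], "T2": [0, 1], "T3": [0, 1]},
)
```

The second scenario is a cycle without deadlock: T3 finishes, releases a unit of R2, and the other two threads can then finish in turn.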

Banker’s Algorithm for Preventing Deadlock
• Toward right idea:
  – State maximum resource needs in advance
  – Allow particular thread to proceed if: (available resources - #requested) ≥ max remaining that might be needed by any thread
• Banker’s algorithm (less conservative):
  – Allocate resources dynamically
    » Evaluate each request and grant if some ordering of threads is still deadlock free afterward
    » Technique: pretend each request is granted, then run deadlock detection algorithm, substituting ([Max_node] - [Alloc_node] ≤ [Avail]) for ([Request_node] ≤ [Avail]). Grant request if result is deadlock free (conservative!)
    » Keeps system in a “SAFE” state, i.e. there exists a sequence {T1, T2, … Tn} with T1 requesting all remaining resources, finishing, then T2 requesting all remaining resources, etc.
  – Algorithm allows the sum of maximum resource needs of all current threads to be greater than total resources
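The "pretend the request is granted, then run detection with [Max] - [Alloc]" technique can be sketched as follows. The function names `is_safe` and `try_grant`, the data layout, and the 12-unit example are assumptions chosen for illustration.

```python
def is_safe(avail, max_need, alloc):
    """True if some ordering lets every thread obtain its maximum and finish."""
    avail = list(avail)
    unfinished = set(max_need)
    progress = True
    while progress:
        progress = False
        for t in sorted(unfinished):
            remaining = [m - h for m, h in zip(max_need[t], alloc[t])]
            if all(r <= a for r, a in zip(remaining, avail)):
                unfinished.remove(t)                                  # t can finish
                avail = [a + h for a, h in zip(avail, alloc[t])]      # and returns its resources
                progress = True
    return not unfinished

def try_grant(thread, request, avail, max_need, alloc):
    """Pretend the request is granted; approve only if the result is safe."""
    if any(r > a for r, a in zip(request, avail)):
        return False                       # not enough free resources right now
    new_avail = [a - r for a, r in zip(avail, request)]
    new_alloc = dict(alloc)
    new_alloc[thread] = [h + r for h, r in zip(alloc[thread], request)]
    return is_safe(new_avail, max_need, new_alloc)

# One resource type with 12 units in total; 9 are allocated, 3 are free.
max_need = {"A": [10], "B": [4], "C": [9]}
alloc = {"A": [5], "B": [2], "C": [2]}
avail = [3]

safe_now = is_safe(avail, max_need, alloc)
grant_b = try_grant("B", [1], avail, max_need, alloc)
grant_c = try_grant("C", [1], avail, max_need, alloc)
```

In this example the maximum needs sum to 23 units against a total of 12, yet the starting state is safe (B, then A, then C can finish). Giving one more unit to B keeps the state safe; giving it to C leaves only B able to finish, so that request is refused even though the unit is free.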

Memory Multiplexing, Address Translation

Important Aspects of Memory Multiplexing
• Controlled overlap:
  – Processes should not collide in physical memory
  – Conversely, would like the ability to share memory when desired (for communication)
• Protection:
  – Prevent access to private memory of other processes
    » Different pages of memory can be given special behavior (Read Only, Invisible to user programs, etc.)
    » Kernel data protected from User programs
    » Programs protected from themselves
• Translation:
  – Ability to translate accesses from one address space (virtual) to a different one (physical)
  – When translation exists, processor uses virtual addresses, physical memory uses physical addresses
  – Side effects:
    » Can be used to avoid overlap
    » Can be used to give uniform view of memory to programs

Why Address Translation?
[Figure: Prog 1 and Prog 2 each have their own virtual address space (code, data, heap, stack). Translation Map 1 and Translation Map 2 place Code 1, Data 1, Heap 1, Stack 1 and Code 2, Data 2, Heap 2, Stack 2 at separate locations in the physical address space, alongside OS code, OS data, and OS heap & stacks]

Addr. Translation: Segmentation vs. Paging
[Figure, segmentation: the virtual address is split into Seg # and Offset. Seg # selects one of the entries Base0/Limit0 … Base7/Limit7, each marked valid (V) or not (N). The offset is compared against the limit (> gives Error) and added to the base to form the physical address]
[Figure, paging: the virtual address is split into Page # and Offset. PageTablePtr locates the page table, whose entries (page #0 … page #5) carry permission bits such as V, R, W or N. The entry supplies the Physical Page #, which is joined with the offset to form the physical address; Check Perm can raise Access Error]

Review: Address Segmentation
[Figure: the virtual memory view (code, data, heap, stack, with the stack between 1111 0000 and 1111 1111 and the other regions starting at 0000 0000, 0100 0000, and 1000 0000) is mapped through a segment table with columns Seg #, base, limit to the physical memory view, where code, data, heap, and stack occupy separate contiguous regions. The virtual address is split into seg # and offset]

Review: Address Segmentation
• What happens if stack grows to 1110 0000?
[Figure: same virtual memory view, segment table (Seg #, base, limit), and physical memory view as on the previous slide, with the stack extended down to 1110 0000 in the virtual view]

Review: Address Segmentation
• No room to grow!! Buffer overflow error
[Figure: same virtual memory view, segment table (Seg #, base, limit), and physical memory view; in physical memory the region next to the stack segment is already occupied, so the segment cannot be extended in place]
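A base-and-limit lookup of the kind shown in these figures can be sketched in a few lines. The address format (8-bit addresses, 2-bit segment number, 6-bit offset) follows the slides' figures, but the concrete base and limit values in `SEGMENTS` are illustrative assumptions, since the slide's table cannot be read reliably from the extracted text. The names `translate` and `SEGMENTS` are likewise made up for this example.

```python
SEG_BITS, OFFSET_BITS = 2, 6

# seg # -> (base, limit). Values are assumptions for illustration only.
SEGMENTS = {
    0b00: (0b0001_0000, 0b10_0000),   # code
    0b01: (0b0101_0000, 0b10_0000),   # data
    0b10: (0b0111_0000, 0b01_1000),   # heap
    0b11: (0b1011_0000, 0b01_0000),   # stack
}

def translate(vaddr):
    """Translate an 8-bit virtual address with base-and-limit segmentation."""
    seg = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    base, limit = SEGMENTS[seg]
    if offset >= limit:                # offset must stay inside the segment
        raise MemoryError(f"offset {offset:#b} beyond limit of segment {seg:02b}")
    return base + offset

data_paddr = translate(0b0100_0100)    # segment 01, offset 00 0100

try:
    translate(0b1001_1000)             # segment 10, offset equal to its limit
    heap_overflow = False
except MemoryError:
    heap_overflow = True
```

Because each segment must be contiguous in physical memory, growing a segment past its limit means finding free space right after it or moving the whole segment, which is the problem the "No room to grow" slide illustrates.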

Review: Paging
• Virtual address: page # (5 bits), offset (3 bits)
• Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack from 1111 0000 up to 1111 1111
• Page table (virtual page # -> physical page #):

    11111 -> 11101
    11110 -> 11100
    11101 … 10011 -> null
    10010 -> 10000
    10001 -> 01111
    10000 -> 01110
    01111 … 01100 -> null
    01011 -> 01101
    01010 -> 01100
    01001 -> 01011
    01000 -> 01010
    00111 … 00100 -> null
    00011 -> 00101
    00010 -> 00100
    00001 -> 00011
    00000 -> 00010

• Physical memory view: code at 0001 0000, data at 0101 0000, heap at 0111 0000, stack at 1110 0000

Review: Paging
• What happens if stack grows to 1110 0000?
• Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack now from 1110 0000 up to 1111 1111
• Page table (virtual page # -> physical page #), unchanged so far:

    11111 -> 11101
    11110 -> 11100
    11101 … 10011 -> null
    10010 -> 10000
    10001 -> 01111
    10000 -> 01110
    01111 … 01100 -> null
    01011 -> 01101
    01010 -> 01100
    01001 -> 01011
    01000 -> 01010
    00111 … 00100 -> null
    00011 -> 00101
    00010 -> 00100
    00001 -> 00011
    00000 -> 00010

• Physical memory view: code at 0001 0000, data at 0101 0000, heap at 0111 0000, stack at 1110 0000

Review: Paging
• Allocate new pages where there is room!
• Page table (virtual page # -> physical page #), with two new stack pages:

    11111 -> 11101
    11110 -> 11100
    11101 -> 10111
    11100 -> 10110
    11011 … 10011 -> null
    10010 -> 10000
    10001 -> 01111
    10000 -> 01110
    01111 … 01100 -> null
    01011 -> 01101
    01010 -> 01100
    01001 -> 01011
    01000 -> 01010
    00111 … 00100 -> null
    00011 -> 00101
    00010 -> 00100
    00001 -> 00011
    00000 -> 00010

• Physical memory view: code at 0001 0000, data at 0101 0000, heap at 0111 0000, stack at 1110 0000, plus the new stack pages placed in free physical pages elsewhere (they need not be adjacent to the existing stack pages)
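The single-level lookup and the "allocate new pages where there is room" step can be sketched as below. The table entries are the mapping read from the slide's page table as reconstructed from the flattened text, so treat them as an assumption; the names `PAGE_TABLE`, `PageFault`, and `translate` are introduced for this example.

```python
OFFSET_BITS = 3   # 8-bit addresses: 5-bit page number, 3-bit offset

# virtual page # -> physical page # (entries absent from the dict are "null")
PAGE_TABLE = {
    0b11111: 0b11101, 0b11110: 0b11100,                                       # stack
    0b10010: 0b10000, 0b10001: 0b01111, 0b10000: 0b01110,                     # heap
    0b01011: 0b01101, 0b01010: 0b01100, 0b01001: 0b01011, 0b01000: 0b01010,   # data
    0b00011: 0b00101, 0b00010: 0b00100, 0b00001: 0b00011, 0b00000: 0b00010,   # code
}

class PageFault(Exception):
    """Raised when the virtual page has no mapping."""

def translate(vaddr, table):
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if vpn not in table:
        raise PageFault(f"no mapping for virtual page {vpn:05b}")
    return (table[vpn] << OFFSET_BITS) | offset

table = dict(PAGE_TABLE)
stack_paddr = translate(0b1111_0000, table)

# The stack grows down to 1110 0000: the new virtual pages are unmapped at first.
try:
    translate(0b1110_1000, table)
    faulted = False
except PageFault:
    faulted = True

# Map the two new stack pages to free physical pages, wherever they happen to be.
table[0b11101] = 0b10111
table[0b11100] = 0b10110
grown_paddr = translate(0b1110_1000, table)
```

Unlike segmentation, the new physical pages do not need to sit next to the existing stack pages, so growth never requires moving data.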

Review: Two-Level Paging
• Virtual address: page1 # (3 bits), page2 # (2 bits), offset (3 bits)
• Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack from 1110 0000 up to 1111 1111
• Page Table (level 1), indexed by page1 #:

    111 -> level-2 table for stack
    110 -> null
    101 -> null
    100 -> level-2 table for heap
    011 -> null
    010 -> level-2 table for data
    001 -> null
    000 -> level-2 table for code

• Page Tables (level 2), indexed by page2 #:

    stack:  11 -> 11101   10 -> 11100   01 -> 10111   00 -> 10110
    heap:   11 -> null    10 -> 10000   01 -> 01111   00 -> 01110
    data:   11 -> 01101   10 -> 01100   01 -> 01011   00 -> 01010
    code:   11 -> 00101   10 -> 00100   01 -> 00011   00 -> 00010

• Physical memory view: code at 0001 0000, data at 0101 0000, heap at 0111 0000, stack at 1110 0000

Review: Two-Level Paging
• Example: virtual address 1001 0000 maps to physical address 1000 0000
  – page1 # 100 selects the heap's level-2 table
  – page2 # 10 selects physical page 10000
  – offset 000 is carried over unchanged
• Page Table (level 1): 111 -> stack table, 100 -> heap table, 010 -> data table, 000 -> code table, all other entries null
• Page Tables (level 2), indexed by page2 #:

    stack:  11 -> 11101   10 -> 11100   01 -> 10111   00 -> 10110
    heap:   11 -> null    10 -> 10000   01 -> 01111   00 -> 01110
    data:   11 -> 01101   10 -> 01100   01 -> 01011   00 -> 01010
    code:   11 -> 00101   10 -> 00100   01 -> 00011   00 -> 00010
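The two-level walk for the 1001 0000 example can be sketched as follows. The nested-dictionary layout and the names `LEVEL1` and `translate` are assumptions for illustration; the entries follow the slide's tables as reconstructed from the flattened text.

```python
P2_BITS, OFFSET_BITS = 2, 3   # 8-bit address: 3-bit page1 #, 2-bit page2 #, 3-bit offset

# Level-1 table: page1 # -> level-2 table (page2 # -> physical page #).
# Missing keys stand for null entries.
LEVEL1 = {
    0b111: {0b11: 0b11101, 0b10: 0b11100, 0b01: 0b10111, 0b00: 0b10110},  # stack
    0b100: {0b10: 0b10000, 0b01: 0b01111, 0b00: 0b01110},                 # heap
    0b010: {0b11: 0b01101, 0b10: 0b01100, 0b01: 0b01011, 0b00: 0b01010},  # data
    0b000: {0b11: 0b00101, 0b10: 0b00100, 0b01: 0b00011, 0b00: 0b00010},  # code
}

def translate(vaddr):
    """Walk both levels; raise LookupError where either level holds null."""
    p1 = vaddr >> (P2_BITS + OFFSET_BITS)
    p2 = (vaddr >> OFFSET_BITS) & ((1 << P2_BITS) - 1)
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    level2 = LEVEL1.get(p1)
    if level2 is None or p2 not in level2:
        raise LookupError(f"unmapped virtual address {vaddr:08b}")
    return (level2[p2] << OFFSET_BITS) | offset

heap_paddr = translate(0b1001_0000)   # page1 # 100, page2 # 10, offset 000
```

Only four level-2 tables exist here, so the unused parts of the address space cost one null level-1 entry each instead of a run of null page-table entries.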

Review: Inverted Table
• Inverted Table: hash(virt. page #) = physical page #
• Virtual address: page # (5 bits), offset (3 bits)

    virt. page #    physical page #
    11111           11101
    11110           11100
    11101           10111
    11100           10110
    10010           10000
    10001           01111
    10000           01110
    01011           01101
    01010           01100
    01001           01011
    01000           01010
    00011           00101
    00010           00100
    00001           00011
    00000           00010

• Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack from 1110 0000 up to 1111 1111
• Physical memory view: code at 0001 0000, data at 0101 0000, heap at 0111 0000, stack at 1110 0000, plus the additional stack pages

Address Translation Comparison
• Segmentation
  – Advantages: fast context switching (segment mapping maintained by CPU)
  – Disadvantages: external fragmentation
• Paging (single-level page)
  – Advantages: no external fragmentation
  – Disadvantages: large size (table size ~ virtual memory); internal fragmentation
• Paged segmentation, two-level pages
  – Advantages: no external fragmentation; table size ~ memory used by program
  – Disadvantages: multiple memory references per page access; internal fragmentation
• Inverted Table
  – Disadvantages: hash function more complex

Caches, TLBs

Review: Sources of Cache Misses
• Compulsory (cold start): first reference to a block
  – “Cold” fact of life: not a whole lot you can do about it
  – Note: When running “billions” of instructions, Compulsory Misses are insignificant
• Capacity:
  – Cache cannot contain all blocks accessed by the program
  – Solution: increase cache size
• Conflict (collision):
  – Multiple memory locations mapped to same cache location
  – Solutions: increase cache size, or increase associativity
• Two others:
  – Coherence (Invalidation): other process (e.g., I/O) updates memory
  – Policy: Due to non-optimal replacement policy

Direct Mapped Cache
• Cache index selects a cache block
• “Byte select” selects byte within cache block
  – Example: Block Size=32 B blocks
• Cache tag fully identifies the cached data
• Data with the same cache index (but different cache tags) shares the same cache entry
  – Conflict misses
[Figure: the 32-bit address (bit positions 31, 8, 4, 0 marked) is split into Cache Tag, Cache Index, and Byte Select (example: 0x01). The index selects one entry holding a Valid Bit, a Cache Tag, and a block of Cache Data (Byte 0 … Byte 31, Byte 32 … Byte 63, …). The stored tag is compared with the address tag to produce Hit]
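The tag / index / byte-select split and the resulting conflict misses can be sketched as follows. The 32-byte block size comes from the slide; the number of blocks (16) is an assumption for this example, and `split_address` and `DirectMappedCache` are illustrative names. Only tags are tracked, not data.

```python
def split_address(addr, block_size=32, num_blocks=16):
    """Split an address into (tag, index, byte_select).

    block_size and num_blocks must be powers of two.
    """
    offset_bits = block_size.bit_length() - 1
    index_bits = num_blocks.bit_length() - 1
    byte_select = addr & (block_size - 1)
    index = (addr >> offset_bits) & (num_blocks - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, byte_select

class DirectMappedCache:
    """Tracks only the tag stored in each entry; None means invalid."""

    def __init__(self, block_size=32, num_blocks=16):
        self.block_size = block_size
        self.num_blocks = num_blocks
        self.tags = [None] * num_blocks

    def access(self, addr):
        """Return True on a hit; on a miss, install the new block."""
        tag, index, _ = split_address(addr, self.block_size, self.num_blocks)
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit

cache = DirectMappedCache()
a = 0x040                 # index 2, tag 0
b = a + 32 * 16           # same index 2, tag 1
trace = [cache.access(a), cache.access(a + 4), cache.access(b), cache.access(a)]
```

The last access to `a` misses even though the cache is almost empty, because `b` evicted it from the one entry both addresses map to.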

Set Associative Cache
• N-way set associative: N entries per Cache Index
  – N direct mapped caches operate in parallel
• Example: Two-way set associative cache
  – Two tags in the set are compared to input in parallel
  – Data is selected based on the tag result
[Figure: the address (bit positions 31, 8, 4, 0 marked) is split into Cache Tag, Cache Index, and Byte Select. The index selects one entry (Valid, Cache Tag, Cache Data) in each of two banks; two comparators check the tags in parallel, their results are ORed to produce Hit and drive Sel0/Sel1 of a mux that selects the Cache Block]

Fully Associative Cache
• Fully Associative: Every block can hold any line
  – Address does not include a cache index
  – Compare Cache Tags of all Cache Entries in Parallel
• Example: Block Size=32 B blocks
  – We need N 27-bit comparators
  – Still have byte select to choose from within block
[Figure: the address is split into a Cache Tag (27 bits long, bits 31–5) and Byte Select (bits 4–0, example: 0x01). Every entry holds a Valid Bit, a Cache Tag, and Cache Data (Byte 0 … Byte 31, Byte 32 … Byte 63, …), and each entry has its own "=" comparator]

Where does a Block Get Placed in a Cache?
• Example: Block 12 placed in 8 block cache
  – 32-Block Address Space: block no. 0 … 31
  – Direct mapped: block 12 (01100) can go only into block 4 (12 mod 8); address splits into tag 01, index 100
  – Set associative (sets 0–3): block 12 can go anywhere in set 0 (12 mod 4); address splits into tag 011, index 00
  – Fully associative: block 12 can go anywhere; the whole block number 01100 is the tag
[Figure: three 8-block caches (block no. 0–7) with the legal positions for block 12 highlighted in each organization]
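The three placement rules are one formula with different associativity, which a short function makes explicit. `candidate_frames` is a hypothetical helper written for this example; it assumes the sets are laid out as consecutive groups of `ways` cache blocks.

```python
def candidate_frames(block, num_frames=8, ways=1):
    """Cache block numbers where memory block `block` may be placed.

    ways=1 is direct mapped, ways=num_frames is fully associative.
    """
    num_sets = num_frames // ways
    set_index = block % num_sets
    return list(range(set_index * ways, (set_index + 1) * ways))

direct = candidate_frames(12, ways=1)    # 12 mod 8 = 4
two_way = candidate_frames(12, ways=2)   # 12 mod 4 = set 0
fully = candidate_frames(12, ways=8)     # a single set holding every block
```

With more ways there are fewer sets, so fewer address bits act as the index and more of them end up in the tag.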

Review: Caching Applied to Address Translation
• Problem: address translation expensive (especially multi-level)
• Solution: cache address translation (TLB)
  – Instruction accesses spend a lot of time on the same page (since accesses sequential)
  – Stack accesses have definite locality of reference
  – Data accesses have less page locality, but still some…
[Figure: the CPU issues a Virtual Address to the TLB. If the translation is cached (Yes), the Physical Address goes straight to Physical Memory; if not (No), the MMU translates and the result is saved in the TLB. Data Read or Write passes between CPU and memory untranslated]
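The "cached? yes / no, translate, save result" flow can be sketched with a small dictionary-based TLB in front of a page table. Everything here is an assumption made for illustration: the class name `TLB`, the 4-entry capacity, the 12-bit offset, the LRU replacement policy, and the toy page table used in the usage lines.

```python
from collections import OrderedDict

class TLB:
    """Small fully-associative translation cache with LRU replacement (assumed policy)."""

    def __init__(self, page_table, capacity=4, offset_bits=12):
        self.page_table = page_table      # backing translation: vpn -> ppn
        self.capacity = capacity
        self.offset_bits = offset_bits
        self.entries = OrderedDict()      # cached vpn -> ppn, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> self.offset_bits
        offset = vaddr & ((1 << self.offset_bits) - 1)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)             # mark as most recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # slow path: walk, then save result
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)      # evict least recently used
        return (self.entries[vpn] << self.offset_bits) | offset

tlb = TLB(page_table={0: 7, 1: 3})
first = tlb.translate(0x0004)    # miss: page 0 is translated and cached
second = tlb.translate(0x0008)   # hit: same page as the previous access
third = tlb.translate(0x1000)    # miss: first touch of page 1
```

Two accesses to the same page cost one page-table lookup, which is the locality argument made in the bullets above.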

TLB organization
• How big does TLB actually have to be?
  – Usually small: 128-512 entries
  – Not very big, can support higher associativity
• TLB usually organized as fully-associative cache
  – Lookup is by Virtual Address
  – Returns Physical Address
• What happens when fully-associative is too slow?
  – Put a small (4-16 entry) direct-mapped cache in front
  – Called a “TLB Slice”
• When does TLB lookup occur?
  – Before cache lookup?
  – In parallel with cache lookup?

Reducing translation time further
• As described, TLB lookup is in serial with cache lookup:
[Figure: the Virtual Address is split into V page no. and a 10-bit offset. TLB Lookup on the V page no. yields a valid bit (V), Access Rights, and the physical address (PA); the P page no. joined with the unchanged 10-bit offset forms the Physical Address]
• Machines with TLBs go one step further: they overlap TLB lookup with cache access.
  – Works because offset available early

Overlapping TLB & Cache Access
• Here is how this might work with a 4K cache:
[Figure: the 32-bit virtual address is split into a 20-bit page # and a 12-bit offset (10-bit index plus 2-bit displacement, 00). The page # goes to an associative TLB lookup while the index selects one of the 1K 4-byte entries of the 4K cache. The PA from the TLB is compared (=) with the PA tag stored in the cache to produce Hit/Miss alongside the Data]
• What if cache size is increased to 8KB?
  – Overlap not complete
  – Need to do something else. See CS 152/252
• Another option: Virtual Caches
  – Tags in cache are virtual addresses
  – Translation only happens on cache misses
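The overlap works only while the cache's index and byte-displacement bits all come from the page offset, which translation leaves unchanged. A minimal check of that condition is sketched below, assuming 4 KB pages (12 offset bits, as in the figure). The `ways` parameter goes slightly beyond the slide: raising associativity is one standard way to keep the index inside the page offset, mentioned here only as an illustration of "something else". `can_overlap` is a hypothetical helper name.

```python
def can_overlap(cache_bytes, ways=1, page_bytes=4096):
    """True if the cache can be indexed using page-offset bits only.

    The index plus byte-displacement bits span cache_bytes / ways bytes;
    they must fit within one page for lookup to start before translation ends.
    """
    return cache_bytes // ways <= page_bytes

fits_4k = can_overlap(4 * 1024)                  # 10 index + 2 displacement bits = 12
fits_8k = can_overlap(8 * 1024)                  # would need 13 untranslated bits
fits_8k_two_way = can_overlap(8 * 1024, ways=2)  # each way spans 4 KB again
```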

Putting Everything Together

Paging & Address Translation
[Figure: the Virtual Address is split into Virtual P1 index, Virtual P2 index, and Offset. PageTablePtr locates the Page Table (1st level); the P1 index selects the Page Table (2nd level), whose entry at the P2 index gives the Physical Page #. Physical Page # and Offset form the Physical Address used to access Physical Memory]

Translation Look-aside Buffer
[Figure: same two-level translation as the previous slide (Virtual P1 index, Virtual P2 index, Offset; PageTablePtr; 1st- and 2nd-level page tables; Physical Page # and Offset forming the Physical Address), with a TLB added that maps the virtual page number directly to the Physical Page #]

Caching
[Figure: same translation path as the previous slide (Virtual P1 index, Virtual P2 index, Offset; PageTablePtr; 1st- and 2nd-level page tables; TLB), with the resulting Physical Address (Physical Page # and Offset) further split into tag, index, and byte fields to look up a cache of tag/block entries in front of Physical Memory]