Cache Memory Organization Direct Mapping Fully Associative Set





















Cache Memory Organization
• Direct Mapping
• Fully Associative
• Set Associative (very popular)
• Sector Mapping
CSCI 232 © 2005 JW Ryder

Some Tools
• M = 2^N - Address space size (usually physical)
• B = 2^b - Line size (words/block)
• S = 2^c - Cache size in blocks (2^(b+c) = Total # Words in cache)
Figure: a cache of S = 2^c block frames, each holding B = 2^b words
Example: b = 2, c = 2, S = 4, B = 4, M = 16

Direct Mapping Cache
Address format: Block Tag (N - c - b bits) | Block # in S (c bits) | Word # in block (b bits)
• Block j in MM (main memory) maps to block frame number j mod S
• Previous example: blocks 0, 4, 8, … map to block frame 0
• More than one MM block maps to each cache line; every S-th MM block maps to the same line
• To see which MM block is in a cache frame, use the (N - c - b) bits as a block tag

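Direct Continued
• N = 8, b = 2, c = 2
Figure: address 0101 10 01 split as tag = 0101 (N - c - b bits), block frame = 10 (c bits), word = 01 (b bits). The cache has S = 2^c block frames (00, 01, 10, 11), each holding B = 2^b words (00, 01, 10, 11).
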
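The address split on this slide can be sketched in a few lines of Python. The function name `split_direct` and its parameter names are assumptions made for illustration; only N, b, c and the example address come from the slide.

```python
def split_direct(addr, n_bits, c, b):
    """Split an N-bit word address into (tag, frame, word) for a direct-mapped cache.

    Hypothetical helper: the low b bits select the word, the middle c bits
    select the block frame, and the leading N - c - b bits form the block tag.
    """
    if not 0 <= addr < (1 << n_bits):
        raise ValueError("address does not fit in N bits")
    word = addr & ((1 << b) - 1)
    frame = (addr >> b) & ((1 << c) - 1)
    tag = addr >> (b + c)
    return tag, frame, word


# Slide example: N = 8, b = 2, c = 2, address 0101 10 01
tag, frame, word = split_direct(0b01011001, n_bits=8, c=2, b=2)
print(f"tag={tag:04b} frame={frame:02b} word={word:02b}")  # tag=0101 frame=10 word=01
```

The frame field is the same value as j mod S, where j = addr >> b is the MM block number, because S is a power of two.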
Operation
• Simultaneously
  – use the middle c bits in addr to look up the tag register value in the block frame
  – look up the word in cache line c
• Compare the (N - c - b) tag register bits with the tag bits of addr
  – 128 blocks? j mod 128 ==> MM blocks # 3, 131, 259, etc. all map to frame 3
• Tags match means hit, otherwise miss
  – on a miss, the accessed word is suppressed
  – victim pre-selected. No choice with direct mapped caches. Line indicated by c is the one that must be the victim

Hardware Needed
• Tag registers: S registers of (N - c - b) bits each
• 1 comparator to match tags
• Clean/Dirty bit and hardware per block frame
• No replacement hardware
• Associative hardware for tag matching not needed
• If control flip-flops between MM blocks k and k + n·S, where n ∈ Z, we have thrashing

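The thrashing case in the last bullet can be reproduced with a small simulation. This is a minimal sketch, assuming a tag-only model (no data, no clean/dirty bits); the class name `DirectMappedCache` and the chosen values of k and n are illustrative, not from the slides.

```python
class DirectMappedCache:
    """Tag-only model of a direct-mapped cache (hypothetical sketch)."""

    def __init__(self, c, b):
        self.c, self.b = c, b
        self.tags = [None] * (1 << c)  # one tag register per block frame
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        frame = (addr >> self.b) & ((1 << self.c) - 1)
        tag = addr >> (self.b + self.c)
        if self.tags[frame] == tag:
            self.hits += 1
            return True
        # Miss: the victim is forced, it is the frame selected by the c bits.
        self.tags[frame] = tag
        self.misses += 1
        return False


S, B = 4, 4                    # c = 2, b = 2
cache = DirectMappedCache(c=2, b=2)
k, n = 1, 3                    # assumed values for the demonstration
addr_k = k * B                 # first word of MM block k
addr_kn = (k + n * S) * B      # first word of MM block k + n*S
for _ in range(4):
    cache.access(addr_k)
    cache.access(addr_kn)
print(cache.hits, cache.misses)  # 0 8
```

Both blocks compete for frame 1, so every access evicts the block the next access needs, even though the other three frames stay empty.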
Fully Associative
Address format: Block tag (N - b bits) | Word Number (b bits)
• Block in MM can map to any cache frame
• Leading (N - b) bits in addr stored as block tag with each frame
• Comparison of block tags needs to be done for all frames at the same time. Tag regs searched associatively with (N - b) bits as MM key. Contents of matching block frame only accessed if hit
• Cache set up as associative storage (content addressable memory). Allows fast access on hit

Fully Continued
• Slower than direct mapped, no read ahead before match possible
• Victim - Any line, need to maintain history of each line

Hardware
• S comparators (N - b bits / comparator)
• Block status bits (usage, clean/dirty)
• We have victim choice
• Costliest of all cache designs
• Best cache utilization
• Cycle time slower
  – Assoc search hardware
• Permits wide variety of replacement algorithms

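A fully associative cache can be sketched as follows. The slides do not name a replacement algorithm, so LRU is assumed here as one of the many permitted choices; the class name `FullyAssociativeCache` is hypothetical, and the dictionary lookup only stands in for the parallel comparison done by the S hardware comparators.

```python
from collections import OrderedDict


class FullyAssociativeCache:
    """Tag-only model with LRU replacement (LRU is an assumption, not from the slides)."""

    def __init__(self, num_frames, b):
        self.num_frames, self.b = num_frames, b
        # Maps block tag -> dirty flag; ordering runs from least to most recently used.
        self.frames = OrderedDict()

    def access(self, addr, write=False):
        tag = addr >> self.b              # leading N - b bits
        if tag in self.frames:            # stands in for the associative search
            self.frames.move_to_end(tag)  # update usage history
            self.frames[tag] = self.frames[tag] or write
            return True
        if len(self.frames) == self.num_frames:
            self.frames.popitem(last=False)  # evict the least recently used line
        self.frames[tag] = write
        return False


cache = FullyAssociativeCache(num_frames=4, b=2)
results = [cache.access(a) for a in (4, 52, 4, 52)]
print(results)  # [False, False, True, True]
```

Addresses 4 and 52 fall in MM blocks 1 and 13, which collide in a 4-frame direct-mapped cache; here they simply occupy two different frames.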
Fully Associative
Figure: cache of S = 2^c block frames (00, 01, 10, 11)
• Direct Mapping: Cheap, poor performance in terms of replacement choices, fast cycle time
• Fully Associative: Expensive, very good performance in terms of replacement choices, slower cycle time (no early read out)

Set Associative Cache
• Combination of Direct Mapped and Fully Associative
Figure: 8 block frames grouped into 4 sets
  Set 0, K=0: Block 0, Block 1
  Set 1, K=1: Block 2, Block 3
  Set 2, K=2: Block 4, Block 5
  Set 3, K=3: Block 6, Block 7

Set Assoc. Continued
• S blocks divided into K sets
• K = 2^m
• S / K = 2^c / 2^m = 2^(c-m)
• c = 3, m = 2: S = 2^c = 8 blocks, K = 2^m = 2^2 = 4 sets, S / K = Blocks per Set = 8 / 4 = 2
• S / K = 2^c / 2^m = 2^(c-m) = 2^(3-2) = 2^1 = 2
• This is an “S by K-way Set Associative Cache”, that is, (S / K)-way
  – here a 2-way Set Associative Cache

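The arithmetic above is easy to check mechanically. This is a small sketch; `set_geometry` is a hypothetical helper name, and the consecutive grouping of frames into sets follows the figure on the previous slide.

```python
def set_geometry(c, m):
    """Return (S, K, ways) for S = 2^c block frames divided into K = 2^m sets."""
    if m > c:
        raise ValueError("cannot have more sets than block frames")
    S, K = 1 << c, 1 << m
    return S, K, S // K


S, K, ways = set_geometry(c=3, m=2)
print(S, K, ways)  # 8 4 2

# Block frames belonging to each set, grouped consecutively as in the figure.
frames_in_set = {k: list(range(k * ways, (k + 1) * ways)) for k in range(K)}
print(frames_in_set[1])  # [2, 3]
```

With m = 0 the same function describes a fully associative cache (one set of S frames), and with m = c a direct-mapped one (S sets of one frame each).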
Set Assoc. Continued
• Block # j in MM can be in any block frame (Fully Assoc. part) within set number j mod K (Direct Mapping part)
Address format: Block tag (N - m - b bits) | Set # (m bits) | Word # (b bits)
• Set chosen by middle m bits in address

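Set Assoc. Diagram
Figure: 8 block frames in 4 sets (Set 00, K=0 through Set 11, K=3), word columns b = 00, 01, 10, 11, each frame with an (N - m - b)-bit block tag; tag registers shown holding 0000 and 1011.
Addr. = 0000 01 10 (tag = 0000, N - m - b bits; set = 01, m bits; word = 10, b bits)
S = 2^c, c = 3; K = 2^m, m = 2
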
Diagram Continued
• Map address to set 01 (m)
• Read out S / K tags (2) and S / K lines (2)
• Compare S / K tags - find match on 0000 in set 01
• Send word 10 (b) of already read out data to CPU from tag 0000 + set 01
• On Miss
  – Compare S / K tags and don’t find match on tags from set 01
  – Suppress data lines read out from set 01
  – Select victim from one of the S / K lines in set 01

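The hit and miss sequences above can be traced with a tag-only model. This is a sketch under stated assumptions: N = 8, LRU victim selection within the set (the slides leave the policy open), and set 01 preloaded with tags 1011 and 0000. The class name `SetAssociativeCache` is hypothetical.

```python
class SetAssociativeCache:
    """Tag-only model of a set associative cache (hypothetical sketch, LRU assumed)."""

    def __init__(self, m, b, ways):
        self.m, self.b, self.ways = m, b, ways
        # One list of tags per set, ordered from least to most recently used.
        self.sets = [[] for _ in range(1 << m)]

    def split(self, addr):
        word = addr & ((1 << self.b) - 1)
        set_no = (addr >> self.b) & ((1 << self.m) - 1)
        tag = addr >> (self.b + self.m)
        return tag, set_no, word

    def access(self, addr):
        tag, set_no, _ = self.split(addr)
        lines = self.sets[set_no]  # only the S / K tags of this set are compared
        if tag in lines:
            lines.remove(tag)
            lines.append(tag)      # mark as most recently used
            return True
        if len(lines) == self.ways:
            lines.pop(0)           # victim comes from the same set
        lines.append(tag)
        return False


cache = SetAssociativeCache(m=2, b=2, ways=2)
cache.sets[0b01] = [0b1011, 0b0000]   # assumed initial contents of set 01
print(cache.split(0b00000110))        # (0, 1, 2): tag 0000, set 01, word 10
print(cache.access(0b00000110))       # True: tag 0000 found in set 01
print(cache.access(0b11110110))       # False: tag 1111 not in set 01
print(cache.sets[0b01])               # [0, 15]: tag 1011 was the LRU victim
```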
Pipelined Operations
• Stage 1
  – Bring out each line from a set into intermediate latches (Tag & Data)
  – Each line within set is in a different bank so interleaving is possible for fast access
• Stage 2
  – Does associative compare

Hardware
• S / K comparators of (N - m - b) bits each (assoc. logic) per set, S total
• Status, clean/dirty bits and associated hardware
• Registers to hold data and tags read out
• Can pipeline tag/data read out and compare
• Performance approaches that of Fully Associative cache

Sector Mapped Cache
Address format: Sector tag (N - r - b bits) | Block # (r bits) | Word # (b bits)
• MM divided into sectors with Q = 2^r blocks / sector
• A validity tag is also associated with each block frame within a sector (in the cache) to indicate if contents of that block frame are valid or not

Sector Mapped Diagram
Figure: each sector frame (Sector 0, Sector 1, …) has a tag and holds Block 0, Block 1, …, Block 2^r - 1.
Q = 2^r blocks / sector, r = 2

Operation
• Leading (N - r - b) bits used to associatively locate the sector in the cache
• On hit
  – Use block # to locate block in sector (r)
  – Not real hit yet - only a sector hit
  – If validity tag is VALID, get word from block (b), else load block from MM (block miss)
• On miss
  – Select victim ‘sector frame’
  – Reset validity tags on all blocks within sector frame
  – After writing back any dirty blocks, load only missing block and set its validity bit on

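The three outcomes described above (hit, block miss, sector miss) can be traced with a short model. This is a sketch under stated assumptions: FIFO selection of the victim sector frame (the slides do not specify a policy), no modelling of dirty blocks or write-back, and a hypothetical class name `SectorCache`.

```python
class SectorCache:
    """Tag-and-validity model of a sector mapped cache (hypothetical sketch, FIFO victims)."""

    def __init__(self, num_sector_frames, r, b):
        self.num_sector_frames, self.r, self.b = num_sector_frames, r, b
        # Maps sector tag -> list of validity bits, one per block in the sector.
        # Dict insertion order is used as the FIFO order for victim selection.
        self.frames = {}

    def access(self, addr):
        block = (addr >> self.b) & ((1 << self.r) - 1)
        tag = addr >> (self.b + self.r)   # leading N - r - b bits
        valid = self.frames.get(tag)      # stands in for the associative search
        if valid is None:
            if len(self.frames) == self.num_sector_frames:
                victim = next(iter(self.frames))  # oldest sector frame
                del self.frames[victim]
            # Fresh frame: all validity tags reset, then only the missing block loaded.
            valid = [False] * (1 << self.r)
            valid[block] = True
            self.frames[tag] = valid
            return "sector miss"
        if valid[block]:
            return "hit"
        valid[block] = True               # load just this block from MM
        return "block miss"


cache = SectorCache(num_sector_frames=2, r=2, b=2)
results = [cache.access(a) for a in (0, 1, 4, 4, 16, 32, 0)]
print(results)
```

In this trace, address 1 hits because it shares a block with address 0, address 4 is a sector hit but a block miss, and address 32 forces sector 0 out, so the final access to address 0 is a sector miss again.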
Sector Mapping vs Set Associative Caches
• Sector Mapping:
  – Associative mapping to sector frame
  – Direct Mapping to block
• Set Associative:
  – Direct Mapping to set
  – Associative mapping to block within set
• Associative Hardware
  – Similar to Set Associative
  – Not as easy to pipeline