
Consider a direct-mapped cache with 4-word blocks and a size of 8 blocks, or 32 words.

Reference sequence (word addresses): 6, 7, 8, 9, 80, 6, 7, 8, 9, 81

For each reference, find the block address, the cache address, and whether it is a hit or a miss.

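[Figure: direct-mapped cache with 8 entries of 4-word blocks. The 32-bit address is divided into a 25-bit Tag, a 3-bit Index, a 2-bit Block Offset, and a 2-bit Byte Offset. Each entry holds a valid bit (v), a Tag, and Words 3 to 0 (32 bits each). A comparator (=) on the Tag produces Hit, and a Mux selects the 32-bit Data word.]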
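The address split in the figure can be sketched in a few lines of Python. This is only an illustration: the field widths (2-bit byte offset, 2-bit block offset, 3-bit index, 25-bit tag) are assumed from the 8-entry, 4-word-block cache described here, and the name `split_address` is hypothetical.

```python
# Assumed field widths for an 8-entry direct-mapped cache with 4-word blocks.
BYTE_OFFSET_BITS = 2   # 4 bytes per word
BLOCK_OFFSET_BITS = 2  # 4 words per block
INDEX_BITS = 3         # 8 entries


def split_address(addr):
    """Split a 32-bit byte address into (tag, index, block_offset, byte_offset)."""
    byte_offset = addr & ((1 << BYTE_OFFSET_BITS) - 1)
    addr >>= BYTE_OFFSET_BITS
    block_offset = addr & ((1 << BLOCK_OFFSET_BITS) - 1)
    addr >>= BLOCK_OFFSET_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = addr >> INDEX_BITS  # the remaining 25 bits
    return tag, index, block_offset, byte_offset


# Word address 80 corresponds to byte address 80 * 4 = 320.
print(split_address(80 * 4))  # (2, 4, 0, 0)
```

The index of 4 for word address 80 agrees with the cache address used for that reference in the worked example.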

[Figure: mapping of word addresses to block addresses and cache addresses. Block X holds word addresses 4X, 4X+1, 4X+2, and 4X+3 (block 0 holds words 0 to 3, block 1 holds words 4 to 7, and so on up to block 15, which holds words 60 to 63). Block X is placed at cache address X modulo 8, so cache addresses run from 0 to 7.]
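As a small sketch of the mapping in the figure, the two helper functions below (hypothetical names) compute the block address and the cache address of a word address, assuming 4 words per block and 8 cache blocks.

```python
WORDS_PER_BLOCK = 4
NUM_CACHE_BLOCKS = 8


def block_address(word_addr):
    """Block X holds words 4X .. 4X+3, so X is the word address divided by 4."""
    return word_addr // WORDS_PER_BLOCK


def cache_address(word_addr):
    """Direct mapping: block X goes to cache address X modulo 8."""
    return block_address(word_addr) % NUM_CACHE_BLOCKS


for w in (6, 9, 63, 80):
    print(w, block_address(w), cache_address(w))
```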

Consider a direct-mapped cache with 4-word blocks and a size of 8 blocks, or 32 words.

Cache Address = (Word Addr / 4) modulo 8, using integer division

Reference sequence (cache initially empty):

Word Address   Block Address   Cache Address   Hit or Miss
      6              1               1            Miss
      7              1               1            Hit
      8              2               2            Miss
      9              2               2            Hit
     80             20               4            Miss
      6              1               1            Hit
      7              1               1            Hit
      8              2               2            Hit
      9              2               2            Hit
     81             20               4            Hit

Consider a direct-mapped cache with 4-word blocks and a size of 8 blocks, or 32 words.

Cache Address = (Word Addr / 4) modulo 8, using integer division

Reference sequence (cache initially empty). Block 17 maps to the same cache address as block 1, so the two blocks keep replacing each other:

Word Address   Block Address   Cache Address   Hit or Miss
      6              1               1            Miss
      7              1               1            Hit
      8              2               2            Miss
      9              2               2            Hit
     68             17               1            Miss
      6              1               1            Miss
      7              1               1            Hit
      8              2               2            Hit
      9              2               2            Hit
     69             17               1            Miss
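The two reference sequences above can be checked with a short simulation. This is a minimal sketch that assumes the cache starts empty and tracks only which block address each cache entry holds; the function name `simulate_direct_mapped` is hypothetical.

```python
def simulate_direct_mapped(word_addrs, words_per_block=4, num_blocks=8):
    """Return (word, block, cache_addr, outcome) for each reference.

    Assumes every cache entry is invalid (empty) at the start.
    """
    cache = [None] * num_blocks  # cache[i] = block address stored at entry i
    trace = []
    for w in word_addrs:
        block = w // words_per_block
        slot = block % num_blocks
        if cache[slot] == block:
            outcome = "Hit"
        else:
            outcome = "Miss"
            cache[slot] = block  # replace whatever was there
        trace.append((w, block, slot, outcome))
    return trace


first = simulate_direct_mapped([6, 7, 8, 9, 80, 6, 7, 8, 9, 81])
second = simulate_direct_mapped([6, 7, 8, 9, 68, 6, 7, 8, 9, 69])
print(sum(t[3] == "Miss" for t in first))   # 3
print(sum(t[3] == "Miss" for t in second))  # 5
```

The second sequence misses more often because blocks 1 and 17 compete for cache address 1.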

How about putting a block in any unused block of the eight blocks?

[Figure: a cache entry consisting of a Tag and Words 3 to 0.]

How can you find it?
  • Expand the Tag to the full block address (28 bits) and compare.

Fully associative memory: addressed by its contents.

Fully Associative Memory: addressed by its contents

The address consists of the Block Address (28 bits), the Block Offset, and the Byte Offset.

  • For a practical hit time, the Tag of every entry must be compared with the Block Address in parallel.
  • Only feasible for a small number of blocks.

[Figure: fully associative cache. Each entry stores a Tag and Data; every Tag is compared (=) with the Block Address, the comparator outputs are combined (+) to form Hit, and a Mux selects the Data. The Block Offset selects the word. Valid bit not shown.]

The hardware is not feasible for a large cache.
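A fully associative lookup can be modeled as a search over all entries. The sketch below is a sequential software model only: real hardware performs the comparisons in parallel, which is exactly why it becomes expensive for large caches. The entry layout `(valid, tag)` and the function name are assumptions made for illustration.

```python
def fully_associative_lookup(entries, block_addr):
    """Return the index of the entry whose tag equals block_addr, or None on a miss.

    entries is a list of (valid, tag) pairs; the tag is the full block address.
    """
    for i, (valid, tag) in enumerate(entries):
        if valid and tag == block_addr:
            return i
    return None


entries = [(True, 17), (True, 1), (False, 0)]
print(fully_associative_lookup(entries, 1))   # 1
print(fully_associative_lookup(entries, 0))   # None (entry 2 is not valid)
```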

Make sets of blocks associative: two-way set associative

[Figure: two-way set-associative cache with Index values 0, 1, ..., 2^k - 1. Each Index selects one set holding Tag 0, Data 0 and Tag 1, Data 1. The address is divided into Tag, Index, Block Offset, and Byte Offset. Valid bit not shown.]

  • Address the set by Index
  • Compare the two Tags in parallel for a Hit

Block replacement strategies

For each Index there are 2, 4, ..., n options for replacement.

Strategies
1. LRU (Least Recently Used)
  • Replace the block that has been unused for the longest time
2. Random
  • Select the block to be replaced randomly
  • Implementation
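The two strategies can be sketched for a single set as follows. This is an illustrative software model, not a hardware implementation: the class `LRUSet` and the function `random_victim` are hypothetical names, and recency is tracked with an ordered list rather than with per-block status bits.

```python
import random


class LRUSet:
    """One set of an n-way cache with exact LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = []  # least recently used first, most recently used last

    def access(self, block):
        """Return True on a hit. On a miss, insert the block, evicting the LRU block if full."""
        if block in self.blocks:
            self.blocks.remove(block)
            self.blocks.append(block)  # now the most recently used
            return True
        if len(self.blocks) == self.ways:
            self.blocks.pop(0)  # evict the least recently used block
        self.blocks.append(block)
        return False


def random_victim(ways, rng=random):
    """Random replacement: pick any way of the set with equal probability."""
    return rng.randrange(ways)


s = LRUSet(ways=2)
print([s.access(b) for b in (1, 17, 1, 9)])  # [False, False, True, False]
print(s.blocks)                              # [1, 9]  (block 17 was the LRU and was evicted)
```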

Consider a two-way set-associative cache with 4-word blocks and a size of 8 blocks, or 32 words. With two blocks per set there are 4 sets.

Cache Address (Set) = (Word Addr / 4) modulo 4, using integer division

Reference sequence (cache initially empty). Entry 0 and Entry 1 show the blocks held in the referenced set after the access:

Word Address   Block Address   Cache Address (Set)   Hit or Miss   Entry 0   Entry 1
      6              1                  1               Miss           1         -
      7              1                  1               Hit            1         -
      8              2                  2               Miss           2         -
      9              2                  2               Hit            2         -
     68             17                  1               Miss           1        17
      6              1                  1               Hit            1        17
      7              1                  1               Hit            1        17
      8              2                  2               Hit            2         -
      9              2                  2               Hit            2         -
     69             17                  1               Hit            1        17
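The two-way example above can be reproduced with a small simulation. The sketch assumes an initially empty cache and LRU replacement within each set (the example itself never needs to evict a block); the function name is hypothetical. Setting `ways=1` gives the direct-mapped behavior for comparison.

```python
def simulate_set_associative(word_addrs, words_per_block=4, num_blocks=8, ways=2):
    """Return (word, block, set_index, outcome) for each reference, using LRU per set."""
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]  # each list is ordered LRU first
    trace = []
    for w in word_addrs:
        block = w // words_per_block
        index = block % num_sets
        blocks = sets[index]
        if block in blocks:
            blocks.remove(block)
            outcome = "Hit"
        else:
            if len(blocks) == ways:
                blocks.pop(0)  # evict the least recently used block
            outcome = "Miss"
        blocks.append(block)  # most recently used goes last
        trace.append((w, block, index, outcome))
    return trace


refs = [6, 7, 8, 9, 68, 6, 7, 8, 9, 69]
two_way = simulate_set_associative(refs, ways=2)
direct = simulate_set_associative(refs, ways=1)
print(sum(t[3] == "Miss" for t in two_way))  # 3
print(sum(t[3] == "Miss" for t in direct))   # 5
```

With two ways, blocks 1 and 17 can both stay in set 1, so the conflict misses of the direct-mapped cache disappear.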

Make sets of blocks associative: four-way set associative

[Figure: four-way set-associative cache with Index values 0, 1, ..., 2^m - 1. Each Index selects one set holding Tag 0, Data 0 through Tag 3, Data 3. The address is divided into Tag, Index, Block Offset, and Byte Offset. Valid bit not shown.]

  • Address the set by Index
  • Compare the four Tags in parallel for a Hit
  • Can generalize to n-way associative

DECStation 3100 with a 64 KB instruction cache and a 64 KB data cache, each with a 4-word block size. Program = gcc

Associativity             1       2       4
Instruction miss rate    2.0%    1.6%
Data miss rate           1.7%    1.4%
Combined miss rate       1.9%    1.5%

[Figure: four-way set-associative cache with a 32-bit address divided into Tag, Index, Block Offset (2 bits), and Byte Offset (2 bits). Each entry holds four blocks, each with a valid bit (v), a Tag, and Data: Tag 0, Data 0 through Tag 3, Data 3.]

1. Number of words per block = 2^n. Select 4, then n = 2 (the Block Offset is 2 bits).

2. Select the number of entries in the cache (a power of 2). If 256, then the Index is 8 bits.
   Cache has 256 x 4 blocks = 1 K blocks
   1 K blocks x 4 words/block = 4 K words = 16 KB

3. Tag = 32 - 8 - 2 - 2 = 20 bits (address bits minus Index, Block Offset, and Byte Offset)
   Each entry has 4 x (1 + 20 + 128) bits = 4 x 149 = 596 bits
   Total cache memory = 256 x 596 bits = 152576 bits = 149 K bits
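The sizing arithmetic in the steps above can be reproduced with a short calculation. This is a sketch under the stated parameters (32-bit address, 4-byte words, 4 words per block, 4 ways, 256 entries, one valid bit per block); the function and key names are hypothetical.

```python
def log2_exact(n):
    """Return log2(n) for a power of two n."""
    assert n > 0 and n & (n - 1) == 0, "value must be a power of two"
    return n.bit_length() - 1


def cache_geometry(address_bits=32, bytes_per_word=4, words_per_block=4,
                   ways=4, num_entries=256):
    byte_offset_bits = log2_exact(bytes_per_word)
    block_offset_bits = log2_exact(words_per_block)
    index_bits = log2_exact(num_entries)
    tag_bits = address_bits - index_bits - block_offset_bits - byte_offset_bits
    data_bits = words_per_block * bytes_per_word * 8   # data bits per block
    bits_per_block = 1 + tag_bits + data_bits          # valid + tag + data
    bits_per_entry = ways * bits_per_block
    return {
        "index_bits": index_bits,
        "tag_bits": tag_bits,
        "bits_per_entry": bits_per_entry,
        "total_bits": num_entries * bits_per_entry,
        "data_bytes": num_entries * ways * words_per_block * bytes_per_word,
    }


g = cache_geometry()
print(g["tag_bits"], g["bits_per_entry"], g["total_bits"], g["data_bytes"])
# 20 596 152576 16384
```

The total of 152576 bits is 149 K bits, while the data capacity alone is 16384 bytes (16 KB); the difference is the tag and valid-bit overhead.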

[Figure: four-way set-associative cache with 256 entries (Index 0 to 255). The 32-bit address supplies a 20-bit Tag, an 8-bit Index, a 2-bit Block Offset, and a 2-bit Byte Offset. The Tag is compared (=) in parallel with Tag 0 to Tag 3 of the indexed entry, producing Hit 0, Hit 1, Hit 2, and Hit 3. On a MISS there are 4 options for the block to replace.]

LRU Approximation

Add the following three bits to each entry of the cache:

MRR(0) = 1 if Data 0 or Data 1 was read last
       = 0 if Data 2 or Data 3 was read last
MRR(1) = 1 if Data 1 was read last
       = 0 if Data 0 was read last
MRR(2) = 1 if Data 2 was read last
       = 0 if Data 3 was read last

LRU approximation:
  • If MRR(0) = 1, then choose the Data 2, Data 3 pair
  • If MRR(2) = 1, then choose Data 3 as the LRU

Note that the true LRU could have been Data 0 or Data 1.
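The three-bit scheme can be sketched as a small class. The update rules follow the MRR definitions above; the victim choice for the Data 0, Data 1 pair (when MRR(0) = 0) is not spelled out in the text and is assumed here by symmetry. The class name `MRRBits` is hypothetical.

```python
class MRRBits:
    """Three status bits approximating LRU for one four-way entry."""

    def __init__(self):
        self.mrr = [0, 0, 0]

    def record_read(self, way):
        """Update the bits after reading Data `way` (0 to 3)."""
        if way in (0, 1):
            self.mrr[0] = 1
            self.mrr[1] = 1 if way == 1 else 0
        else:
            self.mrr[0] = 0
            self.mrr[2] = 1 if way == 2 else 0

    def victim(self):
        """Choose the block to replace: the pair not read last, then the older block of that pair."""
        if self.mrr[0] == 1:                      # pair (0, 1) read last: use pair (2, 3)
            return 3 if self.mrr[2] == 1 else 2
        return 0 if self.mrr[1] == 1 else 1       # assumed by symmetry for pair (0, 1)


bits = MRRBits()
for way in (0, 2, 3, 1):
    bits.record_read(way)
print(bits.victim())  # 2, although the true LRU block is Data 0
```

The read order 0, 2, 3, 1 shows the approximation at work: the bits select Data 2 even though Data 0 has gone unused the longest.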

[Figure: four-way set-associative cache with a 20-bit Tag, an 8-bit Index, and entries 0 to 255. Four comparators (=) produce Hit 0 to Hit 3, and a Write input is added.]

Write-Through

Write to the block in the cache and in main memory.

4-way associative example:
1. Read Valid and Tag to find the block.
2. If Hit, write the word in the block and write main memory (there may be a Write Buffer).
3. If Miss, select a block to replace (LRU or Random), read the block from main memory, and write it to the cache. Then write the word in the block and write main memory (there may be a Write Buffer).
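The three write-through steps can be sketched for a single four-way entry. This is a simplified model: memory is a dictionary from block address to a list of four words, the full block address stands in for the tag, the Write Buffer is not modeled, and the replacement choice is passed in as a function. All names are hypothetical.

```python
WORDS_PER_BLOCK = 4


def make_set(ways=4):
    """Create one cache entry (set) with `ways` invalid blocks."""
    return [{"valid": False, "tag": None, "data": [0] * WORDS_PER_BLOCK}
            for _ in range(ways)]


def find_way(cache_set, block):
    """Step 1: read Valid and Tag to find the block."""
    for i, line in enumerate(cache_set):
        if line["valid"] and line["tag"] == block:
            return i
    return None


def write_through(cache_set, memory, block, offset, value, choose_victim):
    way = find_way(cache_set, block)
    if way is None:
        # Step 3: on a miss, pick a block to replace and fill it from main memory.
        way = choose_victim(cache_set)
        cache_set[way] = {"valid": True, "tag": block, "data": list(memory[block])}
    # Steps 2 and 3: write the word in the cache block and in main memory.
    cache_set[way]["data"][offset] = value
    memory[block][offset] = value
    return way


memory = {17: [5, 6, 7, 8]}
s = make_set()
write_through(s, memory, 17, 2, 99, choose_victim=lambda cs: 0)
print(s[0]["data"], memory[17])  # [5, 6, 99, 8] [5, 6, 99, 8]
```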

Write-Back (also called Copy Back)

Write the word to the block in the cache. Update main memory only when the block is replaced.

4-way associative example:
1. Read Valid and Tag to find the block.
2. If Hit, write the word in the block and set the “dirty bit”.
3. If Miss, select a block to replace (LRU or Random), read the block from main memory, write it to the cache, and set the “dirty bit”.

Before replacing a block on a Read Miss or Write Miss, if the dirty bit is set, write the block from the cache to main memory.
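A write-back write can be sketched in the same style. The model is an assumption-laden simplification: memory is a dictionary from block address to a list of four words, the full block address stands in for the tag, and the replacement choice is passed in as a function. The names `make_set` and `write_back` are hypothetical.

```python
WORDS_PER_BLOCK = 4


def make_set(ways=4):
    """Create one cache entry (set) with `ways` invalid, clean blocks."""
    return [{"valid": False, "dirty": False, "tag": None,
             "data": [0] * WORDS_PER_BLOCK} for _ in range(ways)]


def write_back(cache_set, memory, block, offset, value, choose_victim):
    # Step 1: read Valid and Tag to find the block.
    way = next((i for i, line in enumerate(cache_set)
                if line["valid"] and line["tag"] == block), None)
    if way is None:
        # Step 3: on a miss, pick a block to replace.
        way = choose_victim(cache_set)
        old = cache_set[way]
        if old["valid"] and old["dirty"]:
            # Before replacing a dirty block, copy it back to main memory.
            memory[old["tag"]] = list(old["data"])
        cache_set[way] = {"valid": True, "dirty": False, "tag": block,
                          "data": list(memory[block])}
    # Steps 2 and 3: write the word in the cache only and set the dirty bit.
    cache_set[way]["data"][offset] = value
    cache_set[way]["dirty"] = True
    return way


memory = {1: [0, 0, 0, 0], 17: [5, 6, 7, 8]}
s = make_set(ways=1)
write_back(s, memory, 17, 2, 99, choose_victim=lambda cs: 0)
print(memory[17])  # [5, 6, 7, 8]   main memory is not updated yet
write_back(s, memory, 1, 0, 42, choose_victim=lambda cs: 0)
print(memory[17])  # [5, 6, 99, 8]  updated when block 17 was replaced
```

Using a single-block set in the usage example forces a replacement, which makes the delayed update of main memory visible.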