Cache Memory and Performance

Many of the following slides are taken with permission from Complete Powerpoint Lecture Notes for Computer Systems: A Programmer's Perspective (CS:APP), Randal E. Bryant and David R. O'Hallaron
http://csapp.cs.cmu.edu/public/lectures.html

The book is used explicitly in CS 2505 and CS 3214 and as a reference in CS 2506.

CS@VT Computer Organization II © 2005-2013 CS:APP & McQuain

Cache Memories

Cache memories are small, fast SRAM-based memories managed automatically in hardware.
- Hold frequently accessed blocks of main memory

The CPU looks first for data in caches (e.g., L1, L2, and L3), then in main memory.

Typical system structure:
[Figure: CPU chip (register file, ALU, cache memories, bus interface), system bus, I/O bridge, memory bus, main memory]

General Cache Organization (S, E, B)

- S = 2^s sets
- E = 2^e lines (blocks) per set
- B = 2^b bytes per cache block (the data)
- Each line (block) holds a valid bit (v), a tag, and data bytes 0 through B-1

Cache size: C = S x E x B data bytes

[Figure: sets 0 through 2^s - 1, each containing lines 0 through 2^e - 1]

Cache Defines View of DRAM

The "geometry" of the cache is defined by:
S = 2^s   the number of sets in the cache
E = 2^e   the number of lines (blocks) in a set
B = 2^b   the number of bytes in a line (block)

These values define a related way to think about the organization of DRAM:
- DRAM consists of a sequence of blocks of B bytes. The bytes in a block (line) can be indexed by using b bits.
- DRAM consists of a sequence of groups of S blocks (lines). The blocks (lines) in a group can be indexed by using s bits.
- Each group contains S x B bytes, which can be indexed by using s + b bits.

Cache (8, 2, 4) and 256-Byte DRAM

Cache size: C = S x E x B = 64 data bytes
S = 2^3 sets
E = 2^1 blocks (lines) per set
B = 2^2 bytes per cache block (the data)

[Figure: the cache drawn as sets 0 through 7, each line holding a valid bit (v), a tag, and data bytes 0 through 3, next to DRAM drawn as bytes 0 through 255]

Example of Cache View of DRAM

Assume a cache has the following geometry:
S = 2^3 = 8   the number of sets in the cache
E = 2^1 = 2   the number of lines (blocks) in a set
B = 2^2 = 4   the number of bytes in a line (block)

Suppose that DRAM consists of 256 bytes, so we have 8-bit addresses.

Then DRAM consists of:
- 64 blocks, each holding 4 bytes
- 8 groups, each holding 8 blocks

[Figure: DRAM addresses 00000000 through 00011111 form one group; block 0 is 00000000-00000011, block 1 is 00000100-00000111, ..., block 7 is 00011100-00011111]
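The counts above follow directly from the geometry. A minimal sketch of the arithmetic, assuming the DRAM size is a multiple of S x B (the function name `dram_view` and the dictionary keys are made up for illustration and do not come from the slides):

```python
def dram_view(dram_bytes, S, E, B):
    """Summarize how a cache with geometry (S, E, B) views a DRAM of dram_bytes bytes."""
    blocks = dram_bytes // B      # DRAM is a sequence of B-byte blocks
    groups = blocks // S          # each group holds S consecutive blocks
    return {
        "cache_bytes": S * E * B,     # C = S x E x B
        "blocks": blocks,
        "groups": groups,
        "bytes_per_group": S * B,
    }

# The (8, 2, 4) cache and 256-byte DRAM used in the example.
print(dram_view(256, S=8, E=2, B=4))
# {'cache_bytes': 64, 'blocks': 64, 'groups': 8, 'bytes_per_group': 32}
```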

Example of Cache View of DRAM

Pick an address: 01110101 = 011 (group) 101 (block) 01 (byte)

a1 a0 give the byte number and the offset of the byte in the block.

[Figure: DRAM group 011, which starts at address 01100000, showing block 100 (01110000-01110011), block 101 (01110100-01110111), and block 110 (01111000-01111011), with the bytes of each block labeled 00 through 11]

Example of Cache View of DRAM

Pick an address: 01110101 = 011 (group) 101 (block) 01 (byte)

a4 a3 a2 give the block number.
a4 a3 a2 00* equals the offset of the block in the group.

* a4 a3 a2 00 = a4 a3 a2 x 2^2 = a4 a3 a2 x (size of a block)

[Figure: DRAM group 011, which starts at address 01100000, showing block 100 (01110000-01110011), block 101 (01110100-01110111), and block 110 (01111000-01111011)]

Example of Cache View of DRAM

Pick an address: 01110101 = 011 (group) 101 (block) 01 (byte)

a7 a6 a5 give the group number.
a7 a6 a5 00000* equals the offset of the group in the DRAM.

* Why?

[Figure: DRAM from address 00000000, with group 011 starting at 01100000 and containing block 100 (01110000-01110011), block 101 (01110100-01110111), and block 110 (01111000-01111011)]
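The three fields of the example address can be pulled apart with shifts and masks. The sketch below assumes the (8, 2, 4) cache from the example, so s = 3 and b = 2; the variable names are illustrative only. It also suggests one way to answer the "Why?": a group holds S x B = 32 = 2^5 bytes, so multiplying the group number by the group size is the same as appending five zero bits.

```python
S_BITS, B_BITS = 3, 2        # s and b for the (8, 2, 4) cache
addr = 0b01110101

byte = addr & 0b11                    # a1 a0
block = (addr >> B_BITS) & 0b111      # a4 a3 a2
group = addr >> (S_BITS + B_BITS)     # a7 a6 a5

# Offsets are the field values scaled by the size of the unit they count.
block_offset_in_group = block << B_BITS             # block number x 4 bytes
group_offset_in_dram = group << (S_BITS + B_BITS)   # group number x 32 bytes

print(format(group, "03b"), format(block, "03b"), format(byte, "02b"))
# 011 101 01
print(format(group_offset_in_dram, "08b"), format(block_offset_in_group, "05b"))
# 01100000 10100
```

Adding the two offsets and the byte number gives back the original address.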

The BIG Picture

Address 01110101 = 011 101 01

[Figure: DRAM divided into groups starting at 00000000, 00100000, ..., 11100000. Group 011 starts at 01100000 and is divided into blocks starting at 01100000, 01100100, ..., 01111100. Block 101 in the group starts at 01110100 and holds byte 00 (01110100), byte 01 (01110101), byte 10 (01110110), and byte 11 (01110111).]

Example of Cache View of DRAM

Pick an address: 01110101

How does this address map into the cache?

The DRAM block number determines the cache set used to store the block.

Note this means that two DRAM blocks from the same DRAM group cannot map into the same cache set.

So the address 01110101 maps to set 101 in the cache.

Each set in our cache can hold 2 blocks. This block could be stored at either location within the corresponding set.

[Figure: DRAM blocks 100 (01110000-01110011), 101 (01110100-01110111), and 110 (01111000-01111011) of group 011]

Example of Cache View of DRAM

The block containing address 01110101 maps to cache set 101 (set 5).

[Figure: DRAM block 101 of group 011 (01110100-01110111) stored in cache set 5, in one line of the set or the other, with valid bit 1, tag 011, and data bytes 0 through 3]

Cache View of DRAM

So, to generalize, suppose a cache has:
S = 2^s sets
E = 2^e blocks (lines) per set
B = 2^b bytes per block (line)

And suppose that DRAM uses N-bit addresses. Then for any address

a_{N-1} … a_{s+b} | a_{s+b-1} … a_b | a_{b-1} … a_0

- Bits a_{b-1}:a_0 give the byte index within the block
- Bits a_{s+b-1}:a_b give the set index
- Bits a_{N-1}:a_{s+b} become the tag for the data

Note that the tag bits are only the same for blocks that are within the same DRAM group.
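The general field layout can be written as one small helper. This is a sketch under the assumption that addresses are plain non-negative integers; the name `split_address` is hypothetical and not part of the slides.

```python
def split_address(addr, s, b):
    """Split an address into (tag, set_index, byte_offset) for a cache
    with 2**s sets and 2**b bytes per block."""
    offset = addr & ((1 << b) - 1)             # bits a_{b-1}:a_0
    set_index = (addr >> b) & ((1 << s) - 1)   # bits a_{s+b-1}:a_b
    tag = addr >> (s + b)                      # bits a_{N-1}:a_{s+b}
    return tag, set_index, offset

# The example address in the (8, 2, 4) cache: tag 011, set 101, byte 01.
print(split_address(0b01110101, s=3, b=2))   # (3, 5, 1)
```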

Cache Read

1. Locate set
2. Check if any line in set has matching tag
3. Yes + line valid: hit
4. Locate data starting at offset

Address of word: t bits (tag) | s bits (set index, = K) | b bits (block offset, = J)

[Figure: the cache as sets 0 through 2^s - 1, each with lines 0 through 2^e - 1. Set K is selected, and within the matching line (valid bit v, tag, bytes 0 through 2^b - 1) the data begins at offset J.]
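The four read steps can be sketched in software. This is only a model, assuming a cache represented as a list of sets, each set a list of lines stored as dictionaries with `valid`, `tag`, and `data` keys; that representation and the name `cache_read` are invented here for illustration. Real hardware checks the lines of a set in parallel rather than in a loop.

```python
def cache_read(cache, addr, s, b):
    """Return the byte at addr if its block is cached, otherwise None (a miss)."""
    offset = addr & ((1 << b) - 1)
    set_index = (addr >> b) & ((1 << s) - 1)
    tag = addr >> (s + b)
    for line in cache[set_index]:                  # steps 1 and 2: locate set, scan its lines
        if line["valid"] and line["tag"] == tag:   # step 3: valid and matching tag is a hit
            return line["data"][offset]            # step 4: data starts at the offset
    return None

# A tiny direct-mapped cache: S = 4 sets, E = 1 line per set, B = 2 bytes per block.
cache = [[{"valid": False, "tag": 0, "data": bytes(2)}] for _ in range(4)]
cache[3][0] = {"valid": True, "tag": 0, "data": bytes([0x66, 0x77])}

print(cache_read(cache, 7, s=2, b=1))   # 119 (0x77): hit in set 3
print(cache_read(cache, 0, s=2, b=1))   # None: the line in set 0 is not valid
```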

Example: Direct Mapped Cache (E = 1)

Direct mapped: One line per set
Assume: cache block size 8 bytes

Address of int: t bits (tag) | 0…01 (set index) | 100 (block offset)

Step 1: use the set index bits to find the set among the S = 2^s sets.

[Figure: each set holds one line with a valid bit (v), a tag, and data bytes 0 through 7]

Example: Direct Mapped Cache (E = 1)

Direct mapped: One line per set
Assume: cache block size 8 bytes

Address of int: t bits (tag) | 0…01 (set index) | 100 (block offset)

Step 2: valid? + matching tags: hit. The block offset then locates the data within the line.

[Figure: the selected line with its valid bit (v), tag, and data bytes 0 through 7]

Example: Direct Mapped Cache (E = 1)

Direct mapped: One line per set
Assume: cache block size 8 bytes

Address of int: t bits (tag) | 0…01 (set index) | 100 (block offset)

valid? + matching tags: hit. The int (4 bytes) is at block offset 100, that is, bytes 4 through 7 of the line.

No match: old line (block) is evicted and replaced by requested block from DRAM

Direct-Mapped Cache Simulation

Address bits: t = 1 (x), s = 2 (xx), b = 1 (x)
M = 16 byte addresses, B = 2 bytes/block, S = 4 sets, E = 1 blocks/set

Address trace (reads, one byte per read):
0 [0000₂]  miss
1 [0001₂]  hit
7 [0111₂]  miss
8 [1000₂]  miss
0 [0000₂]  miss

Cache contents:
Set 0: initially v = 0, tag and block unknown; then tag 0, M[0-1]; then tag 1, M[8-9]; finally v = 1, tag 0, M[0-1]
Set 1: never filled
Set 2: never filled
Set 3: v = 1, tag 0, M[6-7]
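The trace above can be reproduced with a few lines of code. This is a minimal sketch that tracks only the tag stored in each set (the data is irrelevant to hit/miss behavior); the function name `simulate_direct_mapped` is an assumption for illustration.

```python
def simulate_direct_mapped(trace, s, b):
    """Return 'hit' or 'miss' for each address in trace, for a direct-mapped
    cache with 2**s sets and 2**b bytes per block."""
    tags = [None] * (1 << s)      # None marks an invalid (empty) line
    results = []
    for addr in trace:
        set_index = (addr >> b) & ((1 << s) - 1)
        tag = addr >> (s + b)
        if tags[set_index] == tag:
            results.append("hit")
        else:
            results.append("miss")
            tags[set_index] = tag     # evict the old block, load the new one
    return results

print(simulate_direct_mapped([0, 1, 7, 8, 0], s=2, b=1))
# ['miss', 'hit', 'miss', 'miss', 'miss']
```

The final miss on address 0 is a conflict miss: addresses 0 and 8 share set 0, so loading M[8-9] evicted M[0-1].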

E-way Set Associative Cache (Here: E = 2)

E = 2: Two lines per set
Assume: cache block size 8 bytes

Address of short int: t bits (tag) | 0…01 (set index) | 100 (block offset)

Step 1: use the set index bits to find the set.

[Figure: each set holds two lines, each with a valid bit (v), a tag, and data bytes 0 through 7]

E-way Set Associative Cache (Here: E = 2)

E = 2: Two lines per set
Assume: cache block size 8 bytes

Address of short int: t bits (tag) | 0…01 (set index) | 100 (block offset)

Step 2: compare both lines. valid? + matching tags: hit. The block offset then locates the data within the matching line.

[Figure: the two lines of the selected set, each with a valid bit (v), a tag, and data bytes 0 through 7]

E-way Set Associative Cache (Here: E = 2)

E = 2: Two lines per set
Assume: cache block size 8 bytes

Address of short int: t bits (tag) | 0…01 (set index) | 100 (block offset)

Compare both lines. valid? + matching tags: hit. The short int (2 bytes) is at block offset 100, that is, bytes 4 and 5 of the matching line.

No match:
• One line in set is selected for eviction and replacement
• Replacement policies: random, least recently used (LRU), …

2-Way Set Associative Cache Simulation

Address bits: t = 2 (xx), s = 1 (x), b = 1 (x)
M = 16 byte addresses, B = 2 bytes/block, S = 2 sets, E = 2 blocks/set

Address trace (reads, one byte per read):
0 [0000₂]  miss
1 [0001₂]  hit
7 [0111₂]  miss
8 [1000₂]  miss
0 [0000₂]  hit

Cache contents after the trace:
Set 0, line 0: v = 1, tag 00, M[0-1]
Set 0, line 1: v = 1, tag 10, M[8-9]
Set 1, line 0: v = 1, tag 01, M[6-7]
Set 1, line 1: v = 0
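A set-associative version of the simulation needs a replacement policy once a set is full. The sketch below assumes LRU, which is one of the policies the slides mention; the trace above never fills a set beyond two lines, so the policy choice does not affect its result. The function name `simulate_set_associative` is hypothetical.

```python
def simulate_set_associative(trace, s, b, E):
    """Return 'hit' or 'miss' for each address in trace, for a cache with
    2**s sets, E lines per set, 2**b bytes per block, and LRU replacement
    (the replacement policy is an assumption of this sketch)."""
    sets = [[] for _ in range(1 << s)]   # tags per set, least recently used first
    results = []
    for addr in trace:
        set_index = (addr >> b) & ((1 << s) - 1)
        tag = addr >> (s + b)
        lines = sets[set_index]
        if tag in lines:
            results.append("hit")
            lines.remove(tag)            # re-appended below as most recently used
        else:
            results.append("miss")
            if len(lines) == E:
                lines.pop(0)             # evict the least recently used tag
        lines.append(tag)
    return results

print(simulate_set_associative([0, 1, 7, 8, 0], s=1, b=1, E=2))
# ['miss', 'hit', 'miss', 'miss', 'hit']
```

With two lines per set, M[0-1] and M[8-9] coexist in set 0, so the final read of address 0 hits where the direct-mapped cache missed.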

Cache Organization Types

The "geometry" of the cache is defined by:
S = 2^s   the number of sets in the cache
E = 2^e   the number of lines (blocks) in a set
B = 2^b   the number of bytes in a line (block)

E = 1 (e = 0): direct-mapped cache
    only one possible location in cache for each DRAM block

S > 1, E = K > 1: K-way associative cache
    K possible locations (in same cache set) for each DRAM block

S = 1 (only one set), E = # of cache blocks: fully-associative cache
    each DRAM block can be at any location in the cache
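The three cases can be restated as a small classifier. This is only an illustrative sketch (the name `cache_type` is invented here); for the degenerate case S = 1 and E = 1 it arbitrarily reports direct-mapped.

```python
def cache_type(S, E):
    """Name the organization of a cache with S sets and E lines per set."""
    if E == 1:
        return "direct-mapped"          # one possible location per DRAM block
    if S == 1:
        return "fully-associative"      # a block may go in any line of the cache
    return f"{E}-way associative"       # E possible locations, all in one set

print(cache_type(S=8, E=1))    # direct-mapped
print(cache_type(S=8, E=2))    # 2-way associative
print(cache_type(S=1, E=16))   # fully-associative
```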

Searching a Set

If we have an associative cache (K-way or fully), how do we determine if a given DRAM block occurs within a set?

Compare the tag we're trying to match to all of the tags for blocks in the relevant set at the same time!

Then factor in the valid bits, also in parallel.

And employ a MUX
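Software can only imitate this parallel search, but the structure carries over. In the sketch below, each entry of `match` stands for one comparator output combined with the valid bit, and the final selection plays the role of the MUX. The line representation (dictionaries with `valid`, `tag`, and `data` keys) and the name `search_set` are assumptions made for illustration.

```python
def search_set(lines, tag):
    """Return (hit, data) for a tag lookup in one cache set."""
    # One comparator per line; in hardware these all evaluate at the same time.
    match = [line["valid"] and line["tag"] == tag for line in lines]
    hit = any(match)
    # The MUX uses the match signals to select the matching line's data.
    data = lines[match.index(True)]["data"] if hit else None
    return hit, data

two_way_set = [
    {"valid": True, "tag": 0b00, "data": "M[0-1]"},
    {"valid": True, "tag": 0b10, "data": "M[8-9]"},
]
print(search_set(two_way_set, 0b10))   # (True, 'M[8-9]')
print(search_set(two_way_set, 0b01))   # (False, None)
```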