10. Cache Memory (kuic.kyonggi.ac.kr/~dssung)

Chapter 10: Cache Memory

10.0 Cache = Cache Memory
- A cache is a fast memory located between the CPU and main memory.
- It holds a copy of part of the information in memory.
Without a cache: CPU <-> Memory
With a cache: CPU <-> Cache (SRAM) <-> Memory (DRAM)

Why does using a cache make a program faster?
-> It can also make it slower.
Assume Memory Access Time = 100 nsec and Cache Access Time = 25 nsec.
CPU <-> Cache <-> Memory: when the item is not in the cache,
Access Time = 100 nsec + 25 nsec = 125 nsec

10.1 Principle of Locality
(1). Temporal Locality (Locality in Time)
-> If an item is referenced, it will tend to be referenced again soon
-> ex)
    #include <stdio.h>

    int main(void) {
        int i, sum, mul;
        sum = 0;
        mul = 1;  /* start at 1; starting at 0 would keep the product at 0 */
        for (i = 1; i <= 10; i++) {
            sum = sum + i;
            mul = mul * i;
        }
        printf("sum is %d\n", sum);
        printf("mul is %d\n", mul);
        return 0;
    }

Example: temporal locality lowers the average access time
-> Reading the same item 10 times
Assume Memory Access Time = 100 nsec and Cache Access Time = 25 nsec.
Without cache (CPU <-> Memory):
  Total Access Time = 100 nsec * 10 = 1000 nsec
  Average Access Time = 100 nsec
With cache (CPU <-> Cache <-> Memory):
  Total Access Time = 100 nsec + 25 nsec * 10 = 350 nsec
  Average Access Time = 35 nsec
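The arithmetic of this slide can be sketched in C. This is only an illustration of the slide's cost model (one fill from memory, then every read served by the cache); the function names are hypothetical and not part of the lecture.

```c
/* Total time (nsec) to read the same item n times with no cache:
 * every read goes to memory. */
int total_without_cache_nsec(int n, int memory_nsec)
{
    return memory_nsec * n;
}

/* Total time (nsec) with a cache, following the slide's model:
 * one memory access to fill the cache, plus one cache access per read. */
int total_with_cache_nsec(int n, int cache_nsec, int memory_nsec)
{
    return memory_nsec + cache_nsec * n;
}

/* Average time per read. */
double average_nsec(int total_nsec, int n)
{
    return (double)total_nsec / n;
}
```

With n = 10, 25 nsec and 100 nsec these functions give the slide's totals of 1000 nsec and 350 nsec.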

(2). Spatial Locality (Locality in Space)
-> If an item is referenced, nearby items will tend to be referenced soon
-> ex)
    #include <stdio.h>

    int main(void) {
        int i, sum, mul;
        sum = 0;
        mul = 1;  /* start at 1; starting at 0 would keep the product at 0 */
        for (i = 1; i <= 10; i++) {
            sum = sum + i;
            mul = mul * i;
        }
        printf("sum is %d\n", sum);
        printf("mul is %d\n", mul);
        return 0;
    }

Example: spatial locality lowers the average access time
-> Reading 4 neighboring items (Memory -> Cache: fast page mode)
Assume Memory Access Time = 100 nsec and Cache Access Time = 25 nsec.
Without cache (CPU <-> Memory):
  Total Access Time = 100 nsec * 4 = 400 nsec
  Average Access Time = 100 nsec
With cache (CPU <-> Cache <-> Memory):
  Total Access Time = 100 nsec + 50 nsec * 3 + 25 nsec * 4 = 350 nsec
  Average Access Time = 87.5 nsec
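A small C sketch of the same calculation, assuming (as the slide's numbers suggest) that in fast page mode the first item of the block costs the full memory access time and each following item costs 50 nsec. The function and parameter names are hypothetical.

```c
/* Time (nsec) to copy a block of `items` neighboring items from memory
 * into the cache in fast page mode: full cost for the first item,
 * reduced cost for each of the remaining ones. */
int block_fill_nsec(int items, int first_nsec, int next_nsec)
{
    return first_nsec + next_nsec * (items - 1);
}

/* Total time (nsec) for the CPU to read all `items` items once:
 * the block fill plus one cache access per item. */
int spatial_total_nsec(int items, int first_nsec, int next_nsec, int cache_nsec)
{
    return block_fill_nsec(items, first_nsec, next_nsec) + cache_nsec * items;
}
```

For 4 items this gives 250 nsec for the fill and 350 nsec in total, that is 87.5 nsec per item.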

10.2 Hit Ratio
A hit is a memory access that is found in the cache, while a miss is an access that is not found in the cache.
Hit: CPU <-> Cache
Miss: CPU <-> Cache <-> Memory

Hit ratio is the fraction of memory accesses found in the cache, expressed here as a percentage.
Hit ratio = {(Hit number)/(Try number)} * 100
CPU <-> Memory (no cache): Hit ratio = 0
CPU <-> Cache <-> Memory: 0 < Hit ratio < 100
Every access found in the cache: Hit ratio = 100
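The formula can be written directly in C. This is a minimal sketch; the function name is hypothetical, and returning 0 when there were no accesses at all is an assumption made here, not something the slide defines.

```c
/* Hit ratio in percent: (hit number / try number) * 100. */
double hit_ratio_percent(unsigned hits, unsigned tries)
{
    if (tries == 0)
        return 0.0;  /* assumption: no accesses counts as 0 percent */
    return 100.0 * hits / tries;
}
```

For example, 3 hits out of 4 accesses gives 75 percent.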

10.3 Block Size in Cache
Block: the unit of information copied from memory to the cache when a miss occurs.
The larger the block, the fewer blocks fit in the cache; the smaller the block, the more blocks fit in the cache.
ex) cache size = 256 bytes
  block size: 16 bytes -> number of blocks in the cache = 16
  block size: 32 bytes -> number of blocks in the cache = 8

The block size should be chosen so as to raise the hit ratio. This is a very difficult problem.
-> A smaller block size means more blocks.
   Temporal Locality: good
   Spatial Locality: bad
-> A larger block size means fewer blocks.
   Temporal Locality: bad
   Spatial Locality: good

Cache = Set of Blocks
  Cache size is 256 bytes
  If block size is 16 bytes -> Cache = 16 blocks
  If block size is 32 bytes -> Cache = 8 blocks
Block = Set of Entries
  Block size is 16 bytes
  If 8 bit CPU -> Block = 16 entries
  If 16 bit CPU -> Block = 8 entries
Entry = Set of Bytes
  8 bit CPU -> Entry = 1 Byte
  16 bit CPU -> Entry = 2 Bytes
  32 bit CPU -> Entry = 4 Bytes
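The three relations above are plain divisions. A minimal C sketch, with hypothetical function names and assuming sizes that divide evenly:

```c
/* Number of blocks in a cache of cache_bytes with blocks of block_bytes. */
unsigned blocks_in_cache(unsigned cache_bytes, unsigned block_bytes)
{
    return cache_bytes / block_bytes;
}

/* Size of one entry in bytes for a CPU with the given data width in bits. */
unsigned entry_bytes(unsigned cpu_bits)
{
    return cpu_bits / 8;
}

/* Number of entries in one block. */
unsigned entries_in_block(unsigned block_bytes, unsigned cpu_bits)
{
    return block_bytes / entry_bytes(cpu_bits);
}
```

For the 256-byte cache with 16-byte blocks and an 8 bit CPU this gives 16 blocks of 16 one-byte entries.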

ex) 8 bit CPU, Cache size = 256 bytes, Block size = 16 bytes
Cache = 16 Blocks, Block = 16 Entries, Entry = 1 Byte
C block F (F0-FF)   C block E (E0-EF)   C block D (D0-DF)   C block C (C0-CF)
C block B (B0-BF)   C block A (A0-AF)   C block 9 (90-9F)   C block 8 (80-8F)
C block 7 (70-7F)   C block 6 (60-6F)   C block 5 (50-5F)   C block 4 (40-4F)
C block 3 (30-3F)   C block 2 (20-2F)   C block 1 (10-1F)   C block 0 (00-0F)

10.4 Block Placement
CPU <-> Cache <-> Memory
When a miss occurs, the corresponding block must be brought from memory into the cache. Where in the cache should it be placed?

A. Direct Mapped (= one-way set associative)
Each block has only one place it can appear in the cache.
[Figure: Cache and Memory]

B. Fully Associative
A block can be placed anywhere in the cache.
[Figure: Cache and Memory]

C. Set Associative
A block can be placed in a restricted set of places in the cache.
[Figure: Cache and Memory]

If there are n blocks in a set, the cache placement is called n-way set associative.
For example, when the cache holds 4 blocks:
- 1-way set associative = Direct mapped
- 2-way set associative
- 4-way set associative = Fully associative

ex) Main Memory => 64 Kbyte (0x0000 - 0xFFFF)
M block FFF (FFF0-FFFF)
M block FFE (FFE0-FFEF)
...
M block 01F (01F0-01FF)
...
M block 002 (0020-002F)
M block 001 (0010-001F)
M block 000 (0000-000F)

Memory Block 01F
A. Direct Mapped Method => can be placed only in Cache Block F
[Figure: cache blocks C block 0 (00-0F) to C block F (F0-FF) beside memory blocks M block 000 (0000-000F) to M block FFF (FFF0-FFFF), including M block 01F (01F0-01FF)]

Memory Block 01F = 0000 0001 1111
Number of cache sets = number of cache blocks = 16 -> expressed in 4 bits
The selected cache set (= cache block) is chosen by the lower 4 bits of the memory block number.
0000 0001 [1111] -> Memory Block 01F is placed in Cache Block F

Suppose the CPU wants the information at memory address 01F8.
If this information is not in the cache (miss), the memory block it belongs to must be copied into the corresponding cache block.
The memory block containing address 01F8 is block 01F.
Saying that it was placed in Cache Block F by the direct mapped method
-> means that Memory Block 01F has been copied into Cache Block F.

B. Fully Associative Method => can be placed in any of Cache Blocks 0 - F
[Figure: cache blocks C block 0 (00-0F) to C block F (F0-FF) beside memory blocks M block 000 (0000-000F) to M block FFF (FFF0-FFFF), including M block 01F (01F0-01FF)]

C. 2-Way Set Associative Method
[Figure: the 16 cache blocks, C block 0 (00-0F) to C block F (F0-FF), grouped into 8 sets, Set 0 to Set 7, with 2 blocks per set]

2-Way Set Associative Method
[Figure: the 8 cache sets, Set 0 to Set 7, beside memory blocks M block 000 (0000-000F) to M block FFF (FFF0-FFFF), including M block 01F (01F0-01FF)]

Memory Block 01F = 0000 0001 1111
Number of cache sets = 8 -> expressed in 3 bits
The selected cache set is chosen by the lower 3 bits of the memory block number.
0000 0001 1[111] -> Memory Block 01F is placed in Cache Set 7

D. 4-Way Set Associative Method
[Figure: the 16 cache blocks, C block 0 (00-0F) to C block F (F0-FF), grouped into 4 sets, Set 0 to Set 3, with 4 blocks per set]

4-Way Set Associative Method
[Figure: the 4 cache sets, Set 0 to Set 3, beside memory blocks M block 000 (0000-000F) to M block FFF (FFF0-FFFF), including M block 01F (01F0-01FF)]

Memory Block 01F = 0000 0001 1111
Number of cache sets = 4 -> expressed in 2 bits
The selected cache set is chosen by the lower 2 bits of the memory block number.
0000 0001 11[11] -> Memory Block 01F is placed in Cache Set 3
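The set selection used in the direct mapped, 2-way and 4-way examples follows one rule: take the memory block number modulo the number of sets. A minimal C sketch, assuming the number of cache blocks is a multiple of the associativity; the function names are hypothetical.

```c
/* Number of sets in a cache with cache_blocks blocks and `ways` blocks per set. */
unsigned num_sets(unsigned cache_blocks, unsigned ways)
{
    return cache_blocks / ways;
}

/* Set in which a memory block may be placed: the low bits of the
 * memory block number, i.e. the block number modulo the number of sets. */
unsigned set_index(unsigned mem_block, unsigned cache_blocks, unsigned ways)
{
    return mem_block % num_sets(cache_blocks, ways);
}
```

For memory block 0x01F and a 16-block cache this yields set F (1-way), set 7 (2-way), set 3 (4-way), and the single set 0 for the fully associative case (16-way).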

10.5 Block Identification
How can we check whether the information the CPU wants is in the cache?
Hit: CPU <-> Cache
Miss: CPU <-> Cache <-> Memory

Fully Associative Method => can be placed in any of Cache Blocks 0 - F
[Figure: cache blocks C block 0 (00-0F) to C block F (F0-FF) beside memory blocks M block 000 (0000-000F) to M block FFF (FFF0-FFFF), including M block 01F (01F0-01FF)]

How can we check whether the information the CPU wants is in the cache?
-> Use the tags kept in the cache.
[Figure: each cache block, C block 0 to C block F, has a Tag entry; one of the tags holds 0x01F. Memory blocks M block 000 to M block FFF are shown beside the cache.]
The cache tag shows that Memory Block 01F is in Cache Block B.

Fully Associative Method => can be placed in any of Cache Blocks 0 - F
To check whether the wanted information is in the cache, the CPU must, in the worst case, check every tag.

Direct Mapped Method => can be placed only in Cache Block F
[Figure: each cache block, C block 0 to C block F, has a Tag entry; the tag of C block F holds 0x01F. Memory blocks M block 000 to M block FFF are shown beside the cache.]
The cache tag shows that Memory Block 01F is in Cache Block F.

Direct Mapped Method => can be placed only in Cache Block F
To check whether the wanted information is in the cache, the CPU only has to examine the tag of one specific cache block.
ex) The CPU wants the information at address 0x01F8
-> It belongs to Memory Block 0x01F, which can be placed only in Cache Block F.
   Therefore only the tag of Cache Block F has to be examined.
With this method, the tag can be shortened as follows.

[Figure: same layout as before, but the tag of C block F now holds the shortened value 0x01. Cache blocks C block 0 to C block F are shown beside memory blocks M block 000 to M block FFF.]

Memory Block 0x01F = 0000 0001 1111
Number of cache sets = number of cache blocks = 16 -> expressed in 4 bits
The selected cache set (= cache block) is chosen by the lower 4 bits of the memory block number.
0000 0001 [1111] (index field) -> Memory Block 0x01F is placed in Cache Block F
In this case only the part that remains after removing the lower 4 bits needs to be stored as the tag.
-> [0000 0001] 1111 (Tag field)

CPU Address => [ Tag field ][ Index field ][ Block offset field ]
             = [ Memory block number ][ Block offset field ]
ex) CPU Address = 0x01F8, Direct Mapped Method
  Memory block number = 0x01F
  Block offset field = 0x8
  Tag field = 0x01
  Index field = 0xF
  Block offset field = 0x8

[Tag field] => Comparison with the tag in the cache
[Index field] => Select the set
[Block offset field] => Select the desired data

A. Fully Associative Method (m-way set associative)
ex) CPU Address = 0x01F8
  Tag field = 0x01F
  Index field = (none)
  Block offset field = 0x8
Memory Block Number = 0x01F = 0000 0001 1111
Number of cache sets = 1 -> expressed in 0 bits => index field is null
Therefore Tag field = 0000 0001 1111 = 0x01F

B. Direct Mapped Method (1-way set associative)
ex) CPU Address = 0x01F8
  Tag field = 0x01
  Index field = 0xF
  Block offset field = 0x8
Memory Block Number = 0x01F = 0000 0001 1111
Number of cache sets = 16 -> expressed in 4 bits => index field is 4 bits => 1111 = 0xF
Therefore Tag field = 0000 0001 = 0x01

C. 2-way Set Associative Method
ex) CPU Address = 0x01F8 = 0000 0001 1111 1000
  Tag field = 0000 0001 1
  Index field = 111
  Block offset field = 1000
Memory Block Number = 0x01F = 0000 0001 1111
Number of cache sets = 8 -> expressed in 3 bits => index field is 3 bits => 111
Therefore Tag field = 0000 0001 1

D. 4-way Set Associative Method
ex) CPU Address = 0x01F8 = 0000 0001 1111 1000
  Tag field = 0000 0001 11
  Index field = 11
  Block offset field = 1000
Memory Block Number = 0x01F = 0000 0001 1111
Number of cache sets = 4 -> expressed in 2 bits => index field is 2 bits => 11
Therefore Tag field = 0000 0001 11

E. 8-way Set Associative Method
ex) CPU Address = 0x01F8 = 0000 0001 1111 1000
  Tag field = 0000 0001 111
  Index field = 1
  Block offset field = 1000
Memory Block Number = 0x01F = 0000 0001 1111
Number of cache sets = 2 -> expressed in 1 bit => index field is 1 bit => 1
Therefore Tag field = 0000 0001 111

- The tag/index boundary moves to the right with increasing associativity.
  Memory block number 0x01F, written as tag | index:
  Direct Mapped Method (1-way set associative):        0000 0001 | 1111
  2-way Set Associative Method:                        0000 0001 1 | 111
  4-way Set Associative Method:                        0000 0001 11 | 11
  8-way Set Associative Method:                        0000 0001 111 | 1
  Fully Associative Method (m-way set associative):    0000 0001 1111 | (no index)
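The field splits worked out for 0x01F8 can be reproduced with shifts and masks. The sketch below assumes the running example of these slides (16-bit address, 16-byte blocks, 16 cache blocks); `struct fields`, `bits_for` and `split_address` are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

struct fields {
    uint16_t tag;        /* compared with the tag stored in the cache */
    uint16_t index;      /* selects the set */
    uint16_t offset;     /* selects the data inside the block */
    unsigned index_bits; /* width of the index field */
};

/* Number of bits needed to number x items (x is a power of two). */
static unsigned bits_for(unsigned x)
{
    unsigned bits = 0;
    while (x > 1) {
        x >>= 1;
        bits++;
    }
    return bits;
}

/* Split a CPU address for a cache with 16 blocks of 16 bytes and the
 * given associativity (1, 2, 4, 8 or 16 ways). */
struct fields split_address(uint16_t addr, unsigned ways)
{
    const unsigned cache_blocks = 16; /* from the slides' example */
    const unsigned offset_bits = 4;   /* 16-byte block */
    struct fields f;

    f.index_bits = bits_for(cache_blocks / ways);
    f.offset = addr & ((1u << offset_bits) - 1);
    f.index = (addr >> offset_bits) & ((1u << f.index_bits) - 1);
    f.tag = addr >> (offset_bits + f.index_bits);
    return f;
}
```

As the associativity grows, `index_bits` shrinks and the tag absorbs the freed bits, which is the boundary shift described above.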

- Caches include an address tag on each block
- The cache tag is checked to see if it matches the tag field of the CPU address
- Because speed is of the essence, all possible tags are searched in parallel
- Valid bit in the tag: indicates whether or not this entry contains a valid address
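The check described in these points can be sketched in C as a search over the blocks of one set. Hardware compares all candidate tags in parallel; the loop below is only a sequential software model. `struct line` and `find_way` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

struct line {
    bool valid;   /* valid bit: does this entry hold a valid address? */
    uint16_t tag; /* address tag stored with the block */
};

/* Look for `tag` among the `ways` blocks of one set.
 * Returns the position of the matching block on a hit, or -1 on a miss.
 * An entry whose valid bit is clear never matches. */
int find_way(const struct line *set, int ways, uint16_t tag)
{
    for (int w = 0; w < ways; w++) {
        if (set[w].valid && set[w].tag == tag)
            return w;
    }
    return -1;
}
```

With `ways` equal to 1 this models a direct mapped lookup, and with `ways` equal to the number of cache blocks a fully associative one.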

A. Fully Associative Method (m-way set associative)
ex) CPU Address = 0x01F8: Tag field = 0x01F, no Index field (1 set, 0 bits), Block offset field = 0x8

Fully Associative Method
[Figure: the CPU address 0x01F8 is checked against the valid bit and tag of every cache block, C block 0 (00-0F) to C block F (F0-FF). Valid bits shown: 1 0 0 1 1 1 1 1. Tags shown: 0x234, 0x000, 0x01C, 0x01E, 0x553, 0x000, 0x766, 0x023, 0x011, 0x126, 0x123, 0x222, 0x111, 0x01F.]

B. Direct Mapped Method (1-way set associative)
ex) CPU Address = 0x01F8: Tag field = 0x01, Index field = 0xF (16 sets, 4 bits), Block offset field = 0x8

Direct Mapped Method
[Figure: the index field of CPU address 0x01F8 selects one cache block among C block 0 (00-0F) to C block F (F0-FF); its valid bit and tag are checked. Valid bits shown: 1 0 0 1 1 1 1 1. Tags shown: 0x01, 0x00, 0x01, 0x55, 0x00, 0x76, 0x02, 0x01, 0x12, 0x22, 0x11, 0x01.]

C. 2-way Set Associative Method
ex) CPU Address = 0x01F8 = 0000 0001 1111 1000: Tag field = 0000 0001 1, Index field = 111 (8 sets, 3 bits), Block offset field = 1000

2-way Set Associative Method
[Figure: CPU address 0x01F8 = 0000 0001 1111 1000 checked against the cache blocks C block 0 (00-0F) to C block F (F0-FF), each with a valid bit and a 9-bit binary tag.]

10.6 Block Replacement
[Figure: CPU, Cache, Memory]

Direct Mapped Placement => Simple (the block can go in only one place, so there is no choice of which block to replace)
[Figure: CPU, Cache, Memory]

Fully or Set Associative Placement
a. Random
b. LRU (Least Recently Used)
[Figure: CPU, Cache, Memory]
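The two replacement choices can be sketched as victim selection functions in C. This is a software illustration only: keeping a per-block time stamp of the last access is one assumed way to track recency, and the function names are hypothetical.

```c
#include <stdlib.h>

/* a. Random: pick any block of the set as the victim. */
int random_victim(int ways)
{
    return rand() % ways;
}

/* b. LRU: pick the block whose last access is the oldest.
 * last_used[w] holds the time stamp of the most recent access to block w. */
int lru_victim(const unsigned *last_used, int ways)
{
    int victim = 0;
    for (int w = 1; w < ways; w++) {
        if (last_used[w] < last_used[victim])
            victim = w;
    }
    return victim;
}
```

For time stamps {5, 2, 9, 7} the LRU victim is block 1, the one accessed longest ago.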

10.7 Read Strategy
- Parallel Tags Matching
- Parallel Reading of Cache and Memory

Fully Associative Method: Parallel Tags Matching
[Figure: the CPU address 0x01F8 is compared with the tags of cache blocks C block 0 (00-0F) to C block F (F0-FF) at the same time. Tags shown: 0x234, 0x987, 0xABC, 0x01E, 0x553, 0x922, 0x812, 0x766, 0x023, 0x011, 0x126, 0x123, 0x222, 0x111, 0x01F.]

Parallel Reading of Cache and Memory
[Figure: the CPU issues address 0x01F8; the same cache blocks and tag list as on the previous slide are shown.]

Parallel Reading of Cache and Memory: tag hit
[Figure: CPU address 0x01F8; the tag list contains 0x01F, so the tag comparison is a hit.]

Parallel Reading of Cache and Memory: tag miss
[Figure: CPU address 0x01F8; no tag in the list equals 0x01F (the entry that held 0x01F on the hit slide now holds 0x01A), so the tag comparison is a miss.]

Parallel Reading of Cache and Memory: miss
[Figure: CPU address 0x01F8 after the miss; the tag list now contains 0x01F in the entry that previously held 0x766.]

Program
- Instruction = Code: Op code field + Operand field; Read Only
- Data: Operand; Read/Write
ex) INC $1000 <- Instruction
    contents of address $1000 <- Data

Cache
Code Cache
- Caches code only
- Read Only
- Read Hit, Read Miss
Data Cache
- Caches data only
- Read/Write
- Read Hit, Read Miss, Write Hit, Write Miss
Code/Data Cache
- Caches both code and data
- Read/Write
- Read Hit, Read Miss, Write Hit, Write Miss

10.8 Write Strategy
(1). Write Hit
-> The information in the cache and the information in memory can become different.
[Figure: CPU, Cache, Memory]

a. Write Through
-> The information is written to both the block in the cache and the block in the lower-level memory
-> Advantage: the cache and memory are kept consistent
[Figure: write hit; CPU, Cache, Memory]

[Figure: write hit, continued; CPU, Cache, Memory]

b. Write Back
-> The information is written only to the block in the cache. The modified cache block is written to main memory only when it is replaced.
-> Uses less memory bandwidth (multiple writes to a block => one write to the lower-level memory)
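The difference in memory traffic between the two write-hit policies can be modeled in a few lines of C. This is a counting sketch only, not a cache implementation; the `modified` flag stands for the "modified cache block" mentioned above, and all names are hypothetical.

```c
#include <stdbool.h>

struct wline {
    bool modified;          /* block differs from its copy in memory */
    unsigned memory_writes; /* writes that reached the lower-level memory */
};

/* Write through: every write hit also goes to memory. */
void write_hit_through(struct wline *l)
{
    l->memory_writes++;
}

/* Write back: a write hit only updates the cache and marks the block. */
void write_hit_back(struct wline *l)
{
    l->modified = true;
}

/* Write back: on replacement, a modified block is written to memory once. */
void replace_block(struct wline *l)
{
    if (l->modified) {
        l->memory_writes++;
        l->modified = false;
    }
}
```

After five write hits to the same block, the write-through counter is 5, while the write-back counter stays at 0 until the block is replaced and then becomes 1.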

[Figure: write back, with the steps "Write hit" and "Replacement"; CPU, Cache, Memory]

[Figure: write back, with the steps "Write hit", "Replacement" and "Write hit"; CPU, Cache, Memory]

(2). Write Miss
a. Write allocate: Prefetch on write
-> The block is loaded into the cache
b. No write allocate: Write around
-> The block is not loaded into the cache

Write allocate
[Figure: on a write miss, with the steps "miss", "read" (Memory -> Cache) and "write" (CPU -> Cache); CPU, Cache, Memory]

No write allocate
[Figure: on a write miss; CPU, Cache, Memory]

- Write-back caches generally use write allocate (hoping that subsequent writes to that block will be captured by the cache)
- Write-through caches often use no-write allocate (since subsequent writes to that block will still have to go to memory)
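The reasoning behind pairing write back with write allocate can be illustrated by counting how many of a run of writes to one block are captured by the cache when the first write misses. A minimal C sketch under that single-block assumption, with no replacement in between; the names are hypothetical.

```c
#include <stdbool.h>

enum write_miss_policy { WRITE_ALLOCATE, NO_WRITE_ALLOCATE };

/* Count the write hits among `writes` consecutive writes to one block
 * that is not in the cache at the start. */
unsigned cache_write_hits(enum write_miss_policy policy, unsigned writes)
{
    bool in_cache = false;
    unsigned hits = 0;

    for (unsigned i = 0; i < writes; i++) {
        if (in_cache)
            hits++;              /* captured by the cache */
        else if (policy == WRITE_ALLOCATE)
            in_cache = true;     /* miss: block is loaded into the cache */
        /* NO_WRITE_ALLOCATE: miss, write goes around the cache */
    }
    return hits;
}
```

With write allocate, four of five writes hit; with no write allocate, none do, and every write goes to memory.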