Notes on: Cache Comparison Problem

A 450 MHz Pentium with 32 KBytes of L1 cache, 128 MBytes of RAM, and a 133 MHz system bus runs a program with an average working set size of 80 KBytes. While in a working set, the program has a 0.9997 probability that the next memory request will be from this working set and a 0.9 probability that the next memory request will be the next instruction/data value in memory (i.e. 10% of the time a request is from a random memory address in the working set). (Note: when the program changes working sets, it will begin making memory requests from the new working set with 0.9997 probability.)

(1) Determine how much (if any) performance improvement could be achieved by adding a 256 KByte L2 cache (access speed = 450/2 MHz) to the processor.
(2) Determine what size memory blocks should be moved between cache and RAM.
(3) Give an outline of a memory caching strategy that makes sense.

It is strongly recommended that you take the time to investigate the details of cache memory, especially the operation of the cache controller chip-set. As a guide, try to answer the following questions about the operation of the Pentium II/III and the related cache controller:

When there is a cache miss, how many words of memory does the CPU need to be able to continue processing?
When the cache memory is replaced, how many words are transferred? (Assume 4 K.)
What hardware component is responsible for the transfer of blocks of memory to cache?
Given the stats listed in the problem, what will happen to the hit ratio while a new block of memory is being loaded into cache?
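As a starting point for the timing questions, the clock rates in the problem can be turned into per-access times, and the two probabilities into average run lengths. This is only a rough sketch: it assumes one access per clock cycle at each level (real RAM accesses take several bus cycles), and the names `cycle_time_ns`, `L1_ACC`, `L2_ACC` and `RAM_ACC` are illustrative, not part of the problem statement.

```python
def cycle_time_ns(freq_mhz):
    """Length of one clock cycle in nanoseconds for a clock given in MHz."""
    return 1000.0 / freq_mhz

# ASSUMPTION: one access costs exactly one cycle of the relevant clock.
L1_ACC = cycle_time_ns(450)       # L1 at core speed, about 2.22 ns
L2_ACC = cycle_time_ns(450 / 2)   # L2 at half core speed, about 4.44 ns
RAM_ACC = cycle_time_ns(133)      # RAM at system bus speed, about 7.52 ns

# Mean of a geometric distribution: on average 1/p requests pass before an
# event of probability p first occurs.
P_WORKING_SET = 0.9997
P_SEQUENTIAL = 0.9
requests_per_working_set = 1.0 / (1.0 - P_WORKING_SET)  # about 3333 requests
sequential_run_length = 1.0 / (1.0 - P_SEQUENTIAL)      # about 10 requests

if __name__ == "__main__":
    print(f"L1  access: {L1_ACC:.2f} ns")
    print(f"L2  access: {L2_ACC:.2f} ns")
    print(f"RAM access: {RAM_ACC:.2f} ns")
    print(f"requests per working set: {requests_per_working_set:.0f}")
    print(f"mean sequential run: {sequential_run_length:.0f}")
```

Under these assumptions the program stays in one working set for a few thousand requests on average, and a sequential run lasts about ten requests before a random jump.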

Flowchart for L1-Only Cache Analysis

Start
Decision: Is the memory access at the next address? (yes / no)
Decision: Is the memory access in the working set? (yes / no)
Decision: Is the memory access in cache? (yes / no)
Process: tacc = tacc + L1acc; num = num + 1
Process (simulates block read for 4-way set-associative L1 cache): tacc = tacc + block_size(0.75 L1acc + 0.25 RAMacc); num = num + block_size
Decision: Is num >= numtotal? (yes / no)
On yes: tavg = tacc/num
Stop
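The L1-only flowchart can be approximated by a small Monte Carlo loop. The sketch below keeps the two cost formulas from the flowchart but makes its own assumptions about how the decisions are wired: a request outside the working set always misses, a sequential request always hits, and a random request inside the working set hits with probability 32/80 (cache size over working set size). The function name, the default `block_size` of 8 words, and the one-cycle access times are all hypothetical.

```python
import random

def simulate_l1_only(num_total=100_000, block_size=8,
                     l1_acc=1000 / 450, ram_acc=1000 / 133,
                     p_ws=0.9997, p_seq=0.9,
                     hit_if_random=32 / 80, seed=0):
    """Return the average access time (ns) for an L1-only system."""
    rng = random.Random(seed)   # seeded so repeated runs match
    tacc = 0.0                  # accumulated access time
    num = 0                     # number of accesses accounted for
    while num < num_total:
        in_working_set = rng.random() < p_ws
        sequential = rng.random() < p_seq
        # ASSUMPTION: sequential requests hit; random ones hit with a
        # probability equal to the fraction of the working set in cache.
        hit = in_working_set and (sequential or rng.random() < hit_if_random)
        if hit:
            tacc += l1_acc
            num += 1
        else:
            # Block read, costed as in the flowchart.
            tacc += block_size * (0.75 * l1_acc + 0.25 * ram_acc)
            num += block_size
    return tacc / num

if __name__ == "__main__":
    print(f"L1 only: tavg = {simulate_l1_only():.3f} ns")
```

Varying `block_size` in this loop is one way to explore question (2), although the hit model here does not itself depend on the block size, so a more careful model would be needed to find a meaningful optimum.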

Flowchart for L1/L2 Cache Analysis

Start
Decision: Is the memory access at the next address? (yes / no)
Decision: Is the memory access in the working set? (yes / no)
Decision: Which cache? (L1 / L2)
Process (L1): tacc = tacc + L1acc; num = num + 1
Process (simulates block read for 4-way set-associative L2 cache): tacc = tacc + block_size(0.75 L2acc + 0.25 RAMacc); num = num + block_size
Decision: Is num >= numtotal? (yes / no)
On yes: tavg = tacc/num
Stop

You may assume a write-through operation between L1 and L2 cache (i.e. no additional delay).
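A matching sketch for the L1/L2 case follows. It reuses the block-read cost from the flowchart and adds several assumptions that the slide does not state: because the 256 KByte L2 is larger than the 80 KByte working set, any in-working-set request that misses L1 is treated as an L2 hit costing `l2_acc`; requests outside the working set trigger the block read; and write-through between L1 and L2 adds no delay, as the slide allows. All names and defaults are hypothetical.

```python
import random

def simulate_l1_l2(num_total=100_000, block_size=8,
                   l1_acc=1000 / 450, l2_acc=1000 / 225,
                   ram_acc=1000 / 133,
                   p_ws=0.9997, p_seq=0.9,
                   l1_hit_if_random=32 / 80, seed=0):
    """Return the average access time (ns) for an L1 + L2 system."""
    rng = random.Random(seed)
    tacc = 0.0
    num = 0
    while num < num_total:
        if rng.random() >= p_ws:
            # Request left the working set: block read through L2,
            # costed as in the flowchart.
            tacc += block_size * (0.75 * l2_acc + 0.25 * ram_acc)
            num += block_size
        elif rng.random() < p_seq or rng.random() < l1_hit_if_random:
            # Served by L1 (ASSUMPTION: same hit model as the L1-only sketch).
            tacc += l1_acc
            num += 1
        else:
            # ASSUMPTION: the whole working set fits in L2, so this is an L2 hit.
            tacc += l2_acc
            num += 1
    return tacc / num

if __name__ == "__main__":
    print(f"L1 + L2: tavg = {simulate_l1_l2():.3f} ns")
```

Comparing this average with the L1-only average under the same parameters gives one estimate of the improvement asked for in question (1); the result depends heavily on the assumed access times and hit model.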