Chapter 7 - Memory
7-1 Computer Architecture and Organization
Miles Murdocca and Vincent Heuring
Chapter 7 – Memory
Computer Architecture and Organization by M. Murdocca and V. Heuring © 2007 M. Murdocca and V. Heuring

7-2 Chapter Contents
7.1 The Memory Hierarchy
7.2 Random-Access Memory
7.3 Memory Chip Organization
7.4 Case Study: Rambus Memory
7.5 Cache Memory
7.6 Virtual Memory
7.7 Advanced Topics
7.8 Case Study: Associative Memory in Routers
7.9 Case Study: The Intel Pentium 4 Memory System

7-4 Functional Behavior of a RAM Cell
Static RAM cell (a) and dynamic RAM cell (b).

7-9 Two Four-Word by Four-Bit RAMs are Used in Creating a Four-Word by Eight-Bit RAM

7-11 Dual In-Line Memory Module
• 256 MB dual in-line memory module (DIMM) organized for a 64-bit word with sixteen 16M × 8-bit RAM chips (eight chips on each side of the DIMM).
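
The capacity figure on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes "M" means 2^20 locations and "MB" means 2^20 bytes, and the constant names are made up for the illustration.

```python
# Arithmetic check of the 256 MB DIMM organization described above.
# Assumption: 16M = 16 * 2**20 locations per chip, and 1 MB = 2**20 bytes.
CHIPS = 16
LOCATIONS_PER_CHIP = 16 * 2**20   # "16M"
BITS_PER_LOCATION = 8             # each chip is 8 bits wide
MODULE_WORD_BITS = 64             # the module presents a 64-bit word

# Chips that must be read side by side to form one 64-bit word.
chips_per_word = MODULE_WORD_BITS // BITS_PER_LOCATION

# Number of such groups on the module.
groups = CHIPS // chips_per_word

capacity_bytes = CHIPS * LOCATIONS_PER_CHIP * BITS_PER_LOCATION // 8
capacity_mb = capacity_bytes // 2**20

print(chips_per_word, groups, capacity_mb)  # 8 2 256
```

Eight chips side by side supply one 64-bit word, so the sixteen chips form two groups of eight, which matches the eight chips per side mentioned on the slide if each side holds one group.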

7-12 Dual In-Line Memory Module
• Schematic diagram of 256 MB dual in-line memory module. (Source: adapted from http://wwws.ti.com/sc/ds/tm4en64kpu.pdf.)

7-15 Flash Memory
• (a) External view of flash memory module and (b) flash module internals. (Source: adapted from HowStuffWorks.com.)

7-16 Cell Structure for Flash Memory
• Current flows from source to drain when a sufficient negative charge is placed on the dielectric material, preventing current flow through the word line. This is the logical 0 state. When the dielectric material is not charged, current flows between the bit and word lines, which is the logical 1 state.

7-18 Rambus Memory
• Rambus technology on the Nintendo 64 motherboard (left) enables cost savings over the conventional Sega Saturn motherboard design (right).
• Nintendo 64 game console:

7-19 Placement of Cache Memory in a Computer System
• The locality principle: a recently referenced memory location is likely to be referenced again (temporal locality); a neighbor of a recently referenced memory location is likely to be referenced (spatial locality).

7-21 Associative Mapping Example
• Consider how an access to memory location (A035F014)_16 is mapped to the cache for a 2^32 word memory. The memory is divided into 2^27 blocks of 2^5 = 32 words per block, and the cache consists of 2^14 slots:
• If the addressed word is in the cache, it will be found in word (14)_16 of a slot that has tag (501AF80)_16, which is made up of the 27 most significant bits of the address. If the addressed word is not in the cache, then the block corresponding to tag field (501AF80)_16 is brought into an available slot in the cache from the main memory, and the memory reference is then satisfied from the cache.
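
The field split in this example can be reproduced with shifts and masks. The sketch below assumes a 32-bit word address with a 5-bit word field, as stated on the slide; the function name is invented for the illustration.

```python
# Split a 32-bit word address into (tag, word) for associative mapping.
# 2^5 = 32 words per block, so the low 5 bits select the word and the
# remaining 27 bits form the tag.
WORD_BITS = 5

def split_associative(address):
    """Return (tag, word) for an associative-mapped cache (illustrative helper)."""
    word = address & ((1 << WORD_BITS) - 1)
    tag = address >> WORD_BITS
    return tag, word

tag, word = split_associative(0xA035F014)
print(f"tag = {tag:07X}, word = {word:02X}")  # tag = 501AF80, word = 14
```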

7-22 Associative Mapping Area Allocation
• Area allocation for associative mapping scheme based on bits stored:

7-23 Replacement Policies
• When there are no available slots in which to place a block, a replacement policy is implemented. The replacement policy governs the choice of which slot is freed up for the new block.
• Replacement policies are used for associative and set-associative mapping schemes, and also for virtual memory.
• Least recently used (LRU)
• First-in/first-out (FIFO)
• Least frequently used (LFU)
• Random
• Optimal (used for analysis only – look backward in time and reverse-engineer the best possible strategy for a particular sequence of memory references.)

7-25 Direct Mapping Example
• For a direct mapped cache, each main memory block can be mapped to only one slot, but each slot can receive more than one block. Consider how an access to memory location (A035F014)_16 is mapped to the cache for a 2^32 word memory. The memory is divided into 2^27 blocks of 2^5 = 32 words per block, and the cache consists of 2^14 slots:
• If the addressed word is in the cache, it will be found in word (14)_16 of slot (2F80)_16, which will have a tag of (1406)_16.
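
For direct mapping the same address is cut into three fields. The sketch below assumes the layout implied by the slide (13-bit tag, 14-bit slot, 5-bit word); the helper name is hypothetical.

```python
# Split a 32-bit word address into (tag, slot, word) for direct mapping.
# 2^14 slots -> 14-bit slot field; 2^5 words per block -> 5-bit word field;
# the remaining 32 - 14 - 5 = 13 bits are the tag.
WORD_BITS = 5
SLOT_BITS = 14

def split_direct(address):
    """Return (tag, slot, word) for a direct-mapped cache (illustrative helper)."""
    word = address & ((1 << WORD_BITS) - 1)
    slot = (address >> WORD_BITS) & ((1 << SLOT_BITS) - 1)
    tag = address >> (WORD_BITS + SLOT_BITS)
    return tag, slot, word

tag, slot, word = split_direct(0xA035F014)
print(f"tag = {tag:04X}, slot = {slot:04X}, word = {word:02X}")
# tag = 1406, slot = 2F80, word = 14
```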

7-26 Direct Mapping Area Allocation
• Area allocation for direct mapping scheme based on bits stored:

7-28 Set-Associative Mapping Example
• Consider how an access to memory location (A035F014)_16 is mapped to the cache for a 2^32 word memory. The memory is divided into 2^27 blocks of 2^5 = 32 words per block, there are two blocks per set, and the cache consists of 2^14 slots:
• The leftmost 14 bits form the tag field, followed by 13 bits for the set field, followed by five bits for the word field:
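
The field values for this address are not preserved in the slide text, so the sketch below computes them from the stated widths (14-bit tag, 13-bit set, 5-bit word). The helper name is made up, and the printed values are derived here rather than taken from the slide.

```python
# Split a 32-bit word address into (tag, set, word) for two-way
# set-associative mapping: 2^14 slots / 2 blocks per set = 2^13 sets.
WORD_BITS = 5
SET_BITS = 13

def split_set_associative(address):
    """Return (tag, set_index, word) (illustrative helper)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_index, word

tag, set_index, word = split_set_associative(0xA035F014)
print(f"tag = {tag:04X}, set = {set_index:04X}, word = {word:02X}")
# tag = 280D, set = 0F80, word = 14

# Recombining the fields gives back the original address.
rebuilt = (tag << (WORD_BITS + SET_BITS)) | (set_index << WORD_BITS) | word
print(f"{rebuilt:08X}")  # A035F014
```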

7-29 Set Associative Mapping Area Allocation
• Area allocation for set associative mapping scheme based on bits stored:

7-31 Hit Ratios and Effective Access Times
• Hit ratio and effective access time for single level cache:
• Hit ratios and effective access time for multi-level cache:
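
The equations for this slide did not survive in the transcript. The block below is a reconstruction using the usual definitions, written to agree with the verbal definitions of H1 and H2 on the multi-level cache slides. The symbols (N for counts, T for times) are notation chosen here, not necessarily the authors'.

```latex
% Single-level cache
\[
H = \frac{N_{\text{hit}}}{N_{\text{total}}},
\qquad
T_{\text{EFF}} = \frac{N_{\text{hit}}\,T_{\text{hit}} + N_{\text{miss}}\,T_{\text{miss}}}{N_{\text{total}}}
\]

% Two-level cache: H_1 is measured against all accesses,
% H_2 against the accesses that reach L2 (the L1 misses).
\[
H_1 = \frac{N_{\text{L1 hit}}}{N_{\text{total}}},
\qquad
H_2 = \frac{N_{\text{L2 hit}}}{N_{\text{total}} - N_{\text{L1 hit}}}
\]
\[
T_{\text{EFF}} = \frac{N_{\text{L1 hit}}\,T_{\text{L1 hit}} + N_{\text{L2 hit}}\,T_{\text{L2 hit}} + N_{\text{L2 miss}}\,T_{\text{L2 miss}}}{N_{\text{total}}}
\]
```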

7-32 Direct Mapped Cache Example
• Compute hit ratio and effective access time for a program that executes from memory locations 48 to 95, and then loops 10 times from 15 to 31.
• The direct mapped cache has four 16-word slots, a hit time of 80 ns, and a miss time of 2500 ns. Load-through is used. The cache is initially empty.
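
A short simulation can work this example through. The sketch makes three assumptions that the slide does not spell out: a block maps to slot (block number mod 4), "loops 10 times from 15 to 31" means locations 15 through 31 are each referenced once per iteration, and the 2500 ns miss time is the whole cost of a missing reference under load-through.

```python
# Simulate the direct-mapped cache example: four 16-word slots,
# 80 ns per hit, 2500 ns per miss, cache initially empty.
SLOTS = 4
WORDS_PER_SLOT = 16
HIT_NS = 80
MISS_NS = 2500

def simulate(addresses):
    """Return (hits, misses) for a sequence of word addresses."""
    cache = [None] * SLOTS            # block number currently held by each slot
    hits = misses = 0
    for address in addresses:
        block = address // WORDS_PER_SLOT
        slot = block % SLOTS          # assumed direct-mapping rule
        if cache[slot] == block:
            hits += 1
        else:
            misses += 1
            cache[slot] = block       # load the block into its slot
    return hits, misses

# Locations 48..95 once, then locations 15..31 ten times.
trace = list(range(48, 96)) + list(range(15, 32)) * 10
hits, misses = simulate(trace)
total = hits + misses
hit_ratio = hits / total
t_eff = (hits * HIT_NS + misses * MISS_NS) / total

print(hits, misses)                   # 213 5
print(f"{hit_ratio:.1%}")             # 97.7%
print(f"{t_eff:.1f} ns")              # 135.5 ns
```

Under these assumptions the only misses are the first reference to each of the three blocks covering 48 to 95 and the two misses in the first loop iteration, when blocks 0 and 1 displace the blocks in slots 0 and 1.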

7-35 Multi-level Cache Memory
As an example, consider a two-level cache in which the L1 hit time is 5 ns, the L2 hit time is 20 ns, and the L2 miss time is 100 ns. There are 10,000 memory references, of which 10 cause L2 misses and 90 cause L1 misses. Compute the hit ratios of the L1 and L2 caches and the overall effective access time.
H1 is the ratio of the number of times the accessed word is in the L1 cache to the total number of memory accesses. There are a total of 100 references that miss in L1 (90 that are then found in L2, plus 10 that also miss in L2), and so:
H1 = (10,000 − 100) / 10,000 = 99%
(Continued on next slide.)

7-36 Multi-level Cache Memory (Cont’)
H2 is the ratio of the number of times the accessed word is in the L2 cache to the number of times the L2 cache is accessed. The L2 cache is accessed 100 times (once per L1 miss) and supplies the word in 90 of them, and so:
H2 = 90 / 100 = 90%
The effective access time is then:
T_EFF = (9,900 × 5 ns + 90 × 20 ns + 10 × 100 ns) / 10,000 = 5.23 ns per access
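
The same calculation can be written as a small function. It reads the problem statement as 90 references that miss L1 but hit L2, plus 10 that miss both levels, which is the reading that reproduces the 5.23 ns result. The function and parameter names are illustrative.

```python
# Two-level cache hit ratios and effective access time.
def two_level_stats(total, l2_hits, l2_misses, t_l1, t_l2, t_l2_miss):
    """Return (h1, h2, t_eff); times are in ns (illustrative helper)."""
    l1_misses = l2_hits + l2_misses       # every L1 miss goes to L2
    l1_hits = total - l1_misses
    h1 = l1_hits / total
    h2 = l2_hits / l1_misses              # relative to L2 accesses only
    t_eff = (l1_hits * t_l1 + l2_hits * t_l2 + l2_misses * t_l2_miss) / total
    return h1, h2, t_eff

h1, h2, t_eff = two_level_stats(
    total=10_000, l2_hits=90, l2_misses=10, t_l1=5, t_l2=20, t_l2_miss=100
)
print(f"H1 = {h1:.0%}, H2 = {h2:.0%}, T_EFF = {t_eff:.2f} ns")
# H1 = 99%, H2 = 90%, T_EFF = 5.23 ns
```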

7-37 Neat Little LRU Algorithm
• A sequence is shown for the Neat Little LRU Algorithm for a cache with four slots. Main memory blocks are accessed in the sequence: 0, 2, 3, 1, 5, 4.
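
The figure for this slide is not in the transcript. The sketch below assumes the algorithm is the familiar bit-matrix form of LRU: on each access to slot i, row i is set to all 1s and then column i is cleared to 0s, and the least recently used slot is the one whose row is all 0s. It also assumes that blocks fill empty slots in order of arrival. The class and method names are invented for the illustration.

```python
# Bit-matrix LRU for a small fully associative cache (assumed form of
# the "Neat Little LRU Algorithm").
class MatrixLRU:
    def __init__(self, n_slots):
        self.n = n_slots
        self.slots = [None] * n_slots                  # block held by each slot
        self.matrix = [[0] * n_slots for _ in range(n_slots)]

    def _touch(self, i):
        """Record a reference to slot i: set row i to 1s, then clear column i."""
        self.matrix[i] = [1] * self.n
        for row in self.matrix:
            row[i] = 0

    def _victim(self):
        """The least recently used slot is the one whose row is all 0s."""
        for i, row in enumerate(self.matrix):
            if not any(row):
                return i
        raise RuntimeError("no all-zero row found")

    def access(self, block):
        """Reference a block and return the slot that now holds it."""
        if block in self.slots:
            i = self.slots.index(block)                # hit
        elif None in self.slots:
            i = self.slots.index(None)                 # fill an empty slot
        else:
            i = self._victim()                         # replace the LRU block
        self.slots[i] = block
        self._touch(i)
        return i

cache = MatrixLRU(4)
for block in [0, 2, 3, 1, 5, 4]:
    cache.access(block)
print(cache.slots)  # [5, 4, 3, 1]
```

With this sequence, block 5 replaces block 0 and block 4 replaces block 2, which are the two least recently used blocks at those points.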

7-38 Cache Coherency
• The goal of cache coherence is to ensure that every cache sees the same value for a referenced location, which means making sure that any shared operand that is changed is updated throughout the system.
• This brings us to the issue of false sharing, which reduces cache performance when two operands that are not shared between processes share the same cache line. The situation is shown below. The problem is that each process will invalidate the other’s cache line when writing data without a real need, unless the compiler prevents this.

7-39 Overlays
• A partition graph for a program with a main routine and three subroutines:

7-40 Virtual Memory
• Virtual memory is stored in a hard disk image. The physical memory holds a small number of virtual pages in physical page frames.
• A mapping between a virtual and a physical memory:

7-41 Page Table
• The page table maps between virtual memory and physical memory.

7-42 Using the Page Table
• A virtual address is translated into a physical address:
• Typical page table entry:
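
The translation step can be sketched in a few lines. The slide's figure is missing from the transcript, so every specific here is an assumption made for illustration: a page size of 1024 words, a table of eight entries, and a page table entry reduced to a present bit and a frame number.

```python
# Sketch of virtual-to-physical address translation through a page table.
from dataclasses import dataclass

PAGE_SIZE = 1024          # hypothetical page size, in words

@dataclass
class PageTableEntry:
    present: bool = False  # is the page currently in a physical page frame?
    frame: int = 0         # page frame number, meaningful only if present

class PageFault(Exception):
    """Raised when the referenced page is not in physical memory."""

def translate(page_table, virtual_address):
    """Return the physical address for a virtual address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table[page]
    if not entry.present:
        raise PageFault(page)   # the OS would now bring the page in from disk
    return entry.frame * PAGE_SIZE + offset

# A small hypothetical table: only virtual page 1 is resident, in frame 3.
page_table = [PageTableEntry() for _ in range(8)]
page_table[1] = PageTableEntry(present=True, frame=3)

print(translate(page_table, 1 * PAGE_SIZE + 5))  # 3077
```

The page offset passes through unchanged; only the page number is replaced by a frame number.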

7-43 Using the Page Table (cont’)
• The configuration of a page table changes as a program executes.
• Initially, the page table is empty. In the final configuration, four pages are in physical memory.

7-44 Segmentation
• A segmented memory allows two users to share the same word processor code, with different data spaces:

7-45 Fragmentation
• (a) Free area of memory after initialization; (b) after fragmentation; (c) after coalescing.

7-46 Translation Lookaside Buffer
• An example TLB holds 8 entries for a system with 32 virtual pages and 16 page frames.
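
A TLB of this size can be modeled as a small cache of page-to-frame translations placed in front of the page table. The sketch below uses the sizes from the slide (8 entries, 32 virtual pages, 16 page frames). The FIFO replacement rule, the class name, and the contents of the page table are assumptions made for illustration.

```python
# Sketch of an 8-entry TLB in front of a page table.
from collections import OrderedDict

TLB_ENTRIES = 8
VIRTUAL_PAGES = 32
PAGE_FRAMES = 16

class TLB:
    def __init__(self, page_table):
        self.page_table = page_table      # virtual page -> frame, resident pages only
        self.entries = OrderedDict()      # cached translations, oldest first
        self.hits = self.misses = 0

    def lookup(self, page):
        """Return the frame for a virtual page, consulting the TLB first."""
        if page in self.entries:
            self.hits += 1
            return self.entries[page]
        self.misses += 1
        frame = self.page_table[page]     # a KeyError here would model a page fault
        if len(self.entries) == TLB_ENTRIES:
            self.entries.popitem(last=False)   # assumed FIFO replacement
        self.entries[page] = frame
        return frame

# Hypothetical residency: virtual pages 0..15 occupy frames 0..15,
# while pages 16..31 are on disk only.
page_table = {page: page for page in range(PAGE_FRAMES)}

tlb = TLB(page_table)
for page in [3, 3, 4]:
    tlb.lookup(page)
print(tlb.hits, tlb.misses)  # 1 2
```

The first reference to a page misses in the TLB and falls back to the page table; repeated references to the same page are then served from the TLB.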

7-47 Putting it All Together
• An example TLB holds 8 entries for a system with 32 virtual pages and 16 page frames.

7-48 Content Addressable Memory – Addressing
• Relationships between random access memory and content addressable memory:

7-49 Overview of CAM
• Source: (Foster, C. C., Content Addressable Parallel Processors, Van Nostrand Reinhold Company, 1976.)

7-51 Associative Memory in Routers
• A simple network with three routers.
• The use of associative memories in high-end routers reduces the lookup time by allowing a search to be performed in a single operation.
• The search is based on the destination address, rather than the physical memory address.
• Access methods for this memory have been standardized into an interface interoperability agreement by the Network Processing Forum.
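
The difference between addressing by location and addressing by content can be shown with a toy model. This is a software sketch only: the loop stands in for comparisons that the hardware performs on all cells at once, the class is hypothetical, the routing entries are made up, and exact matching on the full destination address is a simplification.

```python
# Toy model of a content addressable memory used as a routing table.
class ContentAddressableMemory:
    def __init__(self):
        self.cells = []                     # list of (key, data) pairs

    def store(self, key, data):
        self.cells.append((key, data))

    def search(self, key):
        """Return the data stored with a matching key, or None.

        A hardware CAM compares the key against every cell in a single
        operation; this loop only models the result of that comparison.
        """
        for stored_key, data in self.cells:
            if stored_key == key:
                return data
        return None

# Hypothetical table: destination address -> outgoing port.
routes = ContentAddressableMemory()
routes.store("10.0.1.7", "port 1")
routes.store("10.0.2.9", "port 2")

print(routes.search("10.0.2.9"))   # port 2
print(routes.search("10.0.3.1"))   # None
```

The caller supplies the destination address itself as the search key and never needs to know which cell holds the entry.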

7-52 Block Diagram of Dual-Read RAM
• A dual-read or dual-port RAM allows any two words to be simultaneously read from the same memory.