Cache Coherence in Shared Memory Multiprocessors Chinmay Ashok
- Slides: 12
An Introduction to the Problem
- Caches allow greater performance by storing frequently used data in faster memory (closer to the processor).
- Since all processors share the same address space, more than one processor can cache an item at a time.
- If one processor updates the data item without informing the other processors, inconsistencies may result and cause incorrect execution.
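The Problem
[Figure: processors P1, P2 and P3, each with a private cache ($), connected to shared MEMORY and I/O devices. Memory holds U = 5. Numbered events 1 to 5 mark accesses to U; event 3 sets U to 7 while the other copies shown still hold U = 5.]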
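The stale-value scenario in the figure can be sketched as a tiny simulation. This is only an illustration under stated assumptions: the caches are modeled as write-back with no coherence protocol, and the assignment of events to processors (P1 and P3 read, P3 writes, then P1 and P2 read) is assumed here rather than taken from the slide. The names `read`, `write`, `memory`, and `caches` are hypothetical.

```python
# Hypothetical model: one shared word U, private write-back caches,
# and no coherence protocol of any kind.
memory = {"U": 5}
caches = {"P1": {}, "P2": {}, "P3": {}}

def read(proc, addr):
    """Return the cached copy if present, otherwise fetch it from memory."""
    cache = caches[proc]
    if addr not in cache:
        cache[addr] = memory[addr]
    return cache[addr]

def write(proc, addr, value):
    """Write-back behavior: only the local copy changes; memory is not updated."""
    caches[proc][addr] = value

trace = []
trace.append(read("P1", "U"))   # event 1 (assumed): P1 caches U = 5
trace.append(read("P3", "U"))   # event 2 (assumed): P3 caches U = 5
write("P3", "U", 7)             # event 3 (assumed): P3 updates its own copy only
trace.append(read("P1", "U"))   # event 4 (assumed): P1 hits its stale copy
trace.append(read("P2", "U"))   # event 5 (assumed): P2 misses and reads stale memory
print(trace, caches["P3"]["U"])
```

In this model neither P1 nor P2 ever observes the value that P3 wrote, which is the inconsistency a coherence mechanism has to prevent.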
The Requirement
- Ensure that whenever a processor reads a memory location, it receives the correct value.
- Correctness implies that each read from a location should return the last value written to that location.
- The last value is produced by the latest write in program order.
Factors to Consider for Solutions
- For correct execution, coherence must be enforced between the caches.
- Two major factors are:
  - performance
  - implementation cost
- Four primary design issues are:
  - coherence detection strategy: how incoherent memory accesses are detected
  - coherence enforcement strategy: invalidate or update
  - precision of block-sharing information: how sharing information is stored
  - cache block size: the granularity at which coherence is maintained
Types of Coherence Mechanisms
- Snoopy coherence mechanisms target bus-based multiprocessors (they rely on the speed of the communication medium).
- Directory-based coherence mechanisms use a central directory to implement coherence.
- Compiler-directed coherence mechanisms (static) let the compiler detect coherence issues and use special instructions to enforce coherence.
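To make the directory-based option more concrete, here is a minimal sketch of a central directory that records which processors hold each block and reports who must be invalidated on a write. The `Directory` class, its method names, and the full sharer-set representation are all assumptions made for illustration; real directories use many different encodings of sharing information.

```python
class Directory:
    """Hypothetical central directory: tracks which caches hold each block."""

    def __init__(self):
        self.sharers = {}            # block address -> set of processor ids
        self.invalidations_sent = 0  # running count of invalidation messages

    def read(self, proc, addr):
        """Record that proc now holds a copy of the block."""
        self.sharers.setdefault(addr, set()).add(proc)

    def write(self, proc, addr):
        """Make proc the only holder; return the processors to invalidate."""
        others = self.sharers.get(addr, set()) - {proc}
        self.invalidations_sent += len(others)
        self.sharers[addr] = {proc}
        return sorted(others)

directory = Directory()
for p in (0, 1, 2):
    directory.read(p, 0x40)
invalidated = directory.write(1, 0x40)
print(invalidated, directory.sharers[0x40])
```

Unlike snooping, only the processors listed in the directory entry receive a message, so no broadcast medium is required.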
Snoopy Protocols
- Specified by:
  - Set of states associated with memory blocks in the local caches
  - State transition diagram
  - Actions associated with each state transition
- Examples:
  - Valid-Invalid
  - MSI
  - MESI
  - MOESI
  - Dragon (update based)
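Valid-Invalid Protocol
- Assumption: bus transactions are atomic
- [State transition diagram; arcs are labeled observed event/generated bus action]
  - V (stays in V): BusRd/-, PrWr/BusWr
  - V to I: BusWr/-
  - I to V: PrRd/BusRd; PrWr/BusWr (write allocate)
  - I (stays in I): PrWr/BusWr (write no-allocate); BusRd/-, BusWr/-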
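The Valid-Invalid transitions can be sketched as a small write-through simulation. This is an illustrative model, not a definitive implementation: it assumes the write no-allocate variant, treats each method call as one atomic bus transaction, and uses hypothetical names (`Bus`, `VICache`, `pr_rd`, `pr_wr`, `snoop_bus_wr`).

```python
class Bus:
    """Hypothetical atomic bus in front of a shared memory."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def bus_rd(self, addr):
        return self.memory[addr]

    def bus_wr(self, writer, addr, value):
        self.memory[addr] = value          # write-through: memory is always updated
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_bus_wr(addr)   # every other cache snoops the write

class VICache:
    """A block present in self.lines is in state V; an absent block is in state I."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def pr_rd(self, addr):
        if addr not in self.lines:                    # I: PrRd/BusRd, go to V
            self.lines[addr] = self.bus.bus_rd(addr)
        return self.lines[addr]                       # V: read hit, no bus action

    def pr_wr(self, addr, value):
        if addr in self.lines:                        # V: PrWr/BusWr, stay in V
            self.lines[addr] = value
        self.bus.bus_wr(self, addr, value)            # I: PrWr/BusWr, stay in I

    def snoop_bus_wr(self, addr):
        self.lines.pop(addr, None)                    # V: BusWr/-, go to I

memory = {"U": 5}
bus = Bus(memory)
p1, p2, p3 = VICache(bus), VICache(bus), VICache(bus)
p1.pr_rd("U")
p3.pr_rd("U")
p3.pr_wr("U", 7)                 # invalidates P1's copy and updates memory
results = (p1.pr_rd("U"), p2.pr_rd("U"))
print(results)
```

Because every write goes on the bus, other caches always see it and drop their copies; the cost is one bus transaction per write.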
Memory Consistency
- For a shared address space, this constrains the order in which memory operations must appear to be performed (visibility).
- This includes operations to the same location or to different locations, by the same process or by different processes.
- Memory consistency subsumes coherence.
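Three-State MSI Write-Back Invalidation Protocol
- Assumption: bus transactions are atomic
- [State transition diagram; arcs are labeled observed event/generated bus action]
  - M (stays in M): PrRd/-, PrWr/-
  - M to S: BusRd/Flush
  - M to I: BusRdX/Flush
  - S (stays in S): PrRd/-, BusRd/-
  - S to M: PrWr/BusRdX
  - S to I: BusRdX/-
  - I to S: PrRd/BusRd
  - I to M: PrWr/BusRdX
  - I (stays in I): BusRd/-, BusRdX/-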
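The MSI state machine can be written down as a lookup table and exercised on a single block. This is a minimal sketch assuming the standard MSI transitions and atomic bus transactions; it tracks only the per-cache state of one block, not the data, and the names `MSI` and `access` are hypothetical.

```python
M, S, I = "M", "S", "I"

# (current state, observed event) -> (next state, generated action or None)
MSI = {
    (M, "PrRd"): (M, None),
    (M, "PrWr"): (M, None),
    (M, "BusRd"): (S, "Flush"),
    (M, "BusRdX"): (I, "Flush"),
    (S, "PrRd"): (S, None),
    (S, "PrWr"): (M, "BusRdX"),
    (S, "BusRd"): (S, None),
    (S, "BusRdX"): (I, None),
    (I, "PrRd"): (S, "BusRd"),
    (I, "PrWr"): (M, "BusRdX"),
    (I, "BusRd"): (I, None),
    (I, "BusRdX"): (I, None),
}

def access(states, proc, op):
    """Apply a processor operation, then let every other cache snoop the
    resulting bus transaction. Returns (bus action, whether a cache flushed)."""
    next_state, action = MSI[(states[proc], op)]
    states[proc] = next_state
    flushed = False
    if action in ("BusRd", "BusRdX"):
        for other in states:
            if other != proc:
                snooped_state, reply = MSI[(states[other], action)]
                states[other] = snooped_state
                flushed = flushed or reply == "Flush"
    return action, flushed

states = {"P1": I, "P2": I, "P3": I}
log = [
    access(states, "P1", "PrRd"),   # P1: I to S
    access(states, "P3", "PrRd"),   # P3: I to S
    access(states, "P3", "PrWr"),   # P3: S to M, P1: S to I
    access(states, "P1", "PrRd"),   # P1: I to S, P3 flushes and drops to S
]
print(log, states)
```

Compared with the Valid-Invalid protocol, writes to a block already in M generate no bus traffic; the modified data reaches other caches only when one of them asks for it and the owner flushes.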
Other Snoopy Protocols
- MESI: adds the Exclusive state
- MOESI: adds the Owned state
- Dragon (update based): states M, E, Sm, Sc
Conclusion
- Cache coherence is necessary.
- Performance and implementation cost are the critical factors when choosing a solution.
- Types of mechanism: snoopy, directory based, and compiler directed.
- Memory consistency models constrain the visible order of memory operations and subsume coherence.
- Standard snoopy protocols: Valid-Invalid, MSI, MESI, MOESI, and Dragon.