Multiprocessor Highlights MESI Cache Coherence Protocol Memory Consistency

Multiprocessor Highlights: MESI Cache Coherence Protocol, Memory Consistency, ILP and MC
Zhao Zhang, 2003



MESI Protocol

From the local processor’s viewpoint, each cache block is in one of four states:

  • Modified: Only I have a copy, and the copy has been modified; I must respond to any read/write request
  • Exclusive-clean: Only I have a copy, and the copy is clean; no need to inform others about my changes
  • Shared: Someone else may have a copy; I have to inform others about my changes
  • Invalid: The block has been invalidated (possibly at the request of someone else)

Action highlights:

  • On a read miss on a block: send a read request onto the bus
  • On a write miss on a block: send a write request onto the bus
  • On receiving a bus read request: transition the block to the Shared state
  • On receiving a bus write request: transition the block to the Invalid state
  • Must write back data when transitioning out of the Modified state
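The states and actions listed above can be sketched as a small simulation. This is only an illustrative model under stated assumptions: it tracks a single cache block, treats bus requests as instantaneous, and the names `Bus`, `State`, `read`, `write` and `writebacks` are hypothetical, not taken from the slides. It also assumes that a read miss lands in Exclusive when no other cache holds a copy and in Shared otherwise, which is the usual MESI convention rather than something the slides spell out.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class Bus:
    """Hypothetical snooping bus tracking ONE cache block across n caches."""

    def __init__(self, n_caches):
        self.states = [State.INVALID] * n_caches
        self.writebacks = 0  # how many times dirty data left the Modified state

    def _snoop(self, requester, is_write):
        """Deliver a bus request to all other caches.

        Returns True if any other cache held a valid copy.
        """
        others_had_copy = False
        for i, s in enumerate(self.states):
            if i == requester or s is State.INVALID:
                continue
            others_had_copy = True
            if s is State.MODIFIED:
                self.writebacks += 1  # must write back when leaving Modified
            # Bus write request -> Invalid; bus read request -> Shared.
            self.states[i] = State.INVALID if is_write else State.SHARED
        return others_had_copy

    def read(self, cpu):
        if self.states[cpu] is State.INVALID:  # read miss: send bus read
            shared = self._snoop(cpu, is_write=False)
            self.states[cpu] = State.SHARED if shared else State.EXCLUSIVE
        # Read hit in M, E or S: no bus traffic, no state change.

    def write(self, cpu):
        # Write miss (Invalid) or write to a Shared block: inform the others.
        if self.states[cpu] in (State.INVALID, State.SHARED):
            self._snoop(cpu, is_write=True)
        # A write to an Exclusive block needs no bus traffic at all.
        self.states[cpu] = State.MODIFIED


bus = Bus(3)
bus.read(0)   # cache 0 is the only holder: Exclusive
bus.write(0)  # silent upgrade to Modified
bus.read(1)   # cache 0 writes back and both copies become Shared
print([s.value for s in bus.states], bus.writebacks)
```

The silent Exclusive-to-Modified upgrade in `write` is the case the slide describes as "no need to inform others about my changes".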


Memory Consistency Model

Defines memory correctness for parallel execution:

  • Execution appears to be that of some correct execution of a theoretical parallel computer with n sequential processors
  • In particular, remote writes must appear at a local processor in some correct sequence

Typical memory consistency models:

Sequential consistency (SC)
  • Memory reads/writes are globally serialized; assume that in every cycle only one processor can proceed for one step, and the write result appears on other processors immediately
  • Processors do not reorder local reads and writes
  • Note: the number of possible sequences is an exponential function of the number of instructions

Total store ordering (TSO)
  • Only writes are globally serialized; assume that in every cycle at most one write can proceed, and the write result appears immediately
  • Processors may reorder local reads/writes that have no RAW dependence

Processor consistency (PC)
  • Writes from one processor appear in the same order on all other processors
  • Processors may reorder local reads/writes that have no RAW dependence
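To make the difference between these models concrete, the sketch below enumerates every global order of a two-thread "store buffering" test and collects the values each thread can load. The test program, the names `T0`, `T1`, `interleavings`, `run` and `outcomes`, and the modelling of relaxed ordering as "hoist a load above an earlier store to a different variable" are all assumptions chosen for illustration. In particular, the hoisting is a crude stand-in for a read bypassing a write with no RAW dependence, not a full TSO or PC model.

```python
from itertools import combinations
from math import comb

# Each operation is (kind, variable): "W" stores 1, "R" loads the current value.
# Both variables start at 0.
T0 = [("W", "x"), ("R", "y")]
T1 = [("W", "y"), ("R", "x")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        chosen = set(slots)  # positions in the global order taken by thread 0
        ia, ib = iter(a), iter(b)
        yield [(0, next(ia)) if i in chosen else (1, next(ib)) for i in range(n)]


def run(schedule):
    """Execute one global order on a single shared memory.

    Returns (value loaded by thread 0, value loaded by thread 1).
    """
    mem = {"x": 0, "y": 0}
    loaded = {}
    for tid, (kind, var) in schedule:
        if kind == "W":
            mem[var] = 1
        else:
            loaded[tid] = mem[var]
    return loaded[0], loaded[1]


def outcomes(t0, t1):
    return {run(s) for s in interleavings(t0, t1)}


# Sequential consistency: program order is preserved inside each thread.
sc = outcomes(T0, T1)

# Relaxed stand-in: a thread may perform its load before its earlier store,
# since the two touch different variables (no RAW dependence).
relaxed = sc | outcomes(T0[::-1], T1) | outcomes(T0, T1[::-1]) | outcomes(T0[::-1], T1[::-1])

print(sorted(sc))
print(sorted(relaxed))
print(comb(4, 2), comb(20, 10))  # interleavings for 2 and 10 ops per thread
```

Under the sequentially consistent enumeration the outcome where both loads return 0 never appears, while the relaxed stand-in admits it. The last line illustrates the note on the slide: two threads of k operations each have C(2k, k) interleavings, which grows exponentially in k.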


Memory Consistency and ILP

Sequential consistency, TSO and PC are strong consistency models (although TSO and PC are relaxed relative to SC)

Why use weak consistency models (e.g. release consistency)?
  • Otherwise, without speculative execution recovery, every write to shared data may take a full memory access latency (can a 2 GHz, 4-way issue processor afford 100 ns for every such write?)
  • For SC, reads cannot bypass any previous write (even without a RAW dependence)

Strong consistency may work efficiently with speculative execution in ILP (PC and TSO in practice; SC can be supported with a speculative cache)
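The cost behind the 100 ns question can be worked out with back-of-the-envelope arithmetic. The sketch assumes the processor stalls completely for the whole latency with nothing overlapped, which is a deliberately pessimistic simplification, and `lost_issue_slots` is a hypothetical helper name.

```python
def lost_issue_slots(latency_ns, clock_ghz, issue_width):
    """Return (stall cycles, issue slots lost) for one fully exposed write.

    Assumes the pipeline issues nothing at all while the write is outstanding.
    """
    cycles = latency_ns * clock_ghz  # ns * (cycles per ns)
    return cycles, cycles * issue_width


cycles, slots = lost_issue_slots(latency_ns=100, clock_ghz=2, issue_width=4)
print(cycles, slots)  # 200 cycles, 800 issue slots
```

With the slide's numbers, each such write costs 200 cycles, or up to 800 instruction issue opportunities, which is why an implementation wants either a weaker model or speculation to hide the latency.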