Hardware and Software Transactional Memory and Usages in MRE
Salishev S. I.


Transactions in Memory
A memory transaction is a unitary set of memory accesses. It is similar to a DB transaction and has similar ACID properties:
  • Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the memory state is left unchanged.
  • Consistency ensures that any transaction will bring the data in memory from one valid state to another.
  • Isolation ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially, i.e. one after the other.
  • Durability (optional): memory is changed only by transactions.

Transactions vs. Locks

                 Transactions   Locks
  Speculative    +              -
  Safe nesting   +              -
  Deadlocks      -              +
  Overhead       More           Less

Transactions and Objects
Natural meaning of transactions in the asynchronous actor model:
  • Transaction: atomic service provided by an actor object
  • Consistency, Isolation: guarantee of object encapsulation
  • Nested transaction: call to a service of an owned object

Software Transactional Memory
  • Software implementation of memory transactions
  • Implementation is based on read/write barriers
  • Control over code side effects
  • Transaction abort is programmatically controlled
  • Sequential consistency is expected by programmers
    ◦ Problems with reads and data consistency
    ◦ All reads of transactional variables must be wrapped in transactions
  • Implementation
    ◦ Automatic transactions on all memory accesses: tremendous performance overhead
    ◦ Explicit transactional variable markup: additional work, error prone
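The read/write barriers and explicit variable markup listed above can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and does not come from the slides: the names TVar, Tx, atomically and transfer are hypothetical, and the design (one global version clock, buffered writes, abort on any concurrent commit) is just one simple option that real STM libraries improve on considerably.

```cpp
#include <atomic>
#include <cassert>
#include <unordered_map>

// Explicit markup: only TVar objects are transactional (hypothetical name).
using TVar = std::atomic<long>;

// Global version clock: even = quiescent, odd = a commit is in progress.
static std::atomic<unsigned long> g_clock{0};

class Tx {
public:
    struct Abort {};  // thrown to roll back; the caller retries

    Tx() {
        // Wait for a quiescent clock value and remember it as the snapshot.
        do { snapshot_ = g_clock.load(); } while (snapshot_ & 1u);
    }

    // Read barrier: return our own buffered write if any, otherwise read
    // memory and check that nobody has committed since the snapshot.
    long read(TVar& var) {
        auto it = writes_.find(&var);
        if (it != writes_.end()) return it->second;
        long value = var.load();
        if (g_clock.load() != snapshot_) throw Abort{};
        return value;
    }

    // Write barrier: buffer the store; memory is untouched until commit.
    void write(TVar& var, long value) { writes_[&var] = value; }

    void commit() {
        if (writes_.empty()) return;  // read-only: every read was validated
        unsigned long expected = snapshot_;
        // Lock the clock (even -> odd); failure means someone else committed.
        if (!g_clock.compare_exchange_strong(expected, snapshot_ + 1)) throw Abort{};
        for (auto& entry : writes_) entry.first->store(entry.second);
        g_clock.store(snapshot_ + 2);  // publish the writes and unlock
    }

private:
    unsigned long snapshot_;
    std::unordered_map<TVar*, long> writes_;
};

// Runs body as a transaction, retrying until it commits.
template <class Body>
void atomically(Body body) {
    for (;;) {
        try {
            Tx tx;
            body(tx);
            tx.commit();
            return;
        } catch (const Tx::Abort&) {
            // Buffered writes die with tx; nothing to undo in memory.
        }
    }
}

// Example transaction: move amount between two transactional variables.
inline void transfer(TVar& from, TVar& to, long amount) {
    atomically([&](Tx& tx) {
        tx.write(from, tx.read(from) - amount);
        tx.write(to, tx.read(to) + amount);
    });
}
```

Every access goes through tx.read or tx.write, which is the "additional work, error prone" cost named in the slide: a direct var.load() inside the body would bypass validation without any compiler diagnostic.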

Hardware Transactional Memory
  ◦ Extension to the MESI cache coherence protocol
  ◦ All memory accesses are transactional
  ◦ Sporadic aborts due to exhausted resources
  ◦ Supported in commercial products
    ◦ Azul Systems Vega 2
    ◦ Intel Haswell
    ◦ IBM POWER8
    ◦ IBM zEC12
    ◦ IBM Blue Gene/Q

Architecture comparison
Nakaike, Takuya, et al. "Quantitative comparison of hardware transactional memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8." Proceedings of the 42nd Annual International Symposium on Computer Architecture. ACM, 2015.

MESI Protocol

Intel Transactional Synchronization Extensions (TSX)
Haswell implements an unordered, single-version, nested TM with strong isolation. The TM tracks the read-set and write-set at a fixed 64 B cache line granularity.
  • Hardware Lock Elision (HLE)
    ◦ Backward-compatible instruction prefixes, which allow speculative execution of critical sections
  • Restricted Transactional Memory (RTM)
    ◦ New synchronization instructions with transactional memory semantics

Hardware Lock Elision (HLE)
  • Acquiring and releasing a lock are atomic memory write operations
    ◦ ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG
  • These operations take the LOCK (0xF0) prefix
  • Two new prefixes are provided: XACQUIRE and XRELEASE
    ◦ They reuse the REPNE / REPE prefix encodings (F2H / F3H)
    ◦ These prefixes were ignored on these opcodes before Haswell
  • XACQUIRE
    ◦ Starts the transaction
    ◦ Adds the lock address to the read-set
  • XRELEASE
    ◦ Commits the transaction
  • If the transaction aborts, it is automatically restarted without XACQUIRE
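As a hedged illustration of how the two prefixes reach user code, the sketch below builds a spinlock on the GCC atomic builtins. It assumes a GCC-compatible compiler: __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE are GCC's x86-specific memory-order flags that request the XACQUIRE and XRELEASE prefixes, and the sketch falls back to plain atomics when they are not defined. The class name HleSpinLock and the HLE_*_FLAG macros are hypothetical.

```cpp
#include <cassert>

// Use GCC's x86 HLE flags when the compiler provides them; otherwise the
// lock degrades to an ordinary spinlock (assumption: GCC-style builtins).
#if defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
#define HLE_ACQUIRE_FLAG __ATOMIC_HLE_ACQUIRE
#define HLE_RELEASE_FLAG __ATOMIC_HLE_RELEASE
#else
#define HLE_ACQUIRE_FLAG 0
#define HLE_RELEASE_FLAG 0
#endif

class HleSpinLock {
public:
    void lock() {
        // XACQUIRE XCHG: on HLE hardware the write is elided and the
        // critical section starts executing speculatively.
        while (__atomic_exchange_n(&word_, 1,
                                   __ATOMIC_ACQUIRE | HLE_ACQUIRE_FLAG) != 0) {
            // Lock is really held: spin on plain reads until it looks free.
            while (__atomic_load_n(&word_, __ATOMIC_RELAXED) != 0) {
            }
        }
    }

    void unlock() {
        // XRELEASE MOV: commits the speculative section, or performs a
        // normal release store if the lock was actually taken.
        __atomic_store_n(&word_, 0, __ATOMIC_RELEASE | HLE_RELEASE_FLAG);
    }

    bool is_locked() const {
        return __atomic_load_n(&word_, __ATOMIC_RELAXED) != 0;
    }

private:
    int word_ = 0;  // 0 = free, 1 = held
};
```

Because the prefixes are ignored on processors without HLE, the same binary behaves as a conventional spinlock there, which is the backward compatibility the slide refers to.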

Restricted Transactional Memory (RTM)
RTM gives full programmatic access to the underlying TM
  • XBEGIN offs
    ◦ Starts the transaction; offs is the relative address of the abort handler
  • XEND
    ◦ Commits the transaction
  • XABORT imm8
    ◦ Aborts the transaction and returns an error code
  • XTEST
    ◦ Tests whether execution is inside a transaction

Transaction Abort Results

  EAX register
  bit position   Meaning
  0              Set if abort caused by XABORT instruction.
  1              If set, the transaction may succeed on a retry. This bit is always clear if bit 0 is set.
  2              Set if another logical processor conflicted with a memory address that was part of the transaction that aborted.
  3              Set if an internal buffer overflowed.
  4              Set if debug breakpoint was hit.
  5              Set if an abort occurred during execution of a nested transaction.
  23:6           Reserved.
  31:24          XABORT argument (only valid if bit 0 set, otherwise reserved).
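The bit layout in the table maps directly onto a small decoder. The sketch below is illustrative only: AbortInfo, decode_abort and the kAbort* constants are hypothetical names, and the bit positions are taken from the table above.

```cpp
#include <cassert>
#include <cstdint>

// Bit masks for the abort status returned in EAX (positions from the table).
const std::uint32_t kAbortExplicit = 1u << 0;  // caused by XABORT
const std::uint32_t kAbortRetry    = 1u << 1;  // may succeed on a retry
const std::uint32_t kAbortConflict = 1u << 2;  // memory conflict
const std::uint32_t kAbortCapacity = 1u << 3;  // internal buffer overflowed
const std::uint32_t kAbortDebug    = 1u << 4;  // debug breakpoint hit
const std::uint32_t kAbortNested   = 1u << 5;  // abort inside nested transaction

struct AbortInfo {
    bool explicit_abort;
    bool may_retry;
    bool conflict;
    bool capacity;
    bool debug;
    bool nested;
    std::uint8_t code;  // XABORT argument, meaningful only if explicit_abort
};

inline AbortInfo decode_abort(std::uint32_t eax) {
    AbortInfo info;
    info.explicit_abort = (eax & kAbortExplicit) != 0;
    info.may_retry      = (eax & kAbortRetry) != 0;
    info.conflict       = (eax & kAbortConflict) != 0;
    info.capacity       = (eax & kAbortCapacity) != 0;
    info.debug          = (eax & kAbortDebug) != 0;
    info.nested         = (eax & kAbortNested) != 0;
    // Bits 31:24 are reserved unless bit 0 is set, so report 0 otherwise.
    info.code = info.explicit_abort
                    ? static_cast<std::uint8_t>((eax >> 24) & 0xffu)
                    : 0;
    return info;
}
```

A retry loop would typically consult may_retry before trying the transaction again and go straight to the fallback path when capacity is set, since an overflow is unlikely to disappear on its own.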

HTM Limitations
  • Transaction abort due to resource exhaustion
    ◦ Non-conflicting transactions are not guaranteed to complete
    ◦ Always need a non-HTM fallback
  • False sharing
    ◦ All data in the same cache line is considered atomic
    ◦ False sharing on structure fields and bit fields commonly used in OSes and file systems
  • Cache misses due to postponed transaction commit
    ◦ Cannot evict data not committed by a transaction
    ◦ Uncommitted data resides in cache, taking cache space
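The false-sharing point can be made concrete with a layout sketch. It assumes the 64 B line size quoted for Haswell earlier in the deck; the struct names and the same_cache_line helper are hypothetical. Two fields that share a line conflict in HTM even when different threads touch different fields, while padding each field to its own line removes that conflict at the cost of memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kCacheLine = 64;  // assumed line size (Haswell figure)

// Both counters live in one cache line: a transaction writing `a` conflicts
// with a transaction writing `b`, although the data is logically unrelated.
struct alignas(64) SharedLine {
    long a;
    long b;
};

// Each counter is padded out to a full line of its own.
struct alignas(64) PaddedCounter {
    long value;
};

struct SeparateLines {
    PaddedCounter a;
    PaddedCounter b;
};

static_assert(sizeof(PaddedCounter) == 64, "one counter per cache line");
static_assert(sizeof(SeparateLines) == 128, "two lines, no sharing");

// True if both addresses fall into the same 64 B line.
inline bool same_cache_line(const void* p, const void* q) {
    return reinterpret_cast<std::uintptr_t>(p) / kCacheLine ==
           reinterpret_cast<std::uintptr_t>(q) / kCacheLine;
}
```

Bit fields, mentioned in the slide, cannot be separated this way without changing the data structure itself, which is why they are a recurring problem in OS and file system code.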

HTM Software support
  • HTM-Lock
    ◦ HTM transaction with lock-based fallback
    ◦ HTM is used for speculative execution of critical sections
    ◦ Lock semantics of user code
    ◦ No transaction support in the programming language required
    ◦ Commonly used in Java
  • HTM-STM Library
    ◦ STM transactions with HTM fast-path
    ◦ STM library semantics of user code
    ◦ Transactional data access through library calls or pragmas
      ◦ transactional_read, transactional_write
      ◦ #pragma tm_atomic
    ◦ Only transactional accesses are transaction safe
    ◦ No durability property
    ◦ Commonly used in C++
  • Consistent language STM support with HTM fast-path
    ◦ STM transactions with HTM fast-path
    ◦ All accesses are transaction safe
    ◦ Durability property
    ◦ Currently consistent language support is only available in functional languages (Haskell, Clojure)
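The HTM-Lock pattern (speculate first, fall back to a real lock) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the scheme of any particular runtime: ElidedLock, htm_begin, htm_commit, htm_abort and the retry budget of 3 are hypothetical. The RTM intrinsics _xbegin, _xend and _xabort from <immintrin.h> are used only when the compiler defines __RTM__ (for example with -mrtm); otherwise the wrappers report that no transaction started and every call takes the lock path. A production version would also check CPUID for RTM support at run time.

```cpp
#include <atomic>
#include <cassert>
#if defined(__RTM__)
#include <immintrin.h>  // _xbegin, _xend, _xabort
#endif

// Thin wrappers (hypothetical names) so the sketch builds without RTM.
inline bool htm_begin() {
#if defined(__RTM__)
    return _xbegin() == _XBEGIN_STARTED;
#else
    return false;  // no HTM available: caller goes to the fallback
#endif
}

inline void htm_commit() {
#if defined(__RTM__)
    _xend();
#endif
}

inline void htm_abort() {
#if defined(__RTM__)
    _xabort(0xff);
#endif
}

class ElidedLock {
public:
    template <class Critical>
    void run(Critical critical) {
        for (int attempt = 0; attempt < kMaxAttempts; ++attempt) {
            if (htm_begin()) {
                // Reading the lock word adds it to the read-set, so a thread
                // that takes the fallback lock aborts this transaction.
                if (held_.load(std::memory_order_relaxed)) htm_abort();
                critical();
                htm_commit();
                return;
            }
            // An abort resumes here with htm_begin() returning false.
        }
        // Non-HTM fallback: ordinary spinlock around the same code.
        while (held_.exchange(true, std::memory_order_acquire)) {
        }
        critical();
        held_.store(false, std::memory_order_release);
    }

private:
    static constexpr int kMaxAttempts = 3;  // arbitrary retry budget
    std::atomic<bool> held_{false};
};
```

User code keeps plain lock semantics: the critical section is written once and neither knows nor cares whether it ran speculatively, which matches the "no transaction support in the programming language required" point.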

HTM Usages in MRE
  • Speculative execution of critical sections, e.g. Azul Vega 2 JVM
    ◦ Java monitors are converted to transactions
    ◦ Speculative execution of critical sections
    ◦ On abort the old lock path is executed
    ◦ No semantic changes; deadlocks present in the lock-based code remain
  • GC
    ◦ Minimizes synchronization penalty on concurrent reference tracing and updating
  • Hybrid HTM-STM
    ◦ Performance optimization of STM
    ◦ Keeps all implications of changing language semantics