Rethinking Hardware and Software for Disciplined Parallelism. Sarita V. Adve, University of Illinois, sadve@illinois.edu

Sequential CS 101 Java

Parallel CS 101
Java + Threads ⇒ Data races, Non-determinism, Memory Model
General-purpose parallel models are complex and abandon decades of sequential programming advances
– Safety, modularity, composability, maintainability, …
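
To make "data races" and "non-determinism" concrete, here is a minimal plain-Java sketch (the class and field names are illustrative, not from the talk): two unsynchronized threads increment a shared counter, the increments race, and different runs print different totals.

// Illustrative sketch only: a classic data race in plain Java.
// counter++ is a non-atomic read-modify-write, so concurrent updates
// can be lost and the final value is nondeterministic (usually < 2,000,000).
public class RacyCounter {
    static int counter = 0;                 // shared, unsynchronized field

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1_000_000; i++) {
                counter++;                  // data race: unsynchronized concurrent access
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println("counter = " + counter);   // varies from run to run
    }
}

Fixing this requires explicit synchronization (a lock or an atomic), which is exactly the kind of complexity the slide is pointing at.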

The Problem
Theorem: Popular parallel languages are fundamentally broken
Proof: See the Java Memory Model (+ unresolved bug)
Memory consistency model = what values can a read return?
– 20+ years of research finally led to convergence
– But extremely complex
  * Dealing with data races is very hard
  * Mismatch between hardware and software evolution
We are building on a foundation where even …

The Problem
Theorem: Current parallel languages are fundamentally broken
Proof: See the Java Memory Model (+ unresolved bug)
Memory model = what values can a read return? ⇒ Banish wild shared-memory!
20+ years of research finally led to convergence
– Sequential consistency for data-race-free programs is minimal
– Java added MUCH complexity for safety/security
  * Minimal (complex) semantics for data races, but unresolved bug
– C++, C added complexity for experts due to the h/w – s/w mismatch
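
The "sequential consistency for data-race-free programs" guarantee can be seen in the classic store-buffering pattern; the sketch below is plain Java (class name and structure are illustrative, not taken from the talk). With ordinary int fields the program is racy and the Java memory model allows the surprising outcome r1 == 0 && r2 == 0; declaring x and y volatile makes it data-race-free, and the model then promises sequentially consistent executions, in which at least one of r1, r2 must be 1.

// Illustrative sketch of "SC for data-race-free": with volatile x and y the
// outcome (r1, r2) == (0, 0) is forbidden; drop volatile and it becomes possible.
public class StoreBuffering {
    static volatile int x = 0, y = 0;   // remove 'volatile' to reintroduce the data race
    static int r1, r2;                  // each written by one thread, read only after join()

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { x = 1; r1 = y; });
        Thread t2 = new Thread(() -> { y = 1; r2 = x; });
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println("r1 = " + r1 + ", r2 = " + r2);
    }
}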

The Opportunity
Need disciplined shared-memory parallel languages
• Banish data races by design
• Provide determinism by default
• Support only explicit and controlled nondeterminism
• Explicit side effects (sharing behavior, granularity, …)
• ???
Discipline is enforced
Much momentum from the software community
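
None of those disciplined languages is shown here, but the underlying idea can be sketched in plain Java with a fork/join computation whose tasks write only disjoint index ranges (class and field names are illustrative): because the tasks' effects are disjoint, there are no data races and the result is deterministic by construction. Disciplined languages aim to have the compiler or type system check this kind of disjointness (explicit effects) rather than trusting the programmer.

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Illustrative sketch: determinism from disjoint effects in plain Java.
public class DisjointIncrement {
    static final int CUTOFF = 1_000;

    static class IncrementTask extends RecursiveAction {
        final int[] data; final int lo, hi;
        IncrementTask(int[] data, int lo, int hi) { this.data = data; this.lo = lo; this.hi = hi; }

        @Override protected void compute() {
            if (hi - lo <= CUTOFF) {
                for (int i = lo; i < hi; i++) data[i]++;   // writes only indices in [lo, hi)
            } else {
                int mid = (lo + hi) >>> 1;
                // Subtasks operate on disjoint ranges: no sharing, no races, same result every run.
                invokeAll(new IncrementTask(data, lo, mid), new IncrementTask(data, mid, hi));
            }
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        ForkJoinPool.commonPool().invoke(new IncrementTask(data, 0, data.length));
        System.out.println("first = " + data[0] + ", last = " + data[data.length - 1]);   // always 1 and 1
    }
}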

The Opportunity
Memory model = core of the parallel hardware/software interface
Today's hardware is designed for wild shared memory
– Cache coherence, communication architecture, scheduling, …
– Inefficient in performance, power, resilience, complexity, …
Claim: Disciplined interface ⇒ h/w simplicity + efficiency
E.g., race-free s/w ⇒ race-free (MUCH SIMPLER) coherence protocols
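
To suggest why race-free software can simplify coherence, here is a toy Java sketch (everything in it, including the class names, the states, and the phase-boundary scheme, is an assumption made for illustration; it is not the protocol the talk proposes). If software is race-free and phase-structured, at most one core writes a given line per phase, so a directory can simply record the registered writer, and readers can self-invalidate possibly stale lines at phase boundaries using software-declared effects; there are no invalidation broadcasts and no transient states to race against.

// Toy sketch only: not a real or proposed coherence protocol.
public class ToyRaceFreeCoherence {
    enum LineState { INVALID, VALID }

    static class DirectoryEntry {
        int registeredCore = -1;                                    // core holding the up-to-date copy, if any
        void registerWriter(int core) { registeredCore = core; }    // the only write-side transition
    }

    static class CacheLine {
        LineState state = LineState.INVALID;
        // Software-declared effects say whether another core may have written this
        // line during the phase; if so, drop the local copy instead of waiting for
        // an invalidation message.
        void selfInvalidateAtPhaseEnd(boolean mayBeStale) {
            if (mayBeStale && state == LineState.VALID) state = LineState.INVALID;
        }
    }

    public static void main(String[] args) {
        DirectoryEntry dir = new DirectoryEntry();
        CacheLine core0Copy = new CacheLine();
        core0Copy.state = LineState.VALID;        // core 0 read the line in phase 1
        dir.registerWriter(1);                    // core 1 writes it in phase 2 (race-free: no one else touches it)
        core0Copy.selfInvalidateAtPhaseEnd(true); // core 0 drops its possibly stale copy at the phase boundary
        System.out.println("core 0 copy: " + core0Copy.state + ", registered core: " + dir.registeredCore);
    }
}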

The Approach
• Software enforces disciplined behavior
  ⇒ Software: safe, modular, composable, maintainable, …
• Hardware designed for disciplined software
  ⇒ Hardware: simple, scalable, power-efficient, …
• Broad hardware/software research agenda
– Interface: semantics, mechanisms at all levels, ISA, …
– Rethink hardware: coherence, communication, layout, caches, …
– Help software abide by the interface
• Fundamental shift in software and hardware
– But it can be done incrementally
– Memory-model convergence came out of a similar process
• But this time, let's co-evolve h/w and s/w