Velodrome A Sound and Complete Dynamic Atomicity Checker

Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs
Cormac Flanagan, UC Santa Cruz
Stephen Freund, Williams College
Jaeheon Yi, UC Santa Cruz

Concurrency and Race Conditions

    int bal = 0;

    Thread 1                  Thread 2
    t1 = bal                  t2 = bal
    bal = t1 + 10             bal = t2 - 10

Interleaving shown on the slide:

    Thread 1                  Thread 2
                              t2 = bal
    t1 = bal
                              bal = t2 - 10
    bal = t1 + 10

Race condition: two concurrent unsynchronized accesses (where at least one access is a write)
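
The effect of that interleaving can be checked by replaying it step by step. The sketch below uses no real threads, so the outcome is deterministic; the class and method names are made up for illustration and are not from the talk.

```java
// Deterministic replay of the interleaving on the slide (no real threads).
// RaceReplay and its method names are illustrative assumptions.
final class RaceReplay {
    static int bal = 0;

    /** Replays: t2 = bal; t1 = bal; bal = t2 - 10; bal = t1 + 10. */
    static int replayInterleaving() {
        bal = 0;
        int t2 = bal;      // Thread 2 reads 0
        int t1 = bal;      // Thread 1 reads 0
        bal = t2 - 10;     // Thread 2 writes -10
        bal = t1 + 10;     // Thread 1 overwrites with 10: the withdrawal is lost
        return bal;
    }

    /** Serial order: Thread 1 runs to completion, then Thread 2. */
    static int replaySerial() {
        bal = 0;
        int t1 = bal;
        bal = t1 + 10;
        int t2 = bal;
        bal = t2 - 10;
        return bal;
    }

    public static void main(String[] args) {
        System.out.println("interleaved: " + replayInterleaving());
        System.out.println("serial:      " + replaySerial());
    }
}
```

The interleaved replay ends with bal == 10, while either serial order ends with bal == 0.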

Race-Freedom is not Sufficient [PLDI 03]

    Thread 1:
      synchronized(m) { t1 = bal; }
      synchronized(m) { bal = t1 + 10; }

    Thread 2:
      synchronized(m) { t2 = bal; bal = t2 - 10; }

Trace:

    acquire(m)  t1 = bal      release(m)
    acquire(m)  bal = 0       release(m)
    acquire(m)  bal = t1 + n  release(m)
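
A minimal Java sketch of the same problem, assuming a hypothetical Account class (not from the talk). Every access to bal holds the lock m, so there is no data race, yet the split deposit can still lose an update. The betweenSections hook is an artificial stand-in for another thread being scheduled between the two critical sections, which keeps the demonstration deterministic.

```java
// Hypothetical Account class: race-free everywhere, but depositSplit is not atomic.
final class Account {
    private final Object m = new Object();
    private int bal = 0;

    /** Race-free but not atomic: the read and the write use separate critical sections. */
    void depositSplit(int n, Runnable betweenSections) {
        int t1;
        synchronized (m) { t1 = bal; }
        betweenSections.run();            // stands in for another thread running here
        synchronized (m) { bal = t1 + n; }
    }

    /** Atomic: one critical section covers both the read and the write. */
    void depositAtomic(int n) {
        synchronized (m) { int t1 = bal; bal = t1 + n; }
    }

    void withdraw(int n) {
        synchronized (m) { int t2 = bal; bal = t2 - n; }
    }

    int balance() {
        synchronized (m) { return bal; }
    }

    public static void main(String[] args) {
        Account a = new Account();
        a.depositSplit(10, () -> a.withdraw(10));
        System.out.println(a.balance());  // the +10 and -10 do not cancel out
    }
}
```

With the withdrawal squeezed between the two critical sections the balance ends at 10 instead of 0; with depositAtomic the two updates cancel as expected.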

Atomicity

Serial execution: each atomic block executes contiguously
Serializable execution: equivalent to some serial execution

    Thread 1                          Thread 2
    atomic {                          atomic {
      synchronized(m) {                 x = 0;
        t1 = bal;                       y = 0;
        bal = t1 + 10;                }
      }
    }

[Diagram: two traces of these blocks built from the operations acquire(m), t1 = bal, bal = t1 + 10, release(m), x = 0, y = 0]

Atomic code blocks should be serializable on every execution
– sequential reasoning: easier to debug & verify code
– atomicity violations often reveal synchronization defects
– matches practice: most Java methods are atomic

Race Freedom and Atomicity: Complementary Correctness Properties

Race-freedom
– program behaves as if running on a sequentially consistent memory model
Atomicity
– program behaves as if each atomic block is executed serially

Dynamic Checkers

Race Detectors
  Incomplete (false alarms): Eraser [SBN 97] ...
  Complete (no false alarms): Happens Before [Lamport 78] ...

Atomicity Checkers
  • Atomizer: based on LockSet, 35% false alarms [Flanagan-Freund 04]
  • block-based, commit-node [Wang-Stoller 06]
  • 2-phase lock [Xu-Bodik-Hill 06]
  • model checking [Hatcliff et al. 04]
  Complete (no false alarms): Velodrome, a complete dynamic checker for atomicity

    int x = 0;
    volatile int b = 1;

    Thread 1                          Thread 2
    while (true) {                    while (true) {
      loop until b == 1;                loop until b == 2;
      atomic {                          atomic {
        x = x + 100;                      x = x - 100;
        b = 2;                            b = 1;
      }                                 }
    }                                 }

Thread i accesses x only when b == i

Execution Trace

    Thread 1                          Thread 2
    while (true) {                    while (true) {
      loop until b == 1;                loop until b == 2;
      atomic {                          atomic {
        x = x + 100;                      x = x - 100;
        b = 2;                            b = 1;
      }                                 }
    }                                 }

Trace (time runs downward):

    Thread 1                  Thread 2
                              test b == 2
    atomic {
      t1 = x
      x = t1 + 100
                              test b == 2
      b = 2
                              test b == 2
    }
                              atomic {
                                t2 = x
    test b == 1
                                x = t2 - 100
                                b = 1
                              }
    test b == 1
    atomic {
      t1 = x
      x = t1 + 100
      b = 2
    }
                              test b == 2
                              atomic {
                                t2 = x
                                x = t2 - 100

Happens-Before Ordering on Operations

Edge kinds: program order, synchronization order, communication order

    Thread 1                  Thread 2
                              test b == 2
    atomic {
      t1 = x
      x = t1 + 100
                              test b == 2
      b = 2
                              test b == 2
    }
                              atomic {
                                t2 = x
    test b == 1
                                x = t2 - 100
                                b = 1
                              }
    test b == 1
    atomic {
      t1 = x
      x = t1 + 100
      b = 2
    }
                              test b == 2
                              atomic {
                                t2 = x
                                x = t2 - 100
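
As a rough illustration of how these three kinds of ordering could be derived from a stream of events, the sketch below records an edge whenever an operation follows an earlier operation of the same thread (program order), an acquire follows a release of the same lock (synchronization order), or an access conflicts with an earlier access to the same variable (communication order). Each operation is tagged by the caller with a label for the transaction it belongs to. The class HbEdges, its method names, and the string labels are assumptions made for this example, not Velodrome's data structures.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: derives happens-before edges between labeled transactions.
final class HbEdges {
    private final Map<String, String> lastOpOfThread = new HashMap<>();  // program order
    private final Map<String, String> lastRelease = new HashMap<>();     // synchronization order
    private final Map<String, String> lastWrite = new HashMap<>();       // communication order
    private final Map<String, Map<String, String>> lastReads = new HashMap<>();
    final Set<String> edges = new LinkedHashSet<>();

    private void edge(String from, String to) {
        // Orderings inside one transaction need no edge.
        if (from != null && !from.equals(to)) edges.add(from + " -> " + to);
    }

    private void step(String thread, String tx) {
        edge(lastOpOfThread.put(thread, tx), tx);   // previous operation of this thread
    }

    void acquire(String thread, String tx, String lock) {
        step(thread, tx);
        edge(lastRelease.get(lock), tx);
    }

    void release(String thread, String tx, String lock) {
        step(thread, tx);
        lastRelease.put(lock, tx);
    }

    void read(String thread, String tx, String var) {
        step(thread, tx);
        edge(lastWrite.get(var), tx);               // write happens before this read
        lastReads.computeIfAbsent(var, k -> new HashMap<>()).put(thread, tx);
    }

    void write(String thread, String tx, String var) {
        step(thread, tx);
        edge(lastWrite.put(var, tx), tx);           // earlier write happens before this write
        Map<String, String> readers = lastReads.getOrDefault(var, new HashMap<>());
        for (String reader : readers.values()) edge(reader, tx);  // earlier reads too
    }

    public static void main(String[] args) {
        HbEdges h = new HbEdges();
        h.read("T2", "U1", "b");                    // test b == 2 (unary)
        h.read("T1", "A", "x");                     // atomic { t1 = x
        h.write("T1", "A", "x");                    //   x = t1 + 100
        h.write("T1", "A", "b");                    //   b = 2 }
        h.read("T2", "U2", "b");                    // test b == 2 (unary)
        h.read("T2", "B", "x");                     // atomic { t2 = x
        System.out.println(h.edges);
    }
}
```

Because edges are recorded between transaction labels rather than between individual operations, the result is already the ordering lifted to transactions.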

A transaction is a dynamic execution of an atomic block

Lift HB ordering on operations to HB ordering on transactions

Theorem: Transactional HB order has no cycles if and only if the trace is serializable

    Thread 1                  Thread 2
                              test b == 2
    atomic {
      t1 = x
      x = t1 + 100
                              test b == 2
      b = 2
                              test b == 2
    }
                              atomic {
                                t2 = x
    test b == 1
                                x = t2 - 100
                                b = 1
                              }
    test b == 1
    atomic {
      t1 = x
      x = t1 + 100
      b = 2
    }
                              test b == 2
                              atomic {
                                t2 = x
                                x = t2 - 100
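
The theorem suggests a direct check: keep a graph whose nodes are transactions and refuse any happens-before edge that would close a cycle. The sketch below is one simple way to do that, assuming integer transaction ids and a hypothetical TxGraph class. It re-runs a depth-first search for each new edge, which is chosen for clarity and is not claimed to be how Velodrome itself is implemented.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: transactional happens-before graph with a cycle check.
final class TxGraph {
    private final Map<Integer, Set<Integer>> succ = new HashMap<>();

    /** Returns true if 'to' is reachable from 'from' along happens-before edges. */
    boolean reaches(int from, int to) {
        Deque<Integer> work = new ArrayDeque<>();
        Set<Integer> seen = new HashSet<>();
        work.push(from);
        while (!work.isEmpty()) {
            int n = work.pop();
            if (n == to) return true;
            if (seen.add(n)) {
                for (int s : succ.getOrDefault(n, Collections.emptySet())) work.push(s);
            }
        }
        return false;
    }

    /**
     * Records that transaction 'from' happens before transaction 'to'.
     * Returns false, leaving the graph unchanged, if the edge would close a cycle.
     */
    boolean addEdge(int from, int to) {
        if (from == to) return true;           // ordering inside one transaction
        if (reaches(to, from)) return false;   // cycle: the trace is not serializable
        succ.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        return true;
    }

    public static void main(String[] args) {
        TxGraph g = new TxGraph();
        System.out.println(g.addEdge(2, 3));   // transaction 2 reads x, then 3 writes x
        System.out.println(g.addEdge(3, 2));   // then 2 writes x after 3: closes a cycle
    }
}
```

A false result from addEdge corresponds to a cycle in the transactional HB order, which by the theorem means the observed trace is not serializable.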

Equivalent Serial Trace

    Thread 1                                  Thread 2
                                              test b == 2
                                              test b == 2
    atomic { t1 = x; x = t1 + 100; b = 2 }
                                              test b == 2
    test b == 1
                                              atomic { t2 = x; x = t2 - 100; b = 1 }
    atomic { t1 = x; x = t1 + 100; b = 2 }
                                              test b == 2
                                              atomic { t2 = x

Atomicity Violation

    Thread 1                          Thread 2
    while (true) {                    while (true) {
      loop until b == 2;   <-- X        loop until b == 2;
      atomic {                          atomic {
        x = x + 100;                      x = x - 100;
        b = 2;                            b = 1;
      }                                 }
    }                                 }

Trace:

    Thread 1                                  Thread 2
    atomic { ... b = 2 }
                                              test b == 2
                                              atomic {
                                                t2 = x
    atomic { t1 = x; x = t1 + 100; b = 2 }
                                                x = t2 - 100
                                                b = 1
                                              }

Cycle in transactional HB order
=> trace is not serializable
=> report atomicity violation

The Hard Part... Scaling

Program may execute billions of instructions and billions of transactions
– esp. unary transactions
How to represent transactional HB order?
– Infeasible to allocate and keep entire HB graph
– Clock vectors not applicable

Garbage Collection

If completed transaction has no in-edges
– will never have in-edges
– will never be in a cycle
– can be collected
Keep HB graph acyclic
– cycles always indicate errors
Reference counting GC
– triggered when transaction completes

[Diagram: HB graph for the example trace]
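
A small sketch of the reference-counting idea, under stated assumptions: TxNode and RefCountGc are hypothetical names, each node counts its incoming edges, and a completed node with a count of zero is dropped along with its outgoing edges, which may in turn make its successors collectable. Edges from an already collected node are ignored, since such a node can never be part of a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of reference-counting collection of transaction nodes.
final class TxNode {
    final String name;
    final List<TxNode> out = new ArrayList<>();  // happens-before successors
    int inCount = 0;                             // number of incoming edges
    boolean completed = false;

    TxNode(String name) { this.name = name; }
}

final class RefCountGc {
    final Set<TxNode> live = new HashSet<>();

    TxNode begin(String name) {
        TxNode n = new TxNode(name);
        live.add(n);
        return n;
    }

    void addEdge(TxNode from, TxNode to) {
        if (!live.contains(from)) return;        // collected source: edge carries no information
        if (from != to && !from.out.contains(to)) {
            from.out.add(to);
            to.inCount++;
        }
    }

    /** Called when a transaction finishes; collects every node that became garbage. */
    void complete(TxNode n) {
        n.completed = true;
        Deque<TxNode> work = new ArrayDeque<>();
        work.push(n);
        while (!work.isEmpty()) {
            TxNode c = work.pop();
            if (!c.completed || c.inCount != 0 || !live.remove(c)) continue;
            for (TxNode s : c.out) {
                s.inCount--;                     // c's outgoing edges disappear with it
                work.push(s);                    // s may now be collectable too
            }
            c.out.clear();
        }
    }

    public static void main(String[] args) {
        RefCountGc gc = new RefCountGc();
        TxNode a = gc.begin("A");
        TxNode b = gc.begin("B");
        gc.addEdge(a, b);
        gc.complete(b);                          // B still has an in-edge from A
        System.out.println(gc.live.size());
        gc.complete(a);                          // A goes, then B
        System.out.println(gc.live.size());
    }
}
```

In this run B stays alive while A can still precede it, and both nodes are reclaimed as soon as A completes.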

Avoiding Allocation and Node Re-Use

Need to avoid allocating billions of unary transaction nodes
If unary transaction has no in-edges
– will never have in-edges
– will never be in a cycle
– not even allocated!
If unary transaction has a single in-edge
– re-use predecessor node
– avoids allocation

[Diagram: example trace with unary transactions (test b == 2, test b == 1) around one atomic block]

Transactional HB graph represented with a single node with no loss of precision
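
The two rules above can be read as a small decision procedure for which graph node should stand for a unary transaction. The sketch below assumes a hypothetical UnaryNodes helper in which collected predecessors are passed as null; the third case (several distinct predecessors, so a fresh node is allocated) is an assumption added to make the example complete and is not stated on the slide.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch: choosing the node that represents a unary transaction.
final class UnaryNodes {
    static final class Node {
        final Set<Node> preds;                   // in-edges of this node
        Node(Set<Node> preds) { this.preds = preds; }
    }

    /**
     * 'preds' lists the nodes that must happen before the unary transaction.
     * Collected predecessors are passed as null and ignored; duplicates count once.
     */
    static Node nodeForUnary(Node... preds) {
        Set<Node> live = new LinkedHashSet<>();
        for (Node p : preds) {
            if (p != null) live.add(p);
        }
        if (live.isEmpty()) return null;                      // no in-edges: nothing allocated
        if (live.size() == 1) return live.iterator().next();  // single in-edge: re-use predecessor
        return new Node(live);                                // assumed fallback: allocate a node
    }

    public static void main(String[] args) {
        Node a = new Node(new LinkedHashSet<>());
        System.out.println(nodeForUnary() == null);           // no predecessors
        System.out.println(nodeForUnary(a, null, a) == a);    // one distinct predecessor
    }
}
```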

Velodrome: part of RoadRunner Instrumentation Framework

    Java bytecode plus atomicity annotations
      -> RoadRunner Instrumenter (+ BCEL)
      -> Instrumented bytecode
      -> event stream
      -> Velodrome Atomicity Checker; other analyses, e.g. Atomizer, Eraser, ...

Event stream:

    T1: begin_atomic
    T2: acquire(lock3)
    T1: read(x, 5)
    T2: write(y, 3)
    T1: end_atomic
    T2: release(lock3)

Output: Error: method is not atomic at line 43

[Diagram: transactional HB graph for the example trace]

Experimental Results: Performance

[Chart comparing Base, Instrumentation Framework, Atomizer, and Velodrome]

GC & Reuse reduce graph size by 10^4 - 10^6
Max. graph size usually < 20

Experimental Results: Precision

14 Java programs, ~250 KLOC
– includes jigsaw, JavaGrande, specJBB, specJVM

238 Atomizer warnings
– 35% false alarms

Experimental Results: Precision

14 Java programs, ~250 KLOC
– includes jigsaw, JavaGrande, specJBB, specJVM

105 Atomizer-only warnings
– 80% false alarms
133 Velodrome errors
– 0% false alarms

Experimental Results: Precision

14 Java programs, ~250 KLOC
– includes jigsaw, JavaGrande, specJBB, specJVM

84 Atomizer false alarms
21 Atomizer-only errors
133 Velodrome errors
– 0% false alarms

Increasing Coverage: Directed Scheduling

    atomic void deposit(int n) {
      t1 = bal;
      // other thread may
      // modify bal here...
      bal = t1 + n;
    }

[Diagram: JVM, Atomizer, warnings, Velodrome]

On each Atomizer warning, pause executing thread
Improves probability of executing non-serializable trace
– from 30% to 70% on some benchmarks
– currently exploring different scheduling policies
Similar to [Sen PLDI 08] but for atomicity

Conclusions

Velodrome
– a sound and complete dynamic analysis for atomicity
– performance competitive with earlier analyses
– no false alarms => find and fix real bugs first
– does careful blame assignment (see paper)
Directed scheduling
– improves coverage of complete dynamic analyses
Atomicity
– enables sequential reasoning
– detects synchronization defects