Abstract Transformers for Thread Correlation Analysis Michal Segalov
Michal Segalov, TAU; Tal Lev-Ami, TAU; Roman Manevich, TAU; G. Ramalingam, MSR India; Mooly Sagiv, TAU
Motivation
- A novel approach for static analysis of highly concurrent algorithms
  - Verify correctness
  - Alert on (possible) bugs
- Challenges
  - Fine-grained synchronization
    - Requires subtle reasoning on thread interference
  - Heap data structures
    - Unbounded state space
Concurrent Set [M. Michael, SPAA'02]
- Set implemented by linked list
- Heavy use of CAS (Compare and Swap)
- Fine-grained concurrency

remove(key) {
  while (true) {
    <prev, cur, next, found> = locate(key)
    if (!found) return false;
    if (!CAS(cur.next, <0, next>, <1, next>)) continue;
    if (CAS(prev.next, <0, cur>, <0, next>))
      DeleteNode(cur);
    else
      locate(key);
    return true;
  }
}

add(node) {
  while (true) {
    <prev, cur, next, found> = locate(node.key)
    node.next = cur
    if (CAS(prev.next, <0, cur>, <0, node>))
      return true;
  }
}

locate(key) {
restart:
  pred = Head;
  <tmp, curr> = pred.next;
  while (true) {
    if (curr == null) return <null, false>;
    <cmark, next> = curr.next;
    ckey = curr.key;
    if (pred.next != <0, curr>) goto restart;
    if (!cmark) {
      if (ckey >= key) return <pred, curr, next, (key == ckey)>
      pred = curr;
    } else {
      if (CAS(pred.next, <0, curr>, <0, next>))
        DeleteNode(curr);
      else
        goto restart;
    }
    curr = next;
  }
}
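The role of the mark bit in the CAS calls above can be illustrated with a small sequential Python model. This is only a sketch under assumptions: `Node`, `cas`, and the three-node list are hypothetical names introduced here, and the `cas` function is an ordinary function, not an atomic hardware instruction.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(eq=False)
class Node:
    key: int
    # The next field is a (mark, successor) pair that is updated as one unit.
    next: Tuple[int, Optional["Node"]] = (0, None)


def cas(node, expected, new):
    """Sequential model of compare-and-swap on node.next (not truly atomic)."""
    mark, succ = node.next
    if mark == expected[0] and succ is expected[1]:
        node.next = new
        return True
    return False


# Hypothetical list: 1 -> 2 -> 4
tail = Node(4)
mid = Node(2, (0, tail))
head = Node(1, (0, mid))

# Logical removal: set the mark bit of node 2 without changing its successor.
marked = cas(mid, (0, tail), (1, tail))
# An insert after node 2 now fails, because it expects the mark bit to be 0.
inserted = cas(mid, (0, tail), (0, Node(3)))
# Physical removal: unlink node 2 from its predecessor.
unlinked = cas(head, (0, mid), (0, tail))
```

Marking before unlinking is what makes the later insert attempt fail instead of attaching a new node to a node that is about to leave the list.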
remove(key) {
  while (true) {
    <prev2, cur2, next2, found> = locate(key)
    if (!found) return false;
    if (!CAS(cur2.next, <0, next2>, <1, next2>)) continue;
    if (CAS(prev2.next, <0, cur2>, <0, next2>))
      DeleteNode(cur2);
    else
      locate(key);
    return true;
  }
}

add(node1) {
  while (true) {
    <prev1, cur1, next1, found> = locate(node1.key)
    node1.next = cur1
    if (CAS(prev1.next, <0, cur1>, <0, node1>))
      return true;
  }
}

[Figure: linked list with Head and nodes 1, 2 (marked, m), 4, plus new node 3. Thread Tr: remove(2) holds prev2, cur2, next2. Thread Ta: add(3) holds prev1, cur1, node1. Annotation: CAS fails due to mark bit.]
Detecting a Bug
- A node is removed before it is marked

remove(key) {
  while (true) {
    <prev, cur, next, found> = locate(key)
    if (!found) return false;
    if (!CAS(cur.next, <0, next>, <1, next>)) continue;
    if (CAS(prev.next, <0, cur>, <0, next>))
      DeleteNode(cur);
    else
      locate(key);
  }
}
Concurrent Set [M. Michael, SPAA'02]

remove(key) {
  while (true) {
    <prev2, cur2, next2, found> = locate(key)
    if (!found) return false;
    if (CAS(prev2.next, <0, cur2>, <0, next2>))
      DeleteNode(cur2);
    if (!CAS(cur2.next, <0, next2>, <1, next2>)) continue;
    else
      locate(key);
  }
}

add(node1) {
  while (true) {
    <prev1, cur1, next1, found> = locate(node1.key)
    node1.next = cur1
    if (CAS(prev1.next, <0, cur1>, <0, node1>))
      return true;
  }
}

[Figure: linked list with Head and nodes 1, 2, 4, plus new node 3. Thread Tr: remove(2) holds prev2, cur2, next2. Thread Ta: add(3) holds prev1, cur1, node1. Annotation: a memory leak.]
Main Results
- Thread-correlation analysis
  - A new kind of thread-modular analysis
- Precise enough to prove properties of fine-grained concurrent programs
  - Not automatically proven before
- Two transformer enhancements
  - Summarizing Effects
  - Summarizing Abstraction
  - On a concurrent set implementation the speedup is x34!
Thread-modular Abstraction
- Abstraction from point of view of one thread
  - Maintains local store and global store precisely
  - Abstracts away local stores of all other threads
- Naturally handles unbounded number of threads
- Imprecise modeling of thread interactions
  - Fine-grained concurrency

[Figure: program state seen from thread t. Label: main thread, precise information.]
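The bullets above can be made concrete with a small Python sketch. It assumes a concrete state is a global store plus one local store per thread; the function name `thread_modular`, the `pc` field, and the example states are hypothetical and only loosely follow the consumer threads used later in the talk.

```python
def thread_modular(global_store, local_stores):
    """Abstract a concrete state into a set of 1-thread factoids.

    Each factoid pairs one thread's local store with the global store.
    Thread identities and the number of threads are forgotten.
    """
    g = tuple(sorted(global_store.items()))
    return {(tuple(sorted(loc.items())), g) for loc in local_stores.values()}


# Hypothetical state: C1 is at label 6, the other consumers wait at label 4.
few = thread_modular({"empty": True},
                     {"C1": {"pc": 6}, "C2": {"pc": 4}, "C3": {"pc": 4}})

# The same shape with many more waiting threads.
many_locals = {"C1": {"pc": 6}}
many_locals.update({f"C{i}": {"pc": 4} for i in range(2, 100)})
many = thread_modular({"empty": True}, many_locals)

# A state with two threads at label 6 at the same time.
bad = thread_modular({"empty": True},
                     {"C1": {"pc": 6}, "C2": {"pc": 6}, "C3": {"pc": 4}})
```

In this sketch `few` and `many` are the same set of two factoids, which is why the abstraction copes with any number of threads. The set `bad` is also identical to `few`, which illustrates the imprecision: the abstraction cannot tell whether one or two threads are at label 6.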
Thread Correlation Abstraction
- Refines thread-modular abstraction to reason about thread interactions
- Tracks correlations between local stores of every two threads
- 3 levels of abstraction
  - Main thread
  - Secondary thread
  - All other threads
- Main-Second abstracted asymmetrically

[Figure labels: main thread, precise information; secondary thread, track less precisely.]
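A minimal Python sketch of tracking pairs of local stores, under the same assumed state model (a global store plus one local store per thread). The name `thread_correlation` and the example states are hypothetical. As a simplification, this sketch keeps both threads of a pair at full precision, whereas the talk abstracts the secondary thread more coarsely.

```python
from itertools import permutations


def thread_correlation(global_store, local_stores):
    """Abstract a concrete state into a set of 2-thread factoids.

    Each factoid is (main local store, secondary local store, global store),
    one for every ordered pair of distinct threads.
    """
    g = tuple(sorted(global_store.items()))
    frozen = {t: tuple(sorted(loc.items())) for t, loc in local_stores.items()}
    return {(frozen[a], frozen[b], g) for a, b in permutations(frozen, 2)}


good = thread_correlation({"empty": True},
                          {"C1": {"pc": 6}, "C2": {"pc": 4}, "C3": {"pc": 4}})
bad = thread_correlation({"empty": True},
                         {"C1": {"pc": 6}, "C2": {"pc": 6}, "C3": {"pc": 4}})

# The factoid that records two distinct threads at label 6 simultaneously.
both_at_6 = ((("pc", 6),), (("pc", 6),), (("empty", True),))
```

Here `both_at_6` appears in `bad` but not in `good`, so a property such as "no two threads are at label 6 at once" becomes expressible, which the 1-thread view cannot capture.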
Singleton Buffer Example

boolean empty = true;
Object b = null;

produce() {
1: Object p = new();
2: await (empty) then {
     b = p;
     empty = false;
   }
3: }

consume() {
   Object c;
4: await (!empty) then {
     c = b;
     empty = true;
   }
5: use(c);
6: dispose(c);
7: }

Properties to verify:
- Safe Dereference
- No Double free
Thread Modular Abstraction
[Figure, two animation frames: a concrete state with consumer threads C1..C4 (locals c1..c4, at labels 4: or 6:) and the global empty, next to its 1-thread factoids. Each factoid shows one thread's label and local together with empty.]
Thread Correlation Abstraction
[Figure: the same concrete state with consumer threads C1..C4, next to its 2-thread factoids for the pairs (C1, C2), (C1, C3), (C1, C4), (C2, C1), (C2, C3), (C2, C4), ... Each 2-thread factoid shows the labels (4: or 6:) and locals of both threads together with empty.]
Concretization Example
[Figure: a set of 2-thread factoids for the pairs (C1, C2), (C1, C3), (C1, C4), (C2, C1), (C2, C3), (C2, C4), ... and a concrete state with threads C1..C4 (labels 4: or 6:) and the global empty that they represent.]
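The relation between a concrete state and a set of 2-thread factoids can be sketched in Python. This assumes the usual powerset-style concretization, where a concrete state is represented exactly when every one of its 2-thread factoids is in the abstract set; the slide does not spell this out, and the names `pair_factoids` and `in_concretization` are hypothetical.

```python
from itertools import permutations


def pair_factoids(global_store, local_stores):
    """2-thread factoids of a concrete state given as a list of local stores."""
    g = tuple(sorted(global_store.items()))
    frozen = [tuple(sorted(loc.items())) for loc in local_stores]
    return {(a, b, g) for a, b in permutations(frozen, 2)}


def in_concretization(global_store, local_stores, abstract):
    """True if every pair of threads in the state is covered by a factoid."""
    return pair_factoids(global_store, local_stores) <= abstract


# Abstract element obtained from one thread at label 6 and two at label 4.
abstract = pair_factoids({"empty": True}, [{"pc": 6}, {"pc": 4}, {"pc": 4}])

# More waiting threads: still represented.
ok = in_concretization({"empty": True},
                       [{"pc": 4}] * 5 + [{"pc": 6}], abstract)

# Two threads at label 6: not represented.
not_ok = in_concretization({"empty": True},
                           [{"pc": 6}, {"pc": 6}, {"pc": 4}], abstract)
```

The first check succeeds and the second fails, since no factoid in `abstract` places two distinct threads at label 6.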
Abstractions Compared
- Thread-modular abstraction
  - 2 levels of abstraction
  - Main thread: precise information
- Thread-correlation abstraction
  - 3 levels of abstraction
  - Main thread: precise information
  - Secondary thread: track less precisely
Point-wise Transformer
[Figure: statement 6: C1: dispose(c1) applied to a single factoid with c1, c3, b, and empty; C1 moves from label 6: to label 7:.]
Point-wise Transformer
[Figure: statement 6: C1: dispose(c1) applied to a factoid with b, c1, c2, c3, c4, and empty, where another thread is at label 5:.]
- Is 6: C1: dispose(c1) safe?
  - Single factoid: no
  - All factoids: yes!
Build 3-Thread Factoids (model effect C1 has on C2)
- C1: Executing
- C2: Tracked
- C3: Other
[Figure: 2-thread factoids for the pairs (C1, C2), (C1, C3), (C1, C4), (C2, C1), (C2, C3), (C2, C4), with labels 4: or 6: and the global empty, used as building blocks.]
3-Thread Factoids
- C1: Executing
- C2: Tracked
- C3: Other
[Figure: 2-thread factoids for (C1, C2), (C1, C3), (C2, C1), (C2, C3) combined into 3-thread factoids over c1, c2, c3, and empty.]
6: C1: dispose(c) (exec)
- C1: Executing
- C2: Tracked
- C3: Other
[Figure: 3-thread factoids over c1, c2, c3, and empty before and after C1 executes dispose(c); C1 moves from label 6: to label 7:.]
6: C1: dispose(c) (project)
- C1: Executing
- C2: Tracked
- C3: Other
[Figure: the resulting 3-thread factoids projected back onto 2-thread factoids for the pairs (C2, C1) and (C2, C3).]
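The build, exec, and project steps on the preceding slides can be sketched in Python. This is a simplified illustration, not the transformer from the paper: local stores are reduced to a program label, the global store to the `empty` flag, and the names `baseline_step` and `consume_step` are hypothetical.

```python
def baseline_step(factoids, step):
    """Apply one statement, executed by the main thread of each factoid.

    factoids: set of (main_local, second_local, global_store) triples.
    step(local, global_store): returns (new_local, new_global), or None
    when the statement is not enabled.
    Returns the input factoids plus every factoid produced by the step.
    """
    locals_seen = {f[0] for f in factoids} | {f[1] for f in factoids}
    out = set(factoids)
    for (e, t, g) in factoids:
        effect = step(e, g)
        if effect is None:
            continue
        e2, g2 = effect
        # Views that involve the executing thread itself.
        out.add((e2, t, g2))
        out.add((t, e2, g2))
        # Build 3-thread factoids (e, t, o), keeping only those whose
        # pairwise views are all known, then project onto (t, o).
        for o in locals_seen:
            views = [(e, o), (o, e), (t, o), (o, t), (t, e)]
            if all((a, b, g) in factoids for a, b in views):
                out.add((t, o, g2))
                out.add((o, t, g2))
    return out


def consume_step(pc, empty):
    """Label 4 of consume(): take the item if the buffer is not empty."""
    if pc == 4 and not empty:
        return 5, True
    return None


# All consumers wait at label 4 and the buffer is full (empty == False).
start = {(4, 4, False)}
after = baseline_step(start, consume_step)
```

After one step the result contains `(5, 4, True)`, `(4, 5, True)`, and the bystander view `(4, 4, True)`, but no factoid with two threads at label 5. The bystander view is the part that needs the third thread: it records how the executing thread's update to `empty` is seen by two threads that did not move.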
Transformers Spectrum
- Point-wise transformer (thread-modular): efficient, imprecise
- With Summary Abstraction: precise enough, efficient
- With Summarizing Effects: precise enough, more efficient
- Baseline transformer: precise enough, quadratic blow-ups
- Most-precise transformer: precise, incomputable?
Reducing Quadratic Blow-ups
- |3-thread factoids| = O(|2-thread factoids|^2)
- Summarizing Effects
  - Memoize computations on common sub states
  - No over-approximation
- Summary Abstraction
  - Aggressive abstraction to executing thread
  - Crucial for performance
Memoizing PCs
- C1: Executing
- C2: Tracked
- C3: Other
[Figure: statement 6: C1: dispose(c). 2-thread factoids for (C1, C2), (C1, C3), (C2, C1), (C2, C3) are combined into 3-thread factoids, exec moves C1 from label 6: to label 7:, and proj maps the results back to 2-thread factoids for (C2, C1) and (C2, C3). The other threads appear at labels 4:, 5:, or 6:.]
Memoizing PCs
- C1: Executing
- C2: Tracked
- C3: Other
- These states are identical up to the PCs, which are invisible to the executing thread
[Figure: statement 6: C1: dispose(c) with 2-thread factoids for (C1, C2), (C1, C3), (C2, C1), (C2, C3) that differ only in the labels (4:, 5:, 6:) of the non-executing threads.]
Memoizing PCs
- C1: Executing
- C2: Tracked
- C3: Other
[Figure: statement 6: C1: dispose(c). The exec step is applied once to the shared part of the 3-thread factoids, and the labels of the other threads (4:, 5:, 6:) are kept aside as a frame and reattached before proj produces the 2-thread factoids for (C2, C1) and (C2, C3).]
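The memoization idea can be illustrated with a short Python sketch. It is a simplification under assumptions: a 3-thread factoid is reduced to three labels plus the `empty` flag, the executing thread is assumed to see only its own label and the global store, and the names `exec_dispose` and `apply_to_three_thread` are hypothetical.

```python
from functools import lru_cache
from itertools import product

calls = 0  # counts how often the statement is really executed


@lru_cache(maxsize=None)
def exec_dispose(exec_pc, empty):
    """Effect of 6: dispose(c) on the part of the state the executor sees."""
    global calls
    calls += 1
    return (7, empty) if exec_pc == 6 else None


def apply_to_three_thread(factoid):
    """Reuse the cached effect and reattach the untouched frame."""
    exec_pc, tracked_pc, other_pc, empty = factoid
    effect = exec_dispose(exec_pc, empty)
    if effect is None:
        return None
    new_pc, new_empty = effect
    # Frame: the labels of the tracked and other threads are unchanged.
    return (new_pc, tracked_pc, other_pc, new_empty)


# Four 3-thread factoids that differ only in the labels of the bystanders.
three = [(6, t, o, True) for t, o in product((4, 5), repeat=2)]
results = [apply_to_three_thread(f) for f in three]
```

All four factoids share a single execution of the statement, so `calls` ends at 1. Because the cached effect is exact for the visible part of the state, this sketch loses no precision, in line with the "no over-approximation" claim for summarizing effects.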
Evaluation
- Implemented on top of TVLA
  - Unbounded number of threads
  - Unbounded number of objects
- Thread-modular not precise enough
- Thread-correlation analysis proved required properties
- Reproduced injected errors
Speedups Relative to Baseline
Related Work
- Thread-modular abstractions
  - Finite-state model checking [Flanagan & Qadeer, SPIN'03]
  - Environment abstraction [Clarke et al., VMCAI'06, TACAS'08]
- Thread-modular shape analysis
  - Coarse-grained concurrency [Gotsman et al., PLDI'07]
  - Fine-grained concurrency [SAS'08, CAV'08]
Summary
- New analysis for concurrent systems
  - Thread-correlation abstraction
  - Handles unbounded number of threads
- Two important transformer enhancements
  - Summarizing effects
  - Summary abstraction
  - Reduce quadratic blow-ups
- Empirically evaluated
Thanks!
Which Properties Did You Prove?
[Table: columns Data Structure Invariants, Linearization; rows Hand Over Hand, DCAS, Lazy List, Maged Opt.]
Why 3 Levels of Abstraction?
- Generalizes naturally by maintaining k local stores in second level
  - k=1 suffices for our benchmarks
  - Same principles for optimizing
- More than 3 levels of abstraction complicate reasoning; usefulness not obvious
What is the Increment Relative to CAV'08?
- CAV'08 uses two levels of abstraction (thread-modular)
- Baseline transformer is too expensive: it timed out on some of our benchmarks
Does Baseline Transformer Make Sense?
- Transformer used by earlier CAV'08 paper
- Starting point of [Flanagan & Qadeer, SPIN'03]
- We added optimizations by distinguishing 3 levels of abstraction
Which Properties Did You Prove?
[Table: columns Data Structure Invariants, Linearization; rows Hand Over Hand, DCAS, CAS, Lazy List, Michael Opt. Annotation: (And in other thread too).]
Summary Abstraction
- Sound approximation heuristics
- Details in paper
[Figure labels: baseline transformer, precise reasoning, coarse reasoning; with summary abstraction, reduce preciseness, coarse reasoning.]
Running Times
Types of Algorithms
[Table: columns Lock free, Wait free; rows Hand Over Hand, DCAS, Lazy List, Michael, Michael Opt; entries include "No" and "(locate)".]
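Baseline Transformer
[Figure: factoids are combined into 3-thread substate factoids, statement st is executed by the first thread (exec (1st)) or by the tracked thread (exec(tracked)), and the resulting 3-thread substate factoids are mapped back to factoids.]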
Conditions Ensuring No Loss of Precision
- Abstraction does not distinguish between local stores with same footprint
- Footprint is idempotent