Materialization in Shape Analysis with Structural Invariant Checkers

Bor-Yuh Evan Chang, Xavier Rival, George C. Necula
University of California, Berkeley
August 27, 2007, ITU Copenhagen

What’s shape analysis? What’s special?
Shape analysis tracks memory manipulation in a flow-sensitive manner.
• Memory manipulation
  – Particularly important in systems code (in C)
• Flow-sensitive
  – Many important properties
    • E.g., is an object freed? Is a file open?
  – Heap abstracted differently at different points
    • E.g., not based on allocation site

Example: Typestate with shape analysis
    cur = l;
    while (cur != null) {
      assert(cur is red);
      make_purple(cur);
      cur = cur->next;
    }
• "red list" is a program-specific predicate
• The heap abstraction is flow-sensitive: l → "red list" on entry; l → "purple list segment" → cur → "red list" inside the loop
• make_purple(·) could be lock(·), free(·), open(·), …
[Figure: a concrete example heap next to its abstraction.]
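A minimal runnable sketch of the loop above, assuming a `List` node with a `color` field and a trivial `make_purple`. The `Color` enum, the `paint` wrapper, and the `purplelist` checker body are illustrative assumptions, not code from the talk; `purplelist` simply mirrors `redlist`. The two outer assertions are the ones a checker-based analysis would aim to discharge statically; here they only run as dynamic checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { RED, PURPLE } Color;   /* assumed color encoding */

typedef struct List {
    Color color;
    struct List *next;
} List;

/* Checker: every node reachable from l is red. */
bool redlist(const List *l) {
    if (l == NULL) return true;
    return l->color == RED && redlist(l->next);
}

/* Checker: every node reachable from l is purple (assumed to mirror redlist). */
bool purplelist(const List *l) {
    if (l == NULL) return true;
    return l->color == PURPLE && purplelist(l->next);
}

/* Stand-in for any typestate-changing operation (lock, free, open, ...). */
void make_purple(List *n) {
    n->color = PURPLE;
}

/* The loop from the slide, bracketed by the two checkers (hypothetical wrapper). */
void paint(List *l) {
    assert(redlist(l));
    List *cur = l;
    while (cur != NULL) {
        assert(cur->color == RED);   /* "cur is red" */
        make_purple(cur);
        cur = cur->next;
    }
    assert(purplelist(l));
}
```

Inside the loop neither checker holds for the whole list: the part before `cur` is purple and the part from `cur` on is red, which is exactly the intermediate state the analysis has to describe.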

Shape analysis is not yet practical
Usability: Choosing the heap abstraction is difficult
• Space Invader [Distefano et al.]: built-in high-level predicates ("red list")
    - Hard to extend
    + No additional user effort
• TVLA [Sagiv et al.]: parametric in low-level, analyzer-oriented predicates ($\mathit{red}(n) \wedge n \in \mathit{reach}(l)$)
    + Very general and expressive
    - Hard for non-expert
• Our proposal: parametric in high-level, developer-oriented predicates ("red list")
    + Extensible
    + Easier for developers

Shape analysis is not yet practical
Scalability: Finding the right level of abstraction is difficult
• Over-reliance on disjunction for precision
[Figure: the developer pictures a single graph, l → "purple list segment" → cur → "red list"; the shape analyzer instead tracks a disjunction ($\vee$) of many graphs, including emp and graphs where l and cur name the same node.]

Hypothesis
The developer can describe the memory in a compact manner at an abstraction level sufficient for the properties of interest (at least informally).
• Good abstraction is program-specific
[Figure: the developer's abstraction, l → "purple list segment" → cur → "red list", and a question mark on how to hand it to the shape analyzer.]

Observation
Checking code expresses a shape invariant and an intended usage pattern.
    bool redlist(List* l) {
      if (l == null) return true;
      else return l->color == red && redlist(l->next);
    }

Proposal
An automated shape analysis with a memory abstraction parameterized by invariant checkers
    bool redlist(List* l) {
      if (l == null) return true;
      else return l->color == red && redlist(l->next);
    }
[Figure: checkers → shape analyzer.]
• Extensible
  – Abstraction based on the developer-supplied checkers
• Targeted for usability
  – Global data structure specification, local invariant inference
• Targeted for scalability
  – Based on the hypothesis

Shape analysis is an abstract interpretation on memory states with …
• Materialization (partial concretization)
  – To perform strong updates
• And widening for termination
[Figure: materialization exposes individual cells of a "red list" reached from l and cur; widening folds the resulting graphs into l → "purple list segment" → cur → "red list".]

Outline
• Memory abstraction
  – Restrictions on checkers
  – Challenge: Intermediate invariants
• Materialization by forward unfolding
  – Where and how
  – Challenge: Unfolding segments
• Materialization by backward unfolding
  – Challenge: Back pointers
• Deciding where to unfold generically

Abstract memory using checkers
Graphs
• Nodes $\alpha$, $\beta$: values (address or null)
• Points-to edge $\alpha@f \mapsto \beta$: points-to relation
• Checker edge $c(\alpha)$: a checker run, "some number of points-to edges that satisfies checker c"
• Partial run: ?
Example: "Disjointly, $\alpha$->next = $\beta$, $\gamma$->next = $\beta$, and $\beta$ is a list."
• $\alpha@\mathit{next} \mapsto \beta * \gamma@\mathit{next} \mapsto \beta * \mathit{list}(\beta)$, where $*$ separates disjoint memory regions

Checkers as inductive definitions
    bool list(List* l) {
      if (l == null) return true;
      else return list(l->next);
    }
• $\mathit{list}(\alpha) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists\beta.\ (\alpha@\mathit{next} \mapsto \beta * \mathit{list}(\beta))$
• Disjointness: a checker run can dereference any object field only once
[Figure: the checker run list(l) → list(…) → … drawn over a chain of next edges ending in null.]

What can a checker do?
• In this talk, a checker …
  – is a pure, recursive function
  – dereferences any object field only once during a run
  – can dereference only one argument (the traversal argument)
  – has only additional pointer parameters
    bool dll(Dll* l, Dll* prev) {
      if (l == null) return true;
      else return l->prev == prev && dll(l->next, l);
    }
• l is the traversal argument; only fields from the traversal argument are read
• $\mathit{dll}(\alpha, \rho) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists\beta.\ (\alpha@\mathit{prev} \mapsto \rho * \alpha@\mathit{next} \mapsto \beta * \mathit{dll}(\beta, \alpha))$
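As a concrete illustration of the restrictions above, here is the doubly-linked list checker as compilable C, with a small helper for building test lists. The `Dll` struct layout and the `dll_link` helper are assumptions made for this sketch; only `l` is ever dereferenced, and `prev` is only compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Dll {
    struct Dll *prev;
    struct Dll *next;
} Dll;

/* l is the traversal argument; prev is an additional pointer parameter. */
bool dll(const Dll *l, const Dll *prev) {
    if (l == NULL) return true;
    return l->prev == prev && dll(l->next, l);
}

/* Hypothetical helper: link an array of n nodes into a doubly-linked list. */
void dll_link(Dll *nodes, size_t n) {
    assert(nodes != NULL);
    for (size_t i = 0; i < n; i++) {
        nodes[i].prev = (i == 0) ? NULL : &nodes[i - 1];
        nodes[i].next = (i + 1 == n) ? NULL : &nodes[i + 1];
    }
}
```

Starting the checker in the middle of a list also works as long as the extra parameter names the predecessor, for example `dll(&nodes[1], &nodes[0])`.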

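Example checker: Two-level skip list
• $\mathit{skip1}(\alpha) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists\beta, \gamma.\ (\alpha@\mathit{skip} \mapsto \gamma * \alpha@\mathit{next} \mapsto \beta * \mathit{skip0}(\beta, \gamma) * \mathit{skip1}(\gamma))$
• $\mathit{skip0}(\alpha, \gamma) := (\mathbf{emp} \wedge \alpha = \gamma) \vee \exists\beta.\ (\alpha@\mathit{skip} \mapsto \mathit{null} * \alpha@\mathit{next} \mapsto \beta * \mathit{skip0}(\beta, \gamma))$
[Figure: a two-level skip list with skip and next edges.]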
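One way to read the two inductive definitions is as a pair of mutually dependent C checkers. This is a reconstruction under assumptions: the `SkipNode` layout and the explicit null test in `skip0` (needed in C before dereferencing) are not taken from the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SkipNode {
    struct SkipNode *skip;   /* upper-level link, null on lower-level nodes */
    struct SkipNode *next;   /* lower-level link */
} SkipNode;

/* Lower level: a run of nodes with a null skip field, from l up to s. */
bool skip0(const SkipNode *l, const SkipNode *s) {
    if (l == s) return true;
    return l != NULL && l->skip == NULL && skip0(l->next, s);
}

/* Upper level: each node's next chain must reach its skip target. */
bool skip1(const SkipNode *l) {
    if (l == NULL) return true;
    return skip0(l->next, l->skip) && skip1(l->skip);
}
```

Here `skip1` dereferences only `l`, while the second argument of `skip0` is an additional pointer parameter that is only compared, matching the checker restrictions stated earlier.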

Back to the abstract domain …
    bool redlist(List* l) {
      if (l == null) return true;
      else return l->color == red && redlist(l->next);
    }
[Figure: checkers → shape analyzer.]

Challenge: Intermediate invariants
    assert(redlist(l));
    cur = l;
    while (cur != null) {
      make_purple(cur);
      cur = cur->next;
    }
    assert(purplelist(l));
• Before the loop: l → redlist
• After the loop: l → purplelist
• Inside the loop: l → purplelist prefix → cur → redlist suffix
  – Suffix: described by checkers
  – Prefix segment: described by ?

Prefix segments as partial checker runs
• Abstraction: l → purplelist → cur (in general, an edge labeled c)
• Checker run: purplelist(l) → purplelist(…) → purplelist(cur) (in general, c(…) → c(…) → c(…))
• Formula: $c(\cdot) \mathrel{*\!-} c(\cdot)$ ?
  – Doesn’t quite work because we need materialization

Flow function: Unfold and update edges
    x->next = x->next;
• Unfold inductive definition
  – materialize: x->next, x->next
• Strong updates using disjointness of regions
  – update: x->next = x->next
[Figure: the checker edge list at x is unfolded into a disjunction ($\vee$); in the non-empty case x has an explicit next edge followed by list, and the update rewrites that edge.]

Unfolding: where, how, and why ok
    x->next = x->next;
• Where
  – "Reach" a traversal argument with x->next
• How and why ok (concretizations same)
  – By definition
[Figure: materialize: x->next, x->next; the list edge at x is replaced by a disjunction ($\vee$) of its unfolded cases.]

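What about unfolding segments?
• Before: x at $\alpha$, a list segment from $\alpha$ to $\beta$, y at $\beta$, and a list from $\beta$
• materialize: x->next
• After: either $\alpha = \beta$ (x and y name the same node), or $\alpha$ has a next edge to $\gamma$ followed by a list segment from $\gamma$ to $\beta$
• $\mathit{list}(\alpha) \mathrel{*\!-} \mathit{list}(\beta)$ unfolds to $(\mathbf{emp} \wedge \alpha = \beta) \vee (\alpha@f \mapsto \gamma * (\mathit{list}(\gamma) \mathrel{*\!-} \mathit{list}(\beta)))$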

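Segment connector (for unfolding)
• Concrete store $\sigma : \mathit{Val} \to \mathit{Val}$
• Valuation $\nu : \mathit{SymVal} \to \mathit{Val}$
• Definitions: $c(\alpha) := \ldots \vee (M_u * M_f \wedge F) \vee \ldots$
  – $M_u$: "unfolded" points-to
  – $M_f$: "folded" inductive (recursive) calls
  – $F$: pure formula
• $\sigma, \nu \models c(\alpha) \mathrel{*\!=} c'(\alpha')$ iff there exists an $i$ such that $\sigma, \nu \models c(\alpha) \mathrel{*\!=}_{i} c'(\alpha')$
• $[\cdot], \nu \models c(\alpha) \mathrel{*\!=}_{0} c(\alpha')$ iff $\nu(\alpha) = \nu(\alpha')$
• $\sigma, \nu \models c(\alpha) \mathrel{*\!=}_{i+1} c'(\alpha')$ iff there exists a disjunct $(M_u * M_f * c''(\beta) \wedge F)$ such that $\nu$ satisfies $[\mathit{actuals}/\mathit{formals}]F$ and $\sigma, \nu \models [\mathit{actuals}/\mathit{formals}](M_u * M_f * c''(\beta) \mathrel{*\!=}_{i} c'(\alpha'))$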

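Basic properties of segments
• If $\sigma, \nu \models c(\alpha) \mathrel{*\!=} c'(\alpha')$, then $\sigma, \nu \models c(\alpha) \mathrel{*\!-} c'(\alpha')$
  – If $\sigma, \nu \models (c(\alpha) \mathrel{*\!=} c'(\alpha')) * c'(\alpha')$, then $\sigma, \nu \models c(\alpha)$ (elimination)
• $[\cdot], \nu \models c(\alpha) \mathrel{*\!=} c(\alpha)$ (reflexivity)
• If $\sigma, \nu \models (c(\alpha) \mathrel{*\!=} c'(\alpha')) * (c'(\alpha') \mathrel{*\!=} c''(\alpha''))$, then $\sigma, \nu \models c(\alpha) \mathrel{*\!=} c''(\alpha'')$ (transitivity)
[Figure: $\alpha$ → $\alpha'$ → $\alpha''$, with edges labeled c, c', and c''.]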
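These properties can be spot-checked on concrete, acyclic lists by treating a segment as a checker run that stops at a given node instead of at null. This is only a dynamic illustration under assumptions: `list_segment` and the `List` layout are hypothetical names, and the sketch does not model the disjointness that $*$ requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct List {
    struct List *next;
} List;

/* Full checker run: follows next until null (assumes an acyclic list). */
bool list(const List *l) {
    if (l == NULL) return true;
    return list(l->next);
}

/* Partial run of list: starts at l and stops on reaching end. */
bool list_segment(const List *l, const List *end) {
    if (l == end) return true;     /* zero unfoldings: both ends coincide */
    if (l == NULL) return false;   /* ran off the list without meeting end */
    return list_segment(l->next, end);
}
```

For a three-node list a → b → c, `assert(list_segment(&a, &c) && list(&c) && list(&a))` mirrors elimination, `list_segment(&a, &a)` mirrors reflexivity, and `list_segment(&a, &b) && list_segment(&b, &c)` together with `list_segment(&a, &c)` mirrors transitivity.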

Challenge: Back pointers
Example: Removal in doubly-linked lists
• Traversal on ‘next’ field to find element to remove: a dll(null) segment from l to cur, then dll($\gamma$) from cur
• Materialize ‘cur->prev’ and remove ‘cur’: need to unfold "backward" to expose the node $\gamma$ that cur’s prev field points to
• $\mathit{dll}(\alpha, \rho) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists\beta.\ (\alpha@\mathit{prev} \mapsto \rho * \alpha@\mathit{next} \mapsto \beta * \mathit{dll}(\beta, \alpha))$
• $\mathit{dll}'(\alpha, \rho) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists\beta.\ (\alpha@\mathit{next} \mapsto \rho * \alpha@\mathit{prev} \mapsto \beta * \mathit{dll}'(\beta, \alpha))$
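The removal scenario can be sketched as follows. The `value` field used to pick the element and the `dll_remove` name are assumptions for illustration; the slide only says that the traversal finds the element to remove. The read of `cur->prev`, followed by the write through it, is the step that forces the analysis to unfold the segment ending at `cur` backward.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Dll {
    struct Dll *prev;
    struct Dll *next;
    int value;               /* assumed payload used to select a node */
} Dll;

bool dll(const Dll *l, const Dll *prev) {
    if (l == NULL) return true;
    return l->prev == prev && dll(l->next, l);
}

/* Removes the first node holding value, if any; returns the new head. */
Dll *dll_remove(Dll *l, int value) {
    assert(dll(l, NULL));
    Dll *cur = l;
    while (cur != NULL && cur->value != value) {
        cur = cur->next;             /* traversal on the next field */
    }
    if (cur == NULL) return l;       /* nothing to remove */
    Dll *before = cur->prev;         /* needs the node behind cur */
    Dll *after = cur->next;
    if (before != NULL) before->next = after; else l = after;
    if (after != NULL) after->prev = before;
    cur->prev = NULL;
    cur->next = NULL;
    assert(dll(l, NULL));            /* invariant preserved */
    return l;
}
```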

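Backwards unfolding by forwards unfolding
• Start: $\mathit{dll}(\alpha, \mathit{null}) \mathrel{*\!=}_{i+1} \mathit{dll}(\beta, \gamma)$, with $\beta@\mathit{prev} \mapsto \gamma$
• split (lemma): $\mathit{dll}(\alpha, \mathit{null}) \mathrel{*\!=}_{i} \mathit{dll}(\delta, \epsilon)$ and $\mathit{dll}(\delta, \epsilon) \mathrel{*\!=}_{1} \mathit{dll}(\beta, \gamma)$
• unfold forward at $\delta$: $\delta@\mathit{prev} \mapsto \epsilon * \delta@\mathit{next} \mapsto \eta * (\mathit{dll}(\eta, \delta) \mathrel{*\!=}_{0} \mathit{dll}(\beta, \gamma))$
• reduce $\eta = \beta$, $\delta = \gamma$: $\mathit{dll}(\alpha, \mathit{null}) \mathrel{*\!=}_{i} \mathit{dll}(\gamma, \epsilon)$, with $\gamma@\mathit{prev} \mapsto \epsilon$, $\gamma@\mathit{next} \mapsto \beta$, and $\beta@\mathit{prev} \mapsto \gamma$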

Deciding where to unfold
• Observation: Can indicate (with types) what fields may be materialized for a checker parameter
  – types $\tau ::= \{ f_1\langle \ell_1 \rangle, \ldots, f_n\langle \ell_n \rangle \}$: a pointer that may materialize these fields
  – levels $\ell ::= n \mid \mathsf{unk}$: where in the traversal it may be materialized
• Levels are relative to the sequence of calls $c_{-n} \ldots c_{-1}\ c_0\ c_1 \ldots c_m$
  – Level -1: materialized just before this call
  – Level 0: materialized in this call

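Example: Doubly-linked lists
• Parameter types: $\alpha : \{ \mathit{next}\langle 0 \rangle, \mathit{prev}\langle 0 \rangle \}$, $\rho : \{ \mathit{next}\langle -1 \rangle, \mathit{prev}\langle -1 \rangle \}$
• $\mathit{dll}(\alpha, \rho) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists(\beta : \{ \mathit{next}\langle 1 \rangle, \mathit{prev}\langle 1 \rangle \}).\ (\alpha@\mathit{prev} \mapsto \rho * \alpha@\mathit{next} \mapsto \beta * \mathit{dll}(\beta, \alpha))$
• Backward unfolding: parameter $\rho$ has level -1
• Before: the traversal argument had level 0 fields (implicitly)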

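Example: Alternative doubly-linked list
• $\mathit{npdll}(\alpha)$ with $\alpha : \{ \mathit{next}\langle 0 \rangle, \mathit{prev}\langle -1 \rangle \}$
  – $\mathit{npdll}(\alpha) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists(\beta : \{ \mathit{next}\langle 2 \rangle, \mathit{prev}\langle 1 \rangle \}).\ (\alpha@\mathit{next} \mapsto \beta * \mathit{npdll}'(\beta, \alpha))$
• $\mathit{npdll}'(\alpha, \rho)$ with $\alpha : \{ \mathit{next}\langle 1 \rangle, \mathit{prev}\langle 0 \rangle \}$, $\rho : \{ \mathit{next}\langle -1 \rangle, \mathit{prev}\langle -2 \rangle \}$
  – $\mathit{npdll}'(\alpha, \rho) := (\mathbf{emp} \wedge \alpha = \mathit{null}) \vee \exists(\beta : \{ \mathit{next}\langle 1 \rangle, \mathit{prev}\langle 1 \rangle \}).\ (\alpha@\mathit{prev} \mapsto \rho * \mathit{npdll}(\alpha))$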

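Types can be inferred automatically
• Checking
  – For a points-to edge $\alpha@f \mapsto \beta$: $\{ f\langle 0 \rangle \} <: \mathit{typeof}(\alpha)$
  – For a checker edge c at $\beta$: $\mathit{typeof}(\beta) - 1 <: \mathit{declared\_typeof}(\pi)$ (where $c(\pi) := \ldots$)
• Inference using a fixed-point computation with types initialized to $\{\,\}$
[Figure: types drawn as a lattice with $\{ f\langle \mathsf{unk} \rangle, g\langle \mathsf{unk} \rangle \}$ at the top, $\{ f\langle 0 \rangle \}$ and $\{ g\langle 1 \rangle \}$ in the middle, and $\{\,\}$ at the bottom.]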

Summary: Enabling materialization anywhere
• Defined segments as partial checker runs directly (inductively)
  – For forward unfolding
  – Backward unfolding derived from forward unfolding
• Checker parameter types with levels
  – For deciding where to unfold
  – Inferable and does not affect soundness

Summary: Given checkers, everything is automatic
    bool redlist(List* l) {
      if (l == null) return true;
      else return l->color == red && redlist(l->next);
    }
[Figure: checkers → type pre-analysis → abstract interpretation (unfolding and update, widening); the pre-analysis and the abstract interpretation together form the shape analyzer.]

Conclusion
• Invariant checkers can form the basis of a memory abstraction that
  – Is easily extensible on a per-program basis
  – Expresses developer intent
    • Critical for usability
    • Prerequisite for scalability
• Enabling materialization anywhere
  – Inductive segments
  – Pre-analysis on checkers to decide where to unfold robustly

What can checker-based shape analysis do for you?

Challenge: Termination and precision
    last = l;
    cur = l->next;
    while (cur != null) {
      // … cur, last …
      if (…) last = cur;
      cur = cur->next;
    }
• Observation: Previous iterates are "less unfolded"
• Fold into checker edges
  – But where and how much?
• widen (canonicalize, blur)
[Figure: successive iterates with explicit next edges among l, last, and cur are widened into list segments from l to last and from last’s next field to cur, followed by a list from cur.]

History-guided folding
    last = l;
    cur = l->next;
    while (cur != null) {
      if (…) last = cur;
      cur = cur->next;
    }
• Match edges to identify where to fold
• Apply local folding rules
[Figure: the edges of the new iterate are matched against the previous one; a check such as "l, last $\sqsubseteq$ l → list → last?" answers yes, so the region is folded into a list segment from l to last, with a next edge at last and a list from cur.]
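To make the folded result concrete, the loop can be run with the widened invariant asserted at every iteration. This sketch does not implement widening or folding; it only checks dynamically that a shape of the form "segment from l to last, a next edge at last, a segment up to cur, and a list from cur" is stable across iterations. The elided loop condition is replaced by a hypothetical predicate parameter `pick`, and `is_even`, `scan`, `list_segment`, and the `value` field are likewise assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct List {
    int value;               /* assumed payload */
    struct List *next;
} List;

bool list(const List *l) {
    return l == NULL || list(l->next);
}

/* Partial checker run from l that stops at end (acyclic lists only). */
bool list_segment(const List *l, const List *end) {
    if (l == end) return true;
    if (l == NULL) return false;
    return list_segment(l->next, end);
}

/* Hypothetical stand-in for the condition elided on the slide. */
bool is_even(const List *n) {
    return n->value % 2 == 0;
}

/* Concrete reading of the folded graph. */
static bool folded_invariant(const List *l, const List *last, const List *cur) {
    return list_segment(l, last) && last != NULL
        && list_segment(last->next, cur) && list(cur);
}

/* The loop from the slide; returns the final value of last. */
List *scan(List *l, bool (*pick)(const List *)) {
    assert(l != NULL);
    List *last = l;
    List *cur = l->next;
    while (cur != NULL) {
        assert(folded_invariant(l, last, cur));
        if (pick(cur)) last = cur;
        cur = cur->next;
    }
    assert(folded_invariant(l, last, cur));
    return last;
}
```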

Summary: Enabling checker-based shape analysis
• Built-in disjointness of memory regions
  – As in separation logic
  – Checkers read any object field only once in a run
• Generalized segment abstraction
  – Based on partial checker runs
• Generalized folding into inductive predicates
  – Based on iteration history (i.e., a widening operator)
[Figure: graphs over l and cur with list and next edges, folded into a list segment from l to cur followed by a list from cur.]

Experimental results
Benchmark             Lines of Code   Analysis Time   Max. Num. Graphs      Max. Num. Iterations
                                                      at a Program Point    at a Program Point
list reverse                     19         0.007 s                    1                       3
list remove element              27         0.016 s                    4                       6
list insertion sort              56         0.021 s                    4                       7
search tree find                 23         0.010 s                    2                       4
skip list rebalance              33         0.087 s                    6                       7
scull driver                    894         9.710 s                    4                      16
• Verified structural invariants as given by checkers are preserved across data structure manipulation
• Limitations (in scull driver)
  – Arrays not handled (rewrote as linked list), char arrays ignored
• Promising as far as number of disjuncts