Towards a Model-Driven Framework for Dynamic Program Analysis

Thibault Béziers la Fosse, Massimo Tisi, Jean-Marie Mottu
AtlanMod team (Inria, IMT Atlantique, LS2N), Nantes, France

MoDisco - https://www.eclipse.org/modisco/
Well-known model-driven reverse-engineering (MDRE) solution in Eclipse Modeling
[Hugo Brunelière, Jordi Cabot, Grégoire Dupé, Frédéric Madiot. MoDisco: a Model Driven Reverse Engineering Framework. Information and Software Technology, Elsevier, 2014, 56 (8), pp. 1012-1032.]

Beyond Reverse-Engineering
MoDisco components are often also used by other MDE tools, e.g. for:
● technical/functional migration
● refactoring
● retro-documentation
● quality assurance
● business rule extraction
● non-functional property verification

MoDisco challenges
MoDisco addresses the following problems (so the user does not have to):
● Technical heterogeneity
  ○ avoid information loss due to the heterogeneity of legacy systems
● Structural complexity
  ○ improve comprehension of the (typically complex) legacy systems
● Scalability
  ○ efficiently analyze large-scale systems
● Adaptability/portability
  ○ foster the development of generic and reusable MDRE components, e.g. by intermediate metamodels, reusable transformations, parametrization

MoDisco principle 1
Bridge artefacts to the MDE technical space as soon as possible with a lossless step
Model Discoverers
● Quickly get initial "raw" models of the artifacts
● Metamodels close to the abstract syntax of the artifacts
● No information loss
Examples:
● Java, JSP, XML (Open-Source)
● Cobol (Proprietary)

MoDisco principle 2
Abstractions as a network of reusable query/transformation modules organized in three levels
[Figure: layered architecture; surviving labels: "Scenario-specific", "Technology and scenario-independent"]

MoDisco principle 3
Rely as much as possible on standard metamodels, and foster the implementation of new standards
MoDisco provides reference implementations of the standards of OMG's Architecture-Driven Modernization (ADM) task force:
● The Knowledge Discovery Metamodel (KDM)
● The Software Measurement Metamodel (SMM)
● The Generic Abstract Syntax Tree Metamodel (GASTM)

Dynamic analysis of legacy systems
● The current MoDisco performs only static analysis
  ○ the system is never executed
● Observing executions of the system can provide useful information for reverse-engineering its semantics
● The fundamental building blocks are execution traces
  ○ sequences containing relevant information about particular executions over time
● We envision an evolution of MoDisco able to
  ○ Discover execution trace artefacts if they already exist
  ○ Otherwise produce them by executing the system

The vision
Enabling a new generation of cross-language, native MDE tools for:
● Test coverage: statistics on statements that are executed by the test set
● Change impact analysis: which executions/tests are impacted by a source change
● Semantic model differencing: comparing the behavior of the system before and after the change
● Omniscient debugging: debugging that enables free traversal of the reached states, including going backward in the execution
● Runtime verification: checking whether or not an execution trace satisfies a temporal property
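To make the runtime-verification item concrete, here is a minimal sketch that checks one simple temporal property over a recorded trace: every `open` event must eventually be followed by a `close` event. The class `ResponseProperty` and the event names are hypothetical and chosen only for illustration; a trace is assumed to be a plain list of event names rather than a model.

```java
import java.util.List;

/** Hypothetical checker for the property "every open is eventually followed by a close". */
final class ResponseProperty {
    private ResponseProperty() {}

    /** Returns true if no "open" event is left without a later "close" in the trace. */
    static boolean holds(List<String> trace) {
        boolean pending = false; // true while some "open" still waits for a "close"
        for (String event : trace) {
            if (event.equals("open")) {
                pending = true;
            } else if (event.equals("close")) {
                pending = false; // this close answers every earlier open
            }
        }
        return !pending;
    }
}
```

On a finite trace, the property fails exactly when an `open` occurs after the last `close`, which is what the single boolean tracks.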

Execution Traces
Outside of MDE: popular formats, e.g. Open Trace Format 2 (OTF2)
Within MDE: proposals for execution traces of executable models:
● Clone-based execution trace metamodel [Langer et al.]
● FilmStrip models [Gogolla et al.]
Outside of MDE: popular tools, e.g. for analyzing parallel software:
● e.g. Vampir, TAU, Scalasca
Within MDE: no completely model- and transformation-based tools
Scalability problem: in this paper we start experimenting on a subcase, tracking executed statements for test impact analysis

Following the MoDisco principles
Principle 1: Bridge execution traces to the MDE technical space as soon as possible with a lossless step
Principle 2: Execution Trace Discovery as a plugin in the Technologies layer, enabling the construction of plugins in the Use Cases layer on top of it (e.g. deriving just the call graph)
Principle 3: Execution Traces as instances of standard metamodels, referring to the standard metamodels of the system structure

Approach
[Figure: Code TS (source code, instrumented code) and Modeling TS (MoDisco standard models, execution traces), connected by Reverse Engineering and Instrumentation]

Approach

    package main;

    public class Factorial {
        public int fact(int n) {
            if (n == 1) {
                return 1;
            }
            return n * fact(n - 1);
        }
    }

    package main;

    import static org.junit.Assert.*;
    import org.junit.Test;

    public class TestFactorial {
        @Test
        public void checkFact() {
            Factorial f = new Factorial();
            assertEquals(1, f.fact(1));
        }
    }

Traceability in MoDisco
Java Model elements are connected to ASTNodeSourceRegions that are traces to the original position in the source text:
● startLine
● startPosition
● endLine
● endPosition
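As a rough illustration of how these four attributes can be used, the sketch below models a region as a plain Java class and answers containment queries on character offsets. `SourceRegion` is a hypothetical stand-in, not the MoDisco metamodel class, and treating `startPosition` as inclusive and `endPosition` as exclusive is an assumption made for the example.

```java
/** Hypothetical stand-in for a source region; not the MoDisco metamodel class. */
final class SourceRegion {
    final int startLine;
    final int startPosition; // character offset, assumed inclusive
    final int endLine;
    final int endPosition;   // character offset, assumed exclusive

    SourceRegion(int startLine, int startPosition, int endLine, int endPosition) {
        if (startLine > endLine || startPosition > endPosition) {
            throw new IllegalArgumentException("region must not end before it starts");
        }
        this.startLine = startLine;
        this.startPosition = startPosition;
        this.endLine = endLine;
        this.endPosition = endPosition;
    }

    /** True if the given character offset lies inside this region. */
    boolean containsOffset(int offset) {
        return startPosition <= offset && offset < endPosition;
    }

    /** True if the other region is nested inside this one (e.g. a return inside an if). */
    boolean encloses(SourceRegion other) {
        return startPosition <= other.startPosition && other.endPosition <= endPosition;
    }
}
```

With the offsets used later in the slides, a region spanning 74 to 103 encloses one spanning 92 to 101; the line numbers passed to the constructor in such a check would be illustrative only, since the slides give offsets but not lines.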

Static Model Discovery

    package main;

    public class Factorial {
        public int fact(int n) {
            if (n == 1) {
                return 1;
            }
            return n * fact(n - 1);
        }
    }

Model elements, discovered step by step:
● :Model (name: Model), :Package (name: main)
● :Class (name: Factorial), :Method (name: fact)
● :Statement (startPos: 74, endPos: 88)
● :Statement (startPos: 74, endPos: 103), :Statement (startPos: 92, endPos: 101), :Statement (startPos: 110, endPos: 133)

Static Model Discovery

    package main;

    import static org.junit.Assert.*;
    import org.junit.Test;

    public class TestFactorial {
        @Test
        public void checkFact() {
            Factorial f = new Factorial();
            assertEquals(1, f.fact(1));
        }
    }

Model elements:
● :Model (name: Model), :Package (name: main)
● :Class (name: Factorial), :Method (name: fact)
● :Class (name: TestFactorial), :Method (name: testFact)
● :Statement (startPos: 74, endPos: 103)
● :Statement (startPos: 92, endPos: 101)
● :Statement (startPos: 110, endPos: 133)

Static Model Discovery
During the static discovery, MoDisco needs to decide which are:
● The entry points of executions
  ○ e.g. main, tests
● The part of the system under observation
  ○ e.g. no execution trace in standard libraries
Model elements:
● :Model (name: Model), :Package (name: main)
● Target: :Class (name: Factorial), :Method (name: fact)
● Tests: :Class (name: TestFactorial), :Method (name: testFact)
● :Statement (startPos: 74, endPos: 103)
● :Statement (startPos: 92, endPos: 101)
● :Statement (startPos: 110, endPos: 133)

Code Instrumentation in Java
Two main options:
● Bytecode instrumentation
  ○ e.g. by the ASM library
  ○ commonly used by tools like JaCoCo
  ○ problem: standard Java bytecode only records line numbers, so it is not possible to refer precisely to MoDisco model elements
  ○ we could extend the Java compiler, but this raises portability issues
● Source-code instrumentation
  ○ e.g. by Spoon
  ○ parses the AST to instrument the code
  ○ problem: it impacts the build process
  ○ needs sources (but static MoDisco does too)

Source Code Instrumentation: Example

    package main;

    public class Factorial {
        public int fact(int n) {
            if (n == 1) {
                return 1;
            }
            return n * fact(n - 1);
        }
    }

Source Code Instrumentation: Example

    package main;

    public class Factorial {
        public int fact(int n) {
            match("main.Factorial", "fact", 74, 103);
            if (n == 1) {
                match("main.Factorial", "fact", 92, 101);
                return 1;
            }
            match("main.Factorial", "fact", 110, 133);
            return n * fact(n - 1);
        }
    }
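The probe insertion shown in this example can be approximated with plain string manipulation. The sketch below is not how Spoon works (Spoon rewrites the AST); it only illustrates the idea of placing a `match(...)` call in front of each statement, given the statement offsets. `ProbeInserter` and `Region` are hypothetical names. Offsets are processed from last to first so that an insertion never shifts the offsets still to be handled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical, text-based probe inserter; a real tool would transform the AST instead. */
final class ProbeInserter {
    /** Start and end character offsets of a statement in the original source. */
    static final class Region {
        final int start;
        final int end;

        Region(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    private ProbeInserter() {}

    /** Inserts a match(...) call before every statement, using the original offsets as arguments. */
    static String instrument(String source, String className, String methodName,
                             List<Region> statements) {
        List<Region> sorted = new ArrayList<>(statements);
        // Descending start offsets: later insertions never move earlier positions.
        sorted.sort(Comparator.comparingInt((Region r) -> r.start).reversed());
        StringBuilder out = new StringBuilder(source);
        for (Region r : sorted) {
            String probe = "match(\"" + className + "\", \"" + methodName + "\", "
                    + r.start + ", " + r.end + "); ";
            out.insert(r.start, probe);
        }
        return out.toString();
    }
}
```

A purely textual approach like this breaks on constructs such as an `if` without braces, where inserting a statement changes the control flow; that is one reason to instrument at the AST level.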

Execution

    package main;

    public class Factorial {
        public int fact(int n) {
            match("main.Factorial", "fact", 74, 103);
            if (n == 1) {
                match("main.Factorial", "fact", 92, 101);
                return 1;
            }
            match("main.Factorial", "fact", 110, 133);
            return n * fact(n - 1);
        }
    }

    package main;

    import static org.junit.Assert.*;
    import org.junit.Test;

    public class TestFactorial {
        @Test
        public void checkFact() {
            setMethod("main.TestFactorial", "testFactorial");
            Factorial f = new Factorial();
            assertEquals(1, f.fact(1));
        }
    }
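The slides do not show what `match` and `setMethod` do at run time. One plausible reading, sketched below under that assumption, is that `setMethod` remembers which test is currently running and `match` appends a trace step linking the executed statement to that test. The `Tracer` class, its in-memory list, and the `InstrumentedFactorial` wrapper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical runtime support for the probes; names and behavior are assumptions. */
final class Tracer {
    /** One trace step: a statement region executed while a given test was running. */
    static final class Step {
        final String test;
        final String className;
        final String methodName;
        final int startPos;
        final int endPos;

        Step(String test, String className, String methodName, int startPos, int endPos) {
            this.test = test;
            this.className = className;
            this.methodName = methodName;
            this.startPos = startPos;
            this.endPos = endPos;
        }
    }

    private static final List<Step> STEPS = new ArrayList<>();
    private static String currentTest = "<no test>";

    private Tracer() {}

    /** Called at the start of a test: records which test the following steps belong to. */
    static synchronized void setMethod(String testClass, String testMethod) {
        currentTest = testClass + "#" + testMethod;
    }

    /** Called by each probe: appends one step for the statement at the given offsets. */
    static synchronized void match(String className, String methodName, int startPos, int endPos) {
        STEPS.add(new Step(currentTest, className, methodName, startPos, endPos));
    }

    /** Returns a copy of the steps recorded so far, in execution order. */
    static synchronized List<Step> steps() {
        return new ArrayList<>(STEPS);
    }

    static synchronized void reset() {
        STEPS.clear();
        currentTest = "<no test>";
    }
}

/** The instrumented Factorial from the slide, with probes routed to the sketch above. */
final class InstrumentedFactorial {
    int fact(int n) {
        Tracer.match("main.Factorial", "fact", 74, 103);
        if (n == 1) {
            Tracer.match("main.Factorial", "fact", 92, 101);
            return 1;
        }
        Tracer.match("main.Factorial", "fact", 110, 133);
        return n * fact(n - 1);
    }
}
```

In a model-driven setting each step would presumably update the model itself rather than an in-memory list; the list keeps the sketch self-contained.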

Execution

    package main;

    public class Factorial {
        public int fact(int n) {
            match("main.Factorial", "fact", 74, 103);
            if (n == 1) {
                match("main.Factorial", "fact", 92, 101);
                return 1;
            }
            match("main.Factorial", "fact", 110, 133);
            return n * fact(n - 1);
        }
    }

Model before execution:
● :Model (name: Model), :Package (name: main)
● :Target (name: Factorial), :Method (name: fact)
● :Test (name: TestFactorial), :Method (name: testFact)
● :Statement (startPos: 74, endPos: 103)
● :Statement (startPos: 92, endPos: 101)
● :Statement (startPos: 110, endPos: 133)
Model updates while the test runs:
● match("main.Factorial", "fact", 74, 103) adds an executedBy link to :Statement (startPos: 74, endPos: 103)
● match("main.Factorial", "fact", 92, 101) adds an executedBy link to :Statement (startPos: 92, endPos: 101)
● :Statement (startPos: 110, endPos: 133) gets no executedBy link
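Once statements carry executedBy links, test impact analysis reduces to a query: given a changed source range, collect the tests linked to any statement that overlaps it. The sketch below assumes a single source file and represents each statement as offsets plus a set of test names; `ImpactAnalysis` and `ExecutedStatement` are hypothetical names, not part of MoDisco.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical test impact query over statements annotated with executedBy links. */
final class ImpactAnalysis {
    /** A statement region together with the names of the tests that executed it. */
    static final class ExecutedStatement {
        final int startPos;
        final int endPos;
        final Set<String> executedBy;

        ExecutedStatement(int startPos, int endPos, Set<String> executedBy) {
            this.startPos = startPos;
            this.endPos = endPos;
            this.executedBy = new TreeSet<>(executedBy);
        }
    }

    private ImpactAnalysis() {}

    /** Returns the tests that executed at least one statement overlapping [changeStart, changeEnd). */
    static Set<String> impactedTests(List<ExecutedStatement> statements,
                                     int changeStart, int changeEnd) {
        Set<String> impacted = new TreeSet<>();
        for (ExecutedStatement s : statements) {
            boolean overlaps = s.startPos < changeEnd && changeStart < s.endPos;
            if (overlaps) {
                impacted.addAll(s.executedBy);
            }
        }
        return impacted;
    }
}
```

With the model above, a change inside the range 92 to 101 selects the test, while a change inside 110 to 133 selects nothing, because that statement was never executed by the test.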

Experimentation: Instrumentation time
[Figure: instrumentation time chart]

Experimentation: JUnit Tests on JUnit
The source code of JUnit 4 has a large unit-test suite:
● 66 packages
● 817 test classes
● 293 classes in the SUT
● 3412 methods
● 6177 statements
● 12627 steps in test execution traces

Experimentation: Execution time
[Figure: execution time chart]

Experimentation: NeoEMF for long traces
[Figure: NeoEMF results for long traces]

Conclusion
● We believe that MDE tools are mature enough to start natively addressing all artifacts of dynamic analysis
● Some scalability solutions are still needed
  ○ in this paper, for very long traces on NeoEMF
  ○ in the future, for very long FilmStrip models
● Stay tuned for further news!
Thanks