Parameterized Unit Testing: Theory and Practice Tao Xie University of Illinois at Urbana-Champaign http://taoxie.cs.illinois.edu/courses/testing/ Work described in the slides was done in collaboration with the Pex team (Nikolai Tillmann, Peli de Halleux, Pratap Lakshman, et al.) @Microsoft Research, students @Illinois ASE, and other collaborators

Faults, Errors & Failures □ Fault: A static defect in the software (i.e., defect, bug) □ Infected State: An incorrect internal state that is the manifestation of some fault (often also referred to as error) □ Software Failure: External, incorrect behavior with respect to the requirements or other description of the expected behavior © Ammann & Offutt

Mistake, Fault, Error, Failure Programmer makes a mistake. Fault (defect, bug) appears in the program. Fault remains undetected during testing (running test inputs). The program fails (based on test oracles) during execution, i.e., it behaves unexpectedly. Error: difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition. What does Bug mean in “Bug Report”?

What is fault, error, failure? □ Double the balance, then add 10. Where is the test input? Where is the test oracle?
    int calAmount() {
        int ret = balance * 3;
        ret = ret + 10;
        return ret;
    }
    void testCalAmount() {
        Account a = new Account();
        a.setBalance(0);
        int amount = a.calAmount();
        assertTrue(amount == 10);
    }

What is fault, error, failure? □ Double the balance, then add 10. Where is the test input? Where is the test oracle?
    int calAmount() {
        int ret = balance * 3;
        ret = ret + 10;
        return ret;
    }
    void testCalAmount() {
        Account a = new Account();
        a.setBalance(1);
        int amount = a.calAmount();
        assertTrue(amount == 12);
    }
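The two tests above differ only in their input, yet only one of them exposes the fault. The following is a minimal Java sketch of that contrast. The class name `FaultyAccount` and the helpers `expectedAmount` and `revealsFailure` are hypothetical names introduced for illustration; the `* 3` fault is copied from the slide on purpose.

```java
// Hypothetical Account stand-in; the "* 3" fault from the slide is kept deliberately.
class FaultyAccount {
    private int balance;

    void setBalance(int balance) { this.balance = balance; }

    // Intended behavior: double the balance, then add 10.
    int calAmount() {
        int ret = balance * 3; // fault: the specification asks for balance * 2
        ret = ret + 10;
        return ret;
    }

    // Test oracle derived from the specification (an assumption of this sketch).
    static int expectedAmount(int balance) { return balance * 2 + 10; }

    // True when the given test input turns the fault into an observable failure.
    static boolean revealsFailure(int testInput) {
        FaultyAccount a = new FaultyAccount();
        a.setBalance(testInput);
        return a.calAmount() != expectedAmount(testInput);
    }

    public static void main(String[] args) {
        System.out.println("balance 0 reveals failure: " + revealsFailure(0));
        System.out.println("balance 1 reveals failure: " + revealsFailure(1));
    }
}
```

With balance 0 the faulty statement executes but still computes the correct value, so the test passes; with balance 1 the incorrect state reaches the output and the test fails.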

What is fault, error, failure? □ Should not allow withdrawal when there is a balance of 100 or less
    boolean doWithdraw(int amount) {
        if (Balance < 100) return false;
        else return wthDraw(amount);
    }
    void testWithDraw() {
        Account a = new Account();
        a.setBalance(100);
        boolean success = a.doWithdraw(10);
        assertTrue(success == false);
    }
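The guard above uses `<` where the requirement says "100 or less", so only one balance value separates the code from the specification. A small Java sketch of that boundary follows; `WithdrawBoundary` and its methods are hypothetical, and the sketch assumes the underlying `wthDraw` call would succeed whenever the guard lets it through.

```java
// Hypothetical reduction of the slide's doWithdraw to its guard condition.
class WithdrawBoundary {
    // Specification: refuse withdrawal when the balance is 100 or less.
    static boolean allowedBySpec(int balance) { return balance > 100; }

    // Guard as written on the slide: refuses only when balance < 100.
    static boolean allowedByCode(int balance) { return !(balance < 100); }

    // True when this balance makes the code disagree with the specification.
    static boolean revealsFailure(int balance) {
        return allowedByCode(balance) != allowedBySpec(balance);
    }

    public static void main(String[] args) {
        for (int balance : new int[] {99, 100, 101}) {
            System.out.println(balance + " reveals failure: " + revealsFailure(balance));
        }
    }
}
```

Only the boundary value 100 reveals the failure, which is why the slide's test sets the balance to exactly 100.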

Who should test? □Developer? Separate “quality assurance” group? □Programmer? User? Someone with a degree in “testing”?

Types of Test Activities □ Testing can be broken up into four general types of activities: 1. Test Design 2. Test Automation 3. Test Execution 4. Test Evaluation □ Each type of activity requires different skills, background knowledge, education and training © Ammann & Offutt

Test Design: Design test values to satisfy coverage criteria or other engineering goal □ This is the most technical job in software testing □ Requires knowledge of: ◊ Discrete math ◊ Programming ◊ Testing □ Requires much of a traditional CS degree □ This is intellectually stimulating, rewarding, and challenging □ Test design is analogous to software architecture on the development side □ Using people who are not qualified to design tests is a sure way to get ineffective tests © Ammann & Offutt

Test Automation Embed test values into executable scripts □ This is slightly less technical □ Requires knowledge of programming ◊ Fairly straightforward programming – small pieces and simple algorithms □ Requires very little theory □ Very boring for test designers □ Programming is out of reach for many domain experts □ Who is responsible for determining and embedding the expected outputs ? ◊ Test designers may not always know the expected outputs ◊ Test evaluators need to get involved early to help with this © Ammann & Offutt

Test Execution Run tests on the software and record the results □This is easy – and trivial if the tests are well automated □Requires basic computer skills ◊ Interns ◊ Employees with no technical background □Asking qualified test designers to execute tests is a sure way to convince them to look for a development job □If, for example, GUI tests are not well automated, this requires a lot of manual labor ◊ Test executors have to be very careful and meticulous with bookkeeping © Ammann & Offutt

Test Evaluation Evaluate results of testing, report to developers □ This is much harder than it may seem □ Requires knowledge of : ◊ Domain ◊ Testing □ Usually requires almost no traditional CS ◊ A background in the domain of the software is essential ◊ An empirical background is very helpful (biology, psychology, …) ◊ A logic background is very helpful (law, philosophy, math, …) □ This is intellectually stimulating, rewarding, and challenging ◊ But not to typical CS majors – they want to solve problems and build things © Ammann & Offutt

Types of Test Activities – Summary 1. Test design: Design test values to satisfy coverage criteria or other engineering goal. Requires technical knowledge of discrete math, programming and testing 2. Test automation: Embed test values into executable scripts. Requires knowledge of scripting 3. Test execution: Run tests on the software and record the results. Requires very little knowledge 4. Test evaluation: Evaluate results of testing, report to developers. Requires domain knowledge □ These four general test activities are quite different □ It is a poor use of resources to use people inappropriately □ But most test organizations use the same people for ALL FOUR activities!! © Ammann & Offutt

Testing Model – Black Box Testing □ You know the functionality ◊ Given that you know what it is supposed to do, you design tests that make it do what you think that it should do ◊ From the outside, you are testing its functionality against the specs ◊ For software, this is testing the interface ○ What is input to the system? ○ What can you do from the outside to change the system? (controllability) ○ What is output from the system? (observability) ◊ Impossible to thoroughly exercise all inputs ○ Exhaustive testing grows without bound ◊ Tests the functionality of the system by observing its external behavior ◊ No knowledge of how it goes about meeting the goals ©L. Williams

Testing Model – White Box Testing □ You know the code ◊ Given knowledge of the internal workings, you thoroughly test what is happening on the inside ◊ Close examination of procedural level of detail ◊ Logical paths through code are tested ○ Conditionals ○ Loops ○ Branches ◊ Status is examined in terms of expected values ◊ Impossible to thoroughly exercise all paths ○ Exhaustive testing grows without bound ◊ Can be practical if a limited number of “important” paths are evaluated ◊ Can be practical to examine and test important data structures ©L. Williams

Group Exercise Ø A program needs to be developed so that, given an integer value, Ø it outputs 0 when the integer value is 0 Ø it outputs 1 when the integer value > 0 Ø it outputs -1 when the integer value < 0 Ø What would be your black box tests? Ø How would you generate your white box tests? Ø Would black box tests alone be good enough to find bugs/faults in the program? Why? Ø Would white box tests alone be good enough to find bugs/faults in the program? Why?

Black-box vs. White-box □White-box - look at code to write test ◊ Tests are based on code ◊ Better for finding crashes, out of bounds failures, file not closed failures ◊ Better at finding faults of extra logic □Black-box - don’t look at code to write test ◊ Tests are based on specifications ◊ Better at telling if program meets spec ◊ Better at finding faults of omission

Types of Testing Ø Unit Testing (white) ◊ testing of individual hardware or software units or groups of related units ◊ Done by programmer(s) ◊ Generally all white box ◊ Automation desirable for repeatability Ø Integration Testing (black and white) ◊ testing in which software components, hardware components, or both are combined and tested to evaluate the interaction between them ◊ Done by programmer as they integrate their code into code base ◊ Generally white box, maybe some black box ◊ Automation desirable for repeatability ©L. Williams

Types of Testing -II Ø Functional/System Testing (black) ◊ testing conducted on a complete, integrated system to evaluate the system compliance with its specified requirements ◊ stress testing, performance testing, usability testing ◊ it is recommended that this be done by external test group ◊ mostly black box so that testing is not ‘corrupted’ by too much knowledge ◊ test automation desirable Ø Acceptance Testing (black) ◊ formal testing conducted to determine whether or not a system satisfies its acceptance criteria (the criteria the system must satisfy to be accepted by a customer) and to enable the customer to determine whether or not to accept the system ◊ Generally done by customer/customer representative in their environment through the GUI. . . Definitely black box ©L. Williams

Types of Testing -III Ø Regression Testing (black and white) ◊ Regression testing is selective retesting of a system or component to verify that modifications have not caused unintended effects and that the system or component still complies with its specified requirements ◊ Smoke test: a group of test cases that establish that the system is stable and all major functionality is present and works under “normal” conditions Ø Beta Testing (black) ◊ Potential users or beta testers (one to many) install the software, use it as they wish, and report any revealed errors to the development organization. □ A/B Testing ◊ https://en.wikipedia.org/wiki/A/B_testing ©L. Williams

Techniques for writing tests □Black-box (from specifications) ◊ Equivalence partitioning ◊ Boundary value analysis □White-box (from code) ◊ Branch coverage □Fault-based testing (from common errors) ◊ http://www.exampler.com/testingcom/writings.html from Brian Marick

Planning a Black Box Test Case ©L. Williams

Important Consideration for Black Box Test Planning □ Look at requirements/problem statement to generate test cases. □ Test cases need to be traceable to a requirement. □ You must write the repeatable test case so anyone on the team can run the exact test case and get the exact same result/sequence of events. ◊ The inputs must be very specific. ○ Example: “Students who receive a grade of 70 or higher pass the exam.” ○ Correct test cases: Grade = 80; Grade = 20 ○ Incorrect test cases: “input a passing grade”, “input a failing grade” ◊ The expected results must be very specific: “Pass”, “Fail” ©L. Williams

Equivalence Class Partitioning □ Divide your input conditions into groups (classes). ◊ Input in the same class should behave similarly in the program. □ Be sure to test a mid-range value from each class. □ Example: for tests of “Go to Jail” the most important thing is whether the player has enough money to pay the $50 fine ◊ Test input values clearly in the two partitions: 25 and 75. ©L. Williams

Equivalence partitioning □ Example: sorting □ sort(array, len) takes an array of integers of length len and sorts them in ascending order, i.e., permutes them so that each element of the array is less than or equal to the succeeding one. □ len = 0, 1, 2, 17 □ Array is already sorted, has duplicates, has negative numbers
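The partitions listed for the sorting example can be turned into one representative input per class. The Java sketch below does this under stated assumptions: `SortPartitions` and its methods are hypothetical names, `java.util.Arrays.sort` stands in for the slide's `sort(array, len)`, and the oracle checks only the ordering and the length, not that the output is a permutation of the input.

```java
import java.util.Arrays;

// Hypothetical sketch: one representative test input per equivalence class.
class SortPartitions {
    // Oracle from the specification: each element is <= the succeeding one.
    static boolean isAscending(int[] a) {
        for (int i = 0; i + 1 < a.length; i++) {
            if (a[i] > a[i + 1]) return false;
        }
        return true;
    }

    // Lengths 0, 1, 2, 17, then already sorted, duplicates, negative numbers.
    static int[][] representatives() {
        int[] seventeen = new int[17];
        for (int i = 0; i < seventeen.length; i++) seventeen[i] = 17 - i; // descending
        return new int[][] {
            {}, {5}, {2, 1}, seventeen, {1, 2, 3}, {3, 1, 3, 1}, {-4, 7, -9, 0}
        };
    }

    // Runs the stand-in sort on every representative and applies the oracle.
    static boolean allSortedCorrectly() {
        for (int[] input : representatives()) {
            int[] copy = input.clone();
            Arrays.sort(copy); // stand-in for sort(array, len)
            if (!isAscending(copy) || copy.length != input.length) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all partitions pass: " + allSortedCorrectly());
    }
}
```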

Equivalence class test ideas □Any object: the null pointer □Strings: the empty string □Collections: ◊ The empty collection ◊ Contains exactly one element ◊ Contains the maximum number of elements (or at least more than one)

Equivalence class test ideas □Linked structures (trees, queues, etc.) ◊ Empty ◊ Minimal but non-empty ◊ Circular ◊ Depth greater than one (or maximally deep) □Equality comparison of objects ◊ Equal but not identical ◊ Different at lowest level, the same at upper

Boundary Value Analysis □ Focus on boundaries... because a greater number of faults tend to occur at the boundaries of the input domain ◊ Range input, a to b: test with a, b, a-1, a+1, b-1, b+1 if integer range; otherwise, slightly less than a and slightly more than b. ◊ If you can only have a certain quantity (q) of something, try to create q-1, q, q+1 ©L. Williams
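For an integer range, the boundary rule above is mechanical enough to write down. This is a minimal Java sketch; `BoundaryValues.forRange` is a hypothetical helper, and it assumes `a` and `b` are far enough from the limits of `int` that `a - 1` and `b + 1` do not overflow.

```java
import java.util.List;

// Hypothetical helper that lists boundary test inputs for an integer range [a, b].
class BoundaryValues {
    // Returns a-1, a, a+1, b-1, b, b+1 (assumes no int overflow at the edges).
    static List<Integer> forRange(int a, int b) {
        return List.of(a - 1, a, a + 1, b - 1, b, b + 1);
    }

    public static void main(String[] args) {
        System.out.println(forRange(1, 10));
    }
}
```

For the range 1 to 10 this yields 0, 1, 2, 9, 10, 11: two invalid inputs just outside the range and four valid inputs on or next to its edges.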

Decision Table Testing ©L. Williams

Monopoly Decision Table □ If a Player (A) lands on property owned by another player (B), A must pay rent to B. If A does not have enough money to pay B, A is out of the game. ©L. Williams
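The rent rule above combines two conditions (who owns the property, whether A can pay) into distinct outcomes, and each combination is one column of a decision table and one test case. A hedged Java sketch follows: `RentRule`, `Outcome`, and `decide` are hypothetical names, the `NO_RENT` outcome for property not owned by another player is an assumption the slide does not state, and "enough money" is read as money greater than or equal to the rent.

```java
// Hypothetical encoding of the rent rule as a decision function.
class RentRule {
    enum Outcome { NO_RENT, PAY_RENT, OUT_OF_GAME }

    // ownedByOther: A landed on property owned by another player B.
    // money: A's cash; rent: amount owed to B (assumed: paying needs money >= rent).
    static Outcome decide(boolean ownedByOther, int money, int rent) {
        if (!ownedByOther) return Outcome.NO_RENT; // assumption: rule does not apply
        return money >= rent ? Outcome.PAY_RENT : Outcome.OUT_OF_GAME;
    }

    public static void main(String[] args) {
        // One test per decision-table column.
        System.out.println(decide(false, 10, 40));
        System.out.println(decide(true, 100, 40));
        System.out.println(decide(true, 10, 40));
    }
}
```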

Dirty/Failure Test Cases □ Can something cause division by zero? □ What if the input type is wrong (You’re expecting an integer, they input a float. You’re expecting a character, you get an integer. )? □ What if the customer takes an illogical path through your functionality? □ What if mandatory fields are not entered? □ What if the program is aborted abruptly or input or output devices are unplugged? Robustness Testing ©L. Williams

Coverage □Make sure tests cover each part of program ◊ Every statement ◊ Every branch ◊ Every condition ◊ Every pass through a loop ◊ Every path(?) □Measures the quality of tests □How much of the program do the tests cover?

Coverage tools □Tool “instruments” the program □You run your tests, it builds database □Tool looks at database to see which parts of the program were executed, and reports test coverage □Some Java open source tools: EclEmma, Quilt, NoUnit, InsECT, Jester, jcoverage, Coverlipse, Hansel…

White-box tests □Purpose: exercise all the code □Large number - take a long time to write □Good for finding run-time errors ◊ Null object, array-bounds error □In practice, coverage is better for evaluating tests than for creating them

White Box Testing - Review □ You know the code ◊ Given knowledge of the internal workings, you thoroughly test what is happening on the inside ◊ Close examination of procedural level of detail ◊ Logical paths through code are tested ○ Conditionals ○ Loops ○ Branches ◊ Status is examined in terms of expected values ◊ Impossible to thoroughly exercise all paths ○ Exhaustive testing grows without bound ◊ Can be practical if a limited number of “important” paths are evaluated ◊ Can be practical to examine and test important data structures ©L. Williams

Devising a prudent set of test cases □Equivalence Class/Boundary Value Analysis ◊ Still applies! □A metric for assessing how good your test suite is ◊ Method Coverage ◊ Statement Coverage ◊ Decision/Branch Coverage ◊ Condition Coverage □Think diabolically ©L. Williams

Recall: Mistake, Fault, Error, Failure Programmer makes a mistake. Fault (defect, bug) appears in the program. Fault remains undetected during testing (running test inputs). The program fails (based on test oracles) during execution, i.e., it behaves unexpectedly. Error: difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition. What does Bug mean in “Bug Report”?

Fault & Failure Model Three conditions necessary for a failure to be observed 1. Execution/Reachability: The location or locations in the program that contain the fault must be reached 2. Infection: The state of the program must be incorrect 3. Propagation: The infected state must propagate to cause some output of the program to be incorrect □ PIE model (Propagation, Infection, Execution) © Ammann & Offutt
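The three conditions can be observed separately on a tiny faulty program. The Java sketch below is illustrative only: `PieDemo` and all of its methods are hypothetical, and the program (triple instead of double, then compare with 10) was chosen so that each condition fails for a different input.

```java
// Hypothetical program with one fault, used to separate the three PIE conditions.
class PieDemo {
    // The faulty statement is only reached for non-negative inputs.
    static boolean reaches(int x) { return x >= 0; }

    static int faultyState(int x) { return x * 3; }   // fault: should be x * 2
    static int correctState(int x) { return x * 2; }

    // The output only reports whether the internal state exceeds 10.
    static int output(int state) { return state > 10 ? 1 : 0; }

    static int program(int x) { return reaches(x) ? output(faultyState(x)) : 0; }
    static int spec(int x) { return reaches(x) ? output(correctState(x)) : 0; }

    // Names the first PIE condition that the input fails to satisfy, or "failure".
    static String classify(int x) {
        if (!reaches(x)) return "not reached";
        if (faultyState(x) == correctState(x)) return "reached, no infection";
        if (program(x) == spec(x)) return "infected, no propagation";
        return "failure";
    }

    public static void main(String[] args) {
        for (int x : new int[] {-1, 0, 1, 4}) {
            System.out.println(x + ": " + classify(x));
        }
    }
}
```

Inputs -1, 0, and 1 all pass even though 0 and 1 execute the faulty statement; only an input such as 4 satisfies execution, infection, and propagation together. Covering the faulty statement is therefore necessary for detecting the fault but not sufficient.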

Conversation 1: Tester and Test Manager □Test Manager: Looks like the code under test is not achieving high statement coverage. Please work hard to achieve high statement coverage. □Tester: Hmm… boss, our goal is to detect faults. I don’t think I need to spend more effort to achieve high statement coverage. □Test Manager: Well, according to the PIE model, … [You fill in here]

Conversation 2: Tester and Test Manager □Tester: Boss, following your command, I worked very hard and I have already achieved 100% statement coverage! I would like to take a vacation in Hawaii. Could you approve? □Test Manager: Well, according to the PIE model, … [You fill in here]

Testing & Debugging □Testing: Finding inputs that cause the software to fail □Debugging: The process of finding a fault given a failure

Test Case □Test Case Values/Test Input/Test Data: The values that directly satisfy one test requirement □Expected Results: The result that will be produced when executing the test if the program satisfies its intended behavior ◊ Related Term: Test Oracles □Tests can mean different things in different contexts © Ammann & Offutt

Observability and Controllability □ Software Observability: How easy it is to observe the behavior of a program in terms of its outputs, effects on the environment and other hardware and software components ◊ Software that affects hardware devices, databases, or remote files has low observability □ Software Controllability: How easy it is to provide a program with the needed inputs, in terms of values, operations, and behaviors ◊ Easy to control software with inputs from keyboards ◊ Inputs from hardware sensors or distributed software are harder to control ◊ Data abstraction reduces controllability and observability © Ammann & Offutt

What is a Unit Test? A unit test is a small program with assertions.
    [TestMethod]
    public void Add() {
        HashSet<int> set = new HashSet<int>();
        set.Add(3);
        set.Add(14);
        Assert.AreEqual(2, set.Count);   // expected value first, then actual
    }
Many developers write such unit tests by hand. This involves □ determining a meaningful sequence of method calls, □ selecting exemplary argument values (the test inputs), □ stating assertions.

Unit Testing: Benefits □Design and specification ◊ by example □Code coverage and regression testing ◊ confidence in correctness ◊ preserving behavior □Short feedback loop ◊ unit tests exercise little code ◊ failures are easy to debug □Documentation

Unit Testing: Measuring Quality □ Coverage: Are all parts of the program exercised? ◊ statements ◊ basic blocks ◊ explicit/implicit branches ◊… □ Assertions: Does the program do the right thing? ◊ test oracle Experience: □ Just high coverage or large number of assertions is no good quality indicator. □ Only both together are!

Advantages of tests as specs □Concrete, easy to understand □Don’t need new language □Easy to see if program meets the spec □Making tests forces you to talk to customer and learn the problem □Making tests forces you to think about design of system (classes, methods, etc.)

Disadvantages of tests as specs □Too specific □Hard to test that something can’t happen ◊ Can’t withdraw more money than you have in the system ◊ Can’t break into the system ◊ Can’t cause a very long transaction that hangs the system □Tends to be verbose

Tests as specifications □Tests show how to use the system □Tests need to be readable ◊ Need comments that describe their purpose or need good names ◊ Keep short, delete duplicate or redundant

Parameterized Unit Test □ A parameterized unit test is a small program that takes some inputs and states assumptions and assertions. □ JUnit: @Theory (multiple parameters), @Parameters (single parameter)
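The definition above has three parts: inputs, assumptions, and assertions. The following plain-Java sketch shows all three without depending on any test framework. Everything in it is hypothetical: `AddPutSketch`, `assume`, and `AssumptionViolated` are stand-ins for what a framework such as JUnit theories or Pex would provide, and the sample inputs are hand-picked rather than generated.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical framework-free sketch of a parameterized unit test (PUT).
class AddPutSketch {
    // Signals that an input does not satisfy the PUT's assumptions.
    static class AssumptionViolated extends RuntimeException { }

    static void assume(boolean condition) {
        if (!condition) throw new AssumptionViolated();
    }

    // The PUT: for any non-null set and any element, the element is a member
    // after add, and the size grows by at most one.
    static void addThenContains(Set<Integer> set, int element) {
        assume(set != null);
        int sizeBefore = set.size();
        set.add(element);
        if (!set.contains(element)) throw new AssertionError("element missing after add");
        int growth = set.size() - sizeBefore;
        if (growth < 0 || growth > 1) throw new AssertionError("unexpected size change");
    }

    // Runs the PUT on hand-picked inputs; returns how many passed the assumptions.
    static int runOnSampleInputs() {
        List<Set<Integer>> sets = Arrays.<Set<Integer>>asList(
            null, new HashSet<Integer>(), new HashSet<Integer>(Arrays.asList(3, 14)));
        int checked = 0;
        for (Set<Integer> set : sets) {
            for (int element : new int[] {3, 7}) {
                try {
                    addThenContains(set, element);
                    checked++;
                } catch (AssumptionViolated skipped) {
                    // Input filtered out by the assumption; this is not a failure.
                }
            }
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println("inputs checked: " + runOnSampleInputs());
    }
}
```

The PUT itself never mentions concrete values; choosing them is a separate step that can be done by hand, as here, or by a test generation tool.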

Parameterized Unit Testing Parameterized Unit Tests □ serve as specifications □ can be leveraged by (automatic) test input generators □ fit in development environment, evolve with the code

Test Generation Process
xUnit attributes: [TestClass], [TestMethod]. Pex attributes: [PexClass], [PexMethod].
Hand-written parameterized unit test (partial class): • User writes parameterized tests • Lives inside a test class
    // FooTest.cs
    [TestClass, PexClass]
    partial class FooTest {
        [PexMethod]
        void Test(Foo foo) {…}
    }
Generated by Pex: • Generated unit tests • Pex not required for re-execution • xUnit unit tests
    // FooTest.cs
    partial class FooTest {
        [TestMethod] void Test_1() { this.Test(new Foo(1)); }
        [TestMethod] void Test_2() { this.Test(new Foo(2)); }
        …
    }
http://msdn.microsoft.com/en-us/library/wa80x488(VS.80).aspx

PUTs separate concerns PUTs separate two concerns: (1) The specification of external behavior (i.e., assertions) (2) The selection of internal test inputs (i.e., coverage) In many cases, a test generation tool (e.g., Pex) can construct a small test suite with high coverage!

PUTs are algebraic specs A PUT can be read as a universally quantified, conditional axiom. ∀ name, data: name ≠ null ⋀ data ≠ null ⇒ equals(ReadResource(name, WriteResource(name, data)), data)
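Read as code, the axiom above becomes a test whose parameters are the quantified variables and whose assumption is the premise of the implication. The Java sketch below assumes string-valued names and data and a trivial in-memory store; `ResourceAxiom`, `write`, and `read` are hypothetical stand-ins for the slide's `WriteResource` and `ReadResource`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory resource store used to illustrate the read-after-write axiom.
class ResourceAxiom {
    private final Map<String, String> store = new HashMap<>();

    // Stand-in for WriteResource(name, data); returns the updated store.
    ResourceAxiom write(String name, String data) {
        store.put(name, data);
        return this;
    }

    // Stand-in for ReadResource(name, store).
    String read(String name) { return store.get(name); }

    // The axiom as a PUT: returns true when the axiom holds for this input.
    static boolean readAfterWrite(String name, String data) {
        if (name == null || data == null) return true; // premise false: holds vacuously
        String readBack = new ResourceAxiom().write(name, data).read(name);
        return data.equals(readBack);
    }

    public static void main(String[] args) {
        System.out.println(readAfterWrite("config", "v1"));
        System.out.println(readAfterWrite(null, "v1"));
    }
}
```

Inputs that violate the premise are treated here as trivially satisfying the axiom; a tool such as Pex would instead discard them through an assumption.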

Pex4Fun – Turning Pex Online http://pex4fun.com/default.aspx?language=CSharp&sample=_Template 1,750,069 clicked 'Ask Pex!'

Behind the Scene of Pex4Fun Dynamic Symbolic Execution (DSE), aka Concolic Testing [Godefroid et al. 05][Sen et al. 05][Tillmann et al. 08] Instrument code to explore feasible paths http://research.microsoft.com/pex/

http://research.microsoft.com/pex/

Walkthrough: Unit Testing in VS □ Create Project □ Create Test Class □ Create Tests ◊ passing, failing, expected to fail □ Run Tests □ View Coverage Note: Other unit test frameworks exist for .NET, e.g., NUnit. Use [PexMethod(TestEmissionFilter=PexTestEmissionFilter.All)] to force generation/displaying of all explored test data

Dynamic Symbolic Execution in Pex
    void CoverMe(int[] a) {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }
Loop: choose next path → solve constraints → execute & monitor
    Constraints to solve                          Input    Observed constraints
    (none)                                        null     a==null
    a!=null                                       {}       a!=null && !(a.Length>0)
    a!=null && a.Length>0                         {0}      a!=null && a.Length>0 && a[0]!=1234567890
    a!=null && a.Length>0 && a[0]==1234567890     {123…}   a!=null && a.Length>0 && a[0]==1234567890
Done: There is no path left.
http://pex4fun.com/HowDoesPexWork
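The exploration above ends with one input per feasible path. The Java port below replays those four inputs and reports which path each one takes. It is only a sketch of the end result: `CoverMeDemo`, `pathCondition`, and `throwsBug` are hypothetical names, the path conditions are written out by hand, and no constraint solving takes place.

```java
// Hypothetical Java port of CoverMe with hand-written path conditions.
class CoverMeDemo {
    static void coverMe(int[] a) {
        if (a == null) return;
        if (a.length > 0)
            if (a[0] == 1234567890)
                throw new IllegalStateException("bug");
    }

    // Mirrors the branches of coverMe and names the path the input follows.
    static String pathCondition(int[] a) {
        if (a == null) return "a==null";
        if (!(a.length > 0)) return "a!=null && !(a.length>0)";
        if (a[0] != 1234567890) return "a!=null && a.length>0 && a[0]!=1234567890";
        return "a!=null && a.length>0 && a[0]==1234567890";
    }

    // True when the input drives execution into the throwing branch.
    static boolean throwsBug(int[] a) {
        try {
            coverMe(a);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[][] inputs = { null, {}, {0}, {1234567890} };
        for (int[] input : inputs) {
            System.out.println(pathCondition(input) + " -> bug: " + throwsBug(input));
        }
    }
}
```

Random testing would be very unlikely to guess 1234567890, whereas negating the last observed constraint and solving for `a[0]` produces it directly.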

Pex is Part of Visual Studio 2015 Enterprise Edition! □ As the new feature “IntelliTest” https://www.visualstudio.com/news/vs2015-vs#Testing

Domain Matrix for Testing Complex Condition

Guide Pex to Generate Test Data
    //[PexMethod(TestEmissionFilter = PexTestEmissionFilter.All)]
    [PexMethod]
    public void TestBoundaryValuesAndInputPartition(int x, int y) {
        // boundary values/partitions for x > 0 && x <= 10 && y >= 1
        // (the empty branches exist only to give Pex separate paths to cover)
        PexAssume.IsTrue(x > 0);
        if (x == 1) { } else if (x > 0) { }
        PexAssume.IsTrue(x <= 10);
        if (x == 10) { } else if (x <= 10) { }
        PexAssume.IsTrue(y >= 1);
        if (y == 1) { } else if (y > 1) { }
    }
For details, see http://taoxie.cs.illinois.edu/publications/icsm10-coverage.pdf

Interface for IntSet
    class IntSet {
        public IntSet() {…}
        public void insert(int e) { … }
        public bool member(int e) { … }
        public void remove(int e) { … }
    }
sort IntSet imports Int, Bool
signatures
    new : -> IntSet
    insert : IntSet × Int -> IntSet
    member : IntSet × Int -> Bool
    remove : IntSet × Int -> IntSet
http://www.cs.unc.edu/~stotts/723/adt.html

(Buggy) Implementation for IntSet
    class IntSet {
        public IntSet() {…}
        public void insert(int e) { … }
        public bool member(int e) { … }
        public void remove(int e) { … }
    }
See the Set.cs that can be downloaded from http://taoxie.cs.illinois.edu/courses/testing/Set.cs Let’s copy it to http://pex4fun.com/default.aspx?language=CSharp&sample=_Template and click “Ask Pex”

Parameterized Unit Tests Supported by Pex/Pex4Fun
    using System;
    using Microsoft.Pex.Framework;
    using Microsoft.Pex.Framework.Settings;
    [PexClass]
    public class Set {
        [PexMethod]
        public static void testMemberAfterInsertNotEqual(Set s, int i, int j) {
            PexAssume.IsTrue(s != null);
            PexAssume.IsTrue(i != j);
            bool existOld = s.member(i);
            s.insert(j);
            bool exist = s.member(i);
            PexAssert.IsTrue(existOld == exist);
        }
        ….
    }
Pex4Fun supports only one PexMethod at a time; you can write multiple PexMethods, but comment out all “[PexMethod]” lines except one.

Axioms for Int. Set variables i, j : Int; s : Int. Set Axioms: member(new(), i) = false member(insert(s, j), i) = if i = j then true else member(s, i) Is this complete? How do we know? http: //www. cs. unc. edu/~stotts/723/adt. html 69

Guidelines for Completeness □Classify methods: ◊ constructors: return Int. Set ◊ inspectors: take Int. Set as argument, returning some other value. □Identify key constructors, capable of constructing all possible object states ◊ e. g. , insert, new. □Identify others as auxiliary, ◊ e. g. , remove is a destructive constructor □Completeness requires (at least): ◊ every inspector/auxiliary constructor is defined by one equation for each key constructor. 70

Add More Axioms
□ remove(new(), i) = new()
□ remove(insert(s, j), i) = if i = j then remove(s, i) else insert(remove(s, i), j)

Are we done yet? The completeness criterion (an equation defining member and remove for each of the new and insert constructors) is satisfied.
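The four axioms can be read as executable checks. The sketch below is one way to do that in plain C#, without Pex. The list-backed IntSet, its Clone helper, and the IntSetAxioms class are hypothetical stand-ins for illustration, not the contents of Set.cs. Because the slides expose no equality on sets, the two remove axioms are observed through member on a probe value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Set.cs: a small list-backed integer set.
public class IntSet
{
    private readonly List<int> elems = new List<int>();

    public void insert(int e) { if (!elems.Contains(e)) elems.Add(e); }
    public bool member(int e) { return elems.Contains(e); }
    public void remove(int e) { elems.Remove(e); }

    // Helper (an assumption, not in the slides): copy the set so a check
    // can build both sides of an equation from the same starting state.
    public IntSet Clone()
    {
        var copy = new IntSet();
        foreach (int e in elems) copy.insert(e);
        return copy;
    }
}

public static class IntSetAxioms
{
    // member(new(), i) = false
    public static bool MemberOfNew(int i)
    {
        return new IntSet().member(i) == false;
    }

    // member(insert(s, j), i) = if i = j then true else member(s, i)
    public static bool MemberAfterInsert(IntSet s, int i, int j)
    {
        bool expected = (i == j) ? true : s.member(i);
        IntSet t = s.Clone();
        t.insert(j);
        return t.member(i) == expected;
    }

    // remove(new(), i) = new(), observed through member(probe)
    public static bool RemoveFromNew(int i, int probe)
    {
        var s = new IntSet();
        s.remove(i);
        return s.member(probe) == new IntSet().member(probe);
    }

    // remove(insert(s, j), i) = if i = j then remove(s, i)
    //                           else insert(remove(s, i), j),
    // observed through member(probe)
    public static bool RemoveAfterInsert(IntSet s, int i, int j, int probe)
    {
        IntSet lhs = s.Clone();
        lhs.insert(j);
        lhs.remove(i);

        IntSet rhs = s.Clone();
        rhs.remove(i);
        if (i != j) rhs.insert(j);

        return lhs.member(probe) == rhs.member(probe);
    }
}
```

Each method takes the axiom's variables as parameters, so it has the same shape as a PUT: a tool such as Pex would choose the arguments, while a hand-written test would pass a few concrete values.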

Guidelines for Completeness
□ But does this really specify sets? Do the following properties hold?
□ Order of insertion is irrelevant.
  ◊ insert(insert(s, i), j) = insert(insert(s, j), i)
□ Multiple insertion is irrelevant.
  ◊ insert(insert(s, i), i) = insert(s, i)
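The two set properties can also be checked directly against a concrete set type. The sketch below uses the .NET HashSet<int> as a model set, which is a swapped-in assumption and not the Set class from the slides; SetProperties and InsertAll are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SetProperties
{
    // Build a fresh set from a starting state plus a sequence of insertions.
    private static HashSet<int> InsertAll(IEnumerable<int> start, params int[] inserts)
    {
        var s = new HashSet<int>(start);
        foreach (int e in inserts) s.Add(e);
        return s;
    }

    // insert(insert(s, i), j) = insert(insert(s, j), i)
    public static bool OrderIrrelevant(int[] start, int i, int j)
    {
        return InsertAll(start, i, j).SetEquals(InsertAll(start, j, i));
    }

    // insert(insert(s, i), i) = insert(s, i)
    public static bool MultipleInsertIrrelevant(int[] start, int i)
    {
        return InsertAll(start, i, i).SetEquals(InsertAll(start, i));
    }
}
```

For example, SetProperties.OrderIrrelevant(new[] { 1, 2 }, 5, 9) compares {1, 2, 5, 9} built in both insertion orders. A list-based implementation that kept duplicates or compared element order would fail one of the two checks.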

Interface (Implementation) for UIntStack

    class UIntStack {
        public UIntStack() {…}
        public void Push(int k) { … }
        public void Pop() { … }
        public int Top() { … }
        public bool IsEmpty() { … }
        public int MaxSize() { … }
        public bool IsMember(int k) { … }
        public bool Equals(UIntStack s) { … }
        public int GetNumberOfElements() { … }
        public bool IsFull() { … }
    }

See the UIntStack.cs that can be downloaded from http://taoxie.cs.illinois.edu/courses/testing/UIntStack.cs

Group Exercise: Write Parameterized Unit Tests (PUTs)

    class UIntStack {
        public UIntStack() {…}
        public void Push(int k) { … }
        public void Pop() { … }
        public int Top() { … }
        public bool IsEmpty() { … }
        public int MaxSize() { … }
        public bool IsMember(int k) { … }
        public bool Equals(UIntStack s) { … }
        public int GetNumberOfElements() { … }
        public bool IsFull() { … }
    }

Let’s copy it to http://pex4fun.com/default.aspx?language=CSharp&sample=_Template and click “Ask Pex”.
Reminder: you have to comment out any earlier written “[PexMethod]” before you try Pex on your current PUT (Pex4Fun can handle only one PUT at a time).
See the UIntStack.cs that can be downloaded from http://taoxie.cs.illinois.edu/courses/testing/UIntStack.cs

Recall: Parameterized Unit Tests Supported by Pex/Pex4Fun

    using System;
    using Microsoft.Pex.Framework.Settings;

    [PexClass]
    public class Set {
        [PexMethod]
        public static void testMemberAfterInsertNotEqual(Set s, int i, int j) {
            PexAssume.IsTrue(s != null);
            PexAssume.IsTrue(i != j);
            bool existOld = s.member(i);
            s.insert(j);
            bool exist = s.member(i);
            PexAssert.IsTrue(existOld == exist);
        }
        ….
    }

Force Pex/Pex4Fun to Display All Explored Test Inputs/Paths

    using System;
    using Microsoft.Pex.Framework.Settings;

    [PexClass]
    public class Set {
        [PexMethod(TestEmissionFilter=PexTestEmissionFilter.All)]
        public static void testMemberAfterInsertNotEqual(Set s, int i, int j) {
            PexAssume.IsTrue(s != null);
            PexAssume.IsTrue(i != j);
            bool exist = s.member(i);
            s.insert(j);
            PexAssert.IsTrue(exist);
        }
        ….
    }

Factory Method: Help Pex Generate Desirable Object States

In class, we showed the factory method below, which Pex synthesizes automatically after a user clicks the “1 Object Creation” issue and then clicks “Accept/Edit Factory Method”. But it is not good enough to generate various types of object states.

    [PexFactoryMethod(typeof(UIntStack))]
    public static UIntStack Create(int k_i) {
        UIntStack uIntStack = new UIntStack();
        uIntStack.Push(k_i);
        return uIntStack;
        // TODO: Edit factory method of UIntStack
        // This method should be able to configure the object in all possible ways.
        // Add as many parameters as needed,
        // and assign their values to each field by using the API.
    }

Factory Method: Help Pex Generate Desirable Object States

Below is a manually edited/created good factory method to guide Pex to generate various types of object states. Note that Pex also generates argument values for the factory method.

    [PexFactoryMethod(typeof(UIntStack))]
    public static UIntStack CreateVariedSizeAnyElemsStack(int[] elems) {
        PexAssume.IsNotNull(elems);
        UIntStack s = new UIntStack();
        PexAssume.IsTrue(elems.Length <= (s.MaxSize() + 1));
        for (int i = 0; i < elems.Length; i++)
            s.Push(elems[i]);
        return s;
    }

One Sample PUT

Below is a sample PUT for Push. Note that Pex also generates argument values for the factory method that creates s.

    [PexMethod]
    public void TestPush([PexAssumeUnderTest]UIntStack s, int i) {
        //UIntStack s = new UIntStack();
        PexAssume.IsTrue(!s.IsMember(i));
        int oldCount = s.GetNumberOfElements();
        s.Push(i);
        PexAssert.IsTrue(s.Top() == i);
        PexAssert.IsTrue(s.GetNumberOfElements() == oldCount+1);
        PexAssert.IsFalse(s.IsEmpty());
    }

Pex4Fun Not Supporting Factory Method - Workaround

If you try PUTs on Pex4Fun, which doesn’t support factory methods, you can “embed” the factory method in the PUT, as in the marked code portion below. The loop variable is named k so that it does not clash with the PUT parameter i.

    [PexMethod]
    public void TestPush(int[] elems, int i) {
        // begin embedded factory method
        PexAssume.IsNotNull(elems);
        UIntStack s = new UIntStack();
        PexAssume.IsTrue(elems.Length <= (s.MaxSize() + 1));
        for (int k = 0; k < elems.Length; k++)
            s.Push(elems[k]);
        // end embedded factory method
        //UIntStack s = new UIntStack();
        PexAssume.IsTrue(!s.IsMember(i));
        int oldCount = s.GetNumberOfElements();
        s.Push(i);
        PexAssert.IsTrue(s.Top() == i);
        PexAssert.IsTrue(s.GetNumberOfElements() == oldCount+1);
        PexAssert.IsFalse(s.IsEmpty());
    }
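To try the same PUT without Pex, the assumptions can be turned into early returns and the assertions into exceptions. The sketch below does this under stated assumptions: the UIntStack here is a hypothetical bounded stack that throws when full (UIntStack.cs may behave differently), its capacity of 10 is invented, and the size assumption is stricter than the one on the slide so that the final Push always has room.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for UIntStack.cs: a bounded stack of ints.
public class UIntStack
{
    private const int Capacity = 10;   // assumed bound, not taken from UIntStack.cs
    private readonly List<int> items = new List<int>();

    public void Push(int k)
    {
        if (IsFull()) throw new InvalidOperationException("stack is full");
        items.Add(k);
    }
    public int Top() { return items[items.Count - 1]; }
    public bool IsEmpty() { return items.Count == 0; }
    public int MaxSize() { return Capacity; }
    public bool IsMember(int k) { return items.Contains(k); }
    public int GetNumberOfElements() { return items.Count; }
    public bool IsFull() { return items.Count >= Capacity; }
}

public static class PushTests
{
    // Plain-C# version of the PUT. Returns false when an assumption rejects
    // the input, throws when an assertion fails, and returns true otherwise.
    public static bool TestPush(int[] elems, int i)
    {
        if (elems == null) return false;                 // assume
        var s = new UIntStack();
        if (elems.Length >= s.MaxSize()) return false;   // assume: room for one more Push
        foreach (int e in elems) s.Push(e);              // embedded factory method
        if (s.IsMember(i)) return false;                 // assume

        int oldCount = s.GetNumberOfElements();
        s.Push(i);                                       // act

        Check(s.Top() == i, "Top() should return the pushed value");
        Check(s.GetNumberOfElements() == oldCount + 1, "count should grow by one");
        Check(!s.IsEmpty(), "stack should not be empty after Push");
        return true;
    }

    private static void Check(bool condition, string message)
    {
        if (!condition) throw new Exception("Assertion failed: " + message);
    }
}
```

A conventional unit test is then one call with fixed arguments, for example PushTests.TestPush(new[] { 1, 2, 3 }, 4). What Pex adds is the search for argument values that cover the paths through Push.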

Guideline of Writing PUT
• Setup: basic set up for invoking the method under test
• Checkpoint: Run Pex to make sure that you don't miss any Pex assumptions (preconditions) for the PUT
• Assert: add assertions for asserting behavior of the method under test, involving
  • Adding Pex assertions
  • Adding Pex assumptions for helping assert
  • Adding method sequences for helping assert

Setup
• Select your method under test m
• Put its method call in your PUT
• Create a parameter for your PUT as the class under test c (annotate it with [PexAssumeUnderTest])
• Create other parameters for your PUT for parameters of m if any
• Add Pex assumptions for preconditions for all these parameters of PUT if any

Setup - Example

    [PexMethod]
    public void TestPush([PexAssumeUnderTest]UIntStack s, int i) {
        s.Push(i);
    }

You may write your factory method to help Pex in test generation.
If you get exceptions thrown:
• If indicating program faults, fix them
• If indicating lack of PUT assumptions, add PUT assumptions
• If indicating insufficient factory method assumptions or inappropriate scenarios, add PUT assumptions or improve factory method.

Assert
• Think about how you can assert the behavior
• Do you need to invoke other (observer) helper methods in your assertions (besides asserting return values)?
• Do you need to add assumptions so that your assertions can be valid?
• Do you need to add some method sequence before the method under test to set up desirable state and cache values to be used in the assertions?

Targets for Asserting
• Return value of the method under test (MUT)
• Argument object of MUT
• Receiver object properties being modified by MUT (if public fields, directly assertable)
• How to assert them?
  • Think about the intended behavior!
  • If you couldn't do so easily, follow the guidelines discussed next

Cached Public Property Value
• A property value before invoking MUT may need to be cached and later used.

Pattern 2.1/2.2: Assume, Arrange, Act, Assert

    [PexMethod]
    void AssumeActAssert(ArrayList list, object item) {
        PexAssume.IsNotNull(list);              // assume
        var count = list.Count;                 // arrange
        list.Add(item);                         // act
        Assert.IsTrue(list.Count == count + 1); // assert
    }

Argument of MUT
• Argument value of MUT may be used

Pattern 2.3: Constructor Test

    [PexMethod]
    void Constructor(int capacity) {
        var list = new ArrayList(capacity);         // create
        AssertInvariant(list);                      // assert invariant
        Assert.AreEqual(capacity, list.Capacity);   // assert
    }

Receiver or Argument of Earlier Method
• Receiver or argument value of a method invoked before the MUT may be used

Pattern 2.4/2.5: Roundtrip

    [PexMethod]
    void ToStringParseRoundtrip(int value) {
        // two-way roundtrip
        string s = value.ToString();
        int parsed = int.Parse(s);
        // assert
        Assert.AreEqual(value, parsed);
    }
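The roundtrip pattern can be exercised by hand on boundary values before handing it to a test generator. This is a minimal sketch: RoundtripCheck and its sample values are illustrative choices, and the invariant culture is used so the result does not depend on the machine's locale.

```csharp
using System;
using System.Globalization;

public static class RoundtripCheck
{
    // True when parsing the printed form of value gives value back.
    public static bool ToStringParseRoundtrip(int value)
    {
        string s = value.ToString(CultureInfo.InvariantCulture);
        int parsed = int.Parse(s, CultureInfo.InvariantCulture);
        return parsed == value;
    }

    // Run the check on a few hand-picked boundary values.
    public static bool HoldsOnBoundaries()
    {
        int[] samples = { 0, 1, -1, int.MaxValue, int.MinValue };
        foreach (int v in samples)
        {
            if (!ToStringParseRoundtrip(v)) return false;
        }
        return true;
    }
}
```

The hand-picked list is exactly the part a parameterized test leaves to the tool: the PUT states the property once and the inputs are generated.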

Observer Methods
• Invoking observer methods on the modified object state

Pattern 2.6: State Relation

    [PexMethod]
    void InsertContains(string value) {
        var list = new List<string>();
        list.Add(value);
        Assert.IsTrue(list.Contains(value));
    }

Each modified object property should be read by at least one observer method.

Observer Methods cont.
• Forcing observer methods to return specific values (e.g., true or false) can force you to add specific assumptions or scenarios

    [PexMethod]
    void PushIsFull([PexAssumeUnderTest]UIntStack s, int value) {
        PexAssume.IsTrue(s.GetNumberOfElements() == (s.MaxSize()-1));
        s.Push(value);
        Assert.IsTrue(s.IsFull());
    }

Alternative Computation
• Invoking another method/method sequence to produce a value to be used

Pattern 2.7: Commutative Diagram

    [PexMethod]
    void CommutativeDiagram1(int x, int y) {
        // compute result in one way
        string z1 = Multiply(x, y).ToString();
        // compute result in another way
        string z2 = Multiply(x.ToString(), y.ToString());
        // assert equality if we get here
        PexAssert.AreEqual(z1, z2);
    }
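The slide does not show how Multiply is defined, so the sketch below invents two overloads to make the pattern concrete: an int version that wraps on overflow and a string version that multiplies in 64-bit arithmetic. Both definitions, and the Calculator and CommutativeDiagram names, are assumptions chosen so that the two paths can disagree.

```csharp
using System;
using System.Globalization;

public static class Calculator
{
    // Hypothetical: 32-bit multiplication that wraps around on overflow.
    public static int Multiply(int x, int y)
    {
        return unchecked(x * y);
    }

    // Hypothetical: parses both operands and multiplies them as 64-bit values.
    public static string Multiply(string x, string y)
    {
        long a = long.Parse(x, CultureInfo.InvariantCulture);
        long b = long.Parse(y, CultureInfo.InvariantCulture);
        return (a * b).ToString(CultureInfo.InvariantCulture);
    }
}

public static class CommutativeDiagram
{
    // True when both computation paths produce the same string.
    public static bool Holds(int x, int y)
    {
        string z1 = Calculator.Multiply(x, y).ToString(CultureInfo.InvariantCulture);
        string z2 = Calculator.Multiply(
            x.ToString(CultureInfo.InvariantCulture),
            y.ToString(CultureInfo.InvariantCulture));
        return z1 == z2;
    }
}
```

With these definitions the diagram commutes for small operands but not when the 32-bit product overflows, which is the kind of disagreement the pattern is meant to expose.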

Divide and Conquer
• Split possible outcomes into cases (each with pre and post condition)

Pattern 2.8: Cases

    [PexMethod]
    void BusinessRules(int age, Job job) {
        var salary = SalaryManager.ComputeSalary(age, job);
        PexAssert
            .Case(age < 30)
                .Implies(() => salary < 10000)
            .Case(job == Job.Manager && age > 35)
                .Implies(() => salary > 10000)
            .Case(job == Job.Manager && age < 20)
                .Implies(() => false);
    }
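Each case is a precondition paired with a postcondition, that is, a logical implication. The sketch below spells that out without PexAssert. The Job enum, the salary figures, and the rule that managers under 20 are rejected with an exception are all invented for illustration; the slide does not define SalaryManager. Unlike the lambdas on the slide, the postconditions here are evaluated eagerly.

```csharp
using System;

public enum Job { Engineer, Manager }

public static class SalaryManager
{
    // Hypothetical rule set, invented for this sketch.
    public static int ComputeSalary(int age, Job job)
    {
        if (job == Job.Manager && age < 20)
            throw new ArgumentException("managers must be at least 20");
        if (age < 30) return 8000;
        return job == Job.Manager ? 15000 : 9000;
    }
}

public static class BusinessRules
{
    // "pre implies post" is false only when pre holds and post does not.
    private static bool Implies(bool pre, bool post)
    {
        return !pre || post;
    }

    // True when every case whose precondition holds also meets its postcondition.
    public static bool Check(int age, Job job)
    {
        int salary = SalaryManager.ComputeSalary(age, job);
        return Implies(age < 30, salary < 10000)
            && Implies(job == Job.Manager && age > 35, salary > 10000)
            && Implies(job == Job.Manager && age < 20, false);
    }
}
```

The last case has the postcondition false, so it can only be satisfied if execution never reaches it with that precondition. In this sketch ComputeSalary throws first, so Check never returns for a manager younger than 20.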

Class Invariant Checker
• If class invariant checker (repOk) exists or you would be willing to write one, use it to assert

Pattern 2.3: Constructor Test

    [PexMethod]
    void Constructor(int capacity) {
        var list = new ArrayList(capacity);         // create
        AssertInvariant(list);                      // assert invariant
        Assert.AreEqual(capacity, list.Capacity);   // assert
    }
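For a class you own, a repOk method can state the invariant once and every PUT can assert it. The sketch below uses a hypothetical SortedIntSet whose invariant is that its elements are stored in strictly increasing order; the class and the InvariantTest helper are illustrative, not taken from the course files.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class whose representation invariant is:
// elems is strictly increasing (sorted, no duplicates).
public class SortedIntSet
{
    private readonly List<int> elems = new List<int>();

    public void Insert(int e)
    {
        int pos = elems.BinarySearch(e);
        if (pos < 0) elems.Insert(~pos, e);   // absent: insert at its sorted position
    }

    public int Count { get { return elems.Count; } }

    // Class invariant checker.
    public bool RepOk()
    {
        for (int k = 1; k < elems.Count; k++)
        {
            if (elems[k - 1] >= elems[k]) return false;
        }
        return true;
    }
}

public static class InvariantTest
{
    // PUT-style check: the invariant must hold after construction
    // and after every insertion.
    public static bool InvariantHoldsAfterInserts(int[] values)
    {
        var s = new SortedIntSet();
        if (!s.RepOk()) return false;
        foreach (int v in values)
        {
            s.Insert(v);
            if (!s.RepOk()) return false;
        }
        return true;
    }
}
```

Checking repOk after each step, rather than only at the end, points at the first operation that breaks the representation.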

Other Patterns
• Pattern 2.9: Allowed exceptions
  • [PexAllowedException(typeof(ArgumentNullException))]
  • [ExpectedException(typeof(ArgumentNullException))]
• Pattern 2.10: Reachability
  • [PexExpectedGoals] + throw new PexGoalException();
• Pattern 2.11: Parameterized Stub
  • No scenarios or assertions
• Pattern 2.12: Input Output Test
  • void Add(int a, int b, out int result) { … }
  • int Substract(int a, int b) { … }
• Pattern 2.13/2.14: Regression Tests
  • bool Parse(string input) { … }
  • PexStore.ValueForValidation("result", result);

http://research.microsoft.com/en-us/projects/pex/patterns.pdf

Test-Driven Development (TDD)
□ Basic Idea:
  ◊ Write tests before code
  ◊ Refine code with new tests
□ In more detail, TDD is a cycle of steps:
  ◊ Add a test,
  ◊ Run it and watch it fail,
  ◊ Change the code as little as possible such that the test should pass,
  ◊ Run the test again and see it succeed,
  ◊ Refactor the code if needed.

Note: TDD and specifications
□ TDD encourages writing specifications before code
  ◊ Exemplary specification
□ Later, we will generalize TDD to Parameterized TDD
  ◊ Axiomatic specifications

Parameterized Test-Driven Development
[Figure: development cycle. Boxes: "Write/refine Contract as PUT", "Write/refine Code of Implementation", "Run Pex", "Fix-it (with Pex), Debug with generated tests", "Use Generated Tests for Regression". Edge labels: "failures", "no failures", "Bug in PUT", "Bug in Code".]

Coding Duels
1,750,069 clicked 'Ask Pex!'

Coding Duels
Pex computes “semantic diff” in cloud: secret reference implementation vs. code written in browser.
You win when Pex finds no differences.
For more info, see our ICSE 2013 SEE paper: http://taoxie.cs.illinois.edu/publications/icse13see-pex4fun.pdf

Behind the Scene of Pex for Fun

behavior: Secret Impl == Player Impl

Secret Implementation

    class Secret {
        public static int Puzzle(int x) {
            if (x <= 0) return 1;
            return x * Puzzle(x-1);
        }
    }

Player Implementation

    class Player {
        public static int Puzzle(int x) {
            return x;
        }
    }

    class Test {
        public static void Driver(int x) {
            if (Secret.Puzzle(x) != Player.Puzzle(x))
                throw new Exception("Mismatch");
        }
    }
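The driver idea can be imitated on a small scale by enumerating inputs and reporting the first one on which the two implementations disagree. This is only a brute-force sketch: Pex explores program paths rather than scanning a range, and the Duel class and its FirstMismatch method are hypothetical names. Secret and Player are copied from the slide.

```csharp
using System;

public static class Secret
{
    public static int Puzzle(int x)
    {
        if (x <= 0) return 1;
        return x * Puzzle(x - 1);
    }
}

public static class Player
{
    public static int Puzzle(int x)
    {
        return x;
    }
}

public static class Duel
{
    // Returns the first input in [lo, hi] on which the two implementations
    // differ, or null when they agree on the whole range.
    // Intended for small ranges only; hi must be below int.MaxValue.
    public static int? FirstMismatch(int lo, int hi)
    {
        for (int x = lo; x <= hi; x++)
        {
            if (Secret.Puzzle(x) != Player.Puzzle(x)) return x;
        }
        return null;
    }
}
```

The player's guess agrees with the secret on 1 and 2, so a scan from 1 first reports 3 (the secret gives 6, the player gives 3), while a scan that includes non-positive inputs fails immediately because the secret returns 1 there.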

Code Hunt Programming Game
https://www.codehunt.com/

Code Hunt Programming Game

It’s a game!
• iterative gameplay
• adaptive
• personalized
• no cheating
• clear winning criterion
[Figure labels: secret, code, test cases]