ECE 453 CS 447 SE 465 Software Testing

ECE 453 – CS 447 – SE 465
Software Testing & Quality Assurance
Instructor: Kostas Kontogiannis

Overview
• Software Reverse Engineering
• Definitions
• Program Understanding
• Plan Recognition

Reverse Engineering
• The term “reverse engineering” derives from hardware development:
  – there, it is the process of discovering how a competitor’s system works;
  – in software engineering, it is the process of discovering how your own system works.
• Software systems become difficult to understand and maintain because their size and complexity grow continuously over time.

Reverse Engineering
• Reverse engineering is usually applied to large legacy systems:
  – to make them easier to understand and maintain
  – to increase the potential for continued evolution.
• Often, the most fundamental reason for reverse engineering is structural re-documentation:
  – the structure of the system is derived, with some of the design architecture recaptured.

Reverse Engineering Terminology
• Design recovery
  – is a subset of reverse engineering in which domain knowledge, external information, and deduction or fuzzy reasoning are added to the observations of the subject system to identify meaningful higher-level abstractions.
  – Design recovery recreates design abstractions from a combination of code, existing design documentation (if available), personal experience, and general knowledge about problem and application domains.
• Redocumentation
  – is the creation or revision of a semantically equivalent representation within the same relative abstraction level.
  – Redocumentation is the simplest and oldest form of reverse engineering, and can be considered an unintrusive, weak form of restructuring.

Reverse Engineering
• Forward and reverse engineering can be illustrated as:
  [Figure: diagram relating Specifications, Design, Code and Behavior, labeled Forward Engineering and Reverse Engineering]

Reverse Engineering
Two distinct phases:
1. Identify the system’s components and any dependencies among them.
2. A discovery phase, which tends to be highly interactive and may involve:
  • constructing the hierarchical subsystem components based on cohesion and coupling principles,
  • the reconstruction of design and requirements specifications providing a ‘domain model’, and
  • the matching of the model to the code.

Reverse Engineering
• Reverse engineering tends to be influenced heavily by the amount of domain knowledge available, which:
  – limits the degree of automation that is possible
  – limits the level of abstraction obtained.
• Uncovering entities allows them to be classified and their shared properties and relationship attributes to be determined.

Reverse Engineering
• The concept of aggregation can be applied to determine the part-of relationship between a composite and its constituents.
• Generalization and specialization allow an element to be related to a more general or more specific element.
• Grouping can be applied to form a context from a set of elements and their necessary relationships.

Reverse Engineering
• Various activities are performed during reverse engineering:
  – gathering (identifying) the software artifacts, usually obtained from:
    • specification/design documents, the code, and any related documentation,
    • application knowledge, and
    • syntactic pattern matching to identify program (functional) ‘units’.

Reverse Engineering
  – creating the repository of information:
    • filter out immaterial information while selecting relevant information.
  – constructing the abstraction layers at:
    • the structural,
    • functional and
    • application levels.
  – Semantic and behavioral matching may be needed during this process.

Reverse Engineering
• The software system can be reasoned about in many different views:
  – structural view: based on structure charts, call graphs (unit interaction), module and subsystem graphs, various metrics and organizational views [many can be constructed with CASE tools].
  – functional view: can usually be obtained from the design, specification and requirements documents.
  – behavioral views: conceptual, temporal, process, domain and user-interaction views.

Program Understanding
• A program is understood:
  – when it is possible to explain the program, its structure, its behavior, its effects in its operational context, and its relationships to its operational domain.
• For large legacy systems, the program understanding phase is rather difficult.

Program Understanding
• Human-oriented concepts are generally decoupled from the formal patterns of their algorithms:
  – they involve an arbitrary semantic mapping from operations on numbers and data to computational intentions based on domain concepts.
• Automatic program analysis is usually quite limited in knowledge acquisition and in the concept matching process.

Program Understanding
• Some general tool directions include:
  – Parsing and text analysis
  – Flow analysis (call graphs, control flow graphs, etc.)
  – Complexity analysis and anomaly detection (complexity measures, dataflow analysis, etc.)
  – Program segmentation (different slicing techniques to isolate behavior).
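
Example (illustrative sketch, not part of the original slides): flow analysis of the call-graph kind can be approximated by walking a parse tree. The snippet below uses Python's standard ast module on a made-up SOURCE string; the function names in it are hypothetical, and only direct calls to plain names inside top-level functions are recorded.

```python
import ast

# Hypothetical program to analyse; the names are assumptions for illustration.
SOURCE = '''
def read_input():
    return 42

def compute(x):
    return scale(x) + 1

def scale(x):
    return x * 2

def main():
    value = read_input()
    print(compute(value))
'''


def build_call_graph(source):
    """Map each top-level function name to the set of names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for inner in ast.walk(node):
                # Only calls of the form name(...) are recorded in this sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
            graph[node.name] = callees
    return graph


call_graph = build_call_graph(SOURCE)
```

Here call_graph["main"] contains read_input, compute and print. Method calls, indirect calls and nested definitions are deliberately ignored, so a real tool would need a more complete analysis.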

Program Segmentation
• Try to isolate areas of the implementation so that program understanding can be constrained to the segments that potentially implement the program behavior under consideration.
• Program slicing techniques can be applied at the source code level to isolate or highlight different behavioral properties of the program.

Program Segmentation
• condition-based slice: in many cases, programs are structured along conditional tests.
  – For example, we could find all program slices for which a condition such as phone_off-hook or phone_ringing is true.

Program Segmentation
• In an accounting program, the tax paid and the collection method may depend on the province and tax_payable.
• Thus it may be desirable to locate areas of the code that are reachable under the globally specified condition that province=Ontario and tax=payable.
• In this case the user specifies the logical expression and, optionally, a slicing range (where to start and end slicing).
• Then all reachable flow paths for which the logical expression is true are found for examination.
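
Example (illustrative sketch, not part of the original slides): one simple way to approximate a condition-based slice is reachability over a control flow graph whose edges carry guards. The graph CFG, its node names and the guard encoding below are all assumptions made for illustration; an edge is followed only if its guard does not contradict the user's condition.

```python
from collections import deque

# Hypothetical control flow graph: node -> list of (successor, guard).
# A guard maps variable names to the value required to take that edge.
CFG = {
    "entry": [("check_province", {})],
    "check_province": [
        ("ontario_rules", {"province": "Ontario"}),
        ("quebec_rules", {"province": "Quebec"}),
    ],
    "ontario_rules": [
        ("collect_tax", {"tax": "payable"}),
        ("issue_refund", {"tax": "refundable"}),
    ],
    "quebec_rules": [("collect_tax", {"tax": "payable"})],
    "collect_tax": [("exit", {})],
    "issue_refund": [("exit", {})],
    "exit": [],
}


def condition_slice(cfg, condition, start="entry"):
    """Return the nodes reachable from start along edges consistent with condition."""

    def consistent(guard):
        # Variables not mentioned in the condition are treated as unconstrained.
        return all(condition.get(var, val) == val for var, val in guard.items())

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor, guard in cfg[node]:
            if successor not in seen and consistent(guard):
                seen.add(successor)
                queue.append(successor)
    return seen


ontario_payable = condition_slice(CFG, {"province": "Ontario", "tax": "payable"})
```

With province=Ontario and tax=payable, the slice keeps ontario_rules and collect_tax but drops quebec_rules and issue_refund. The start parameter plays the role of the beginning of the slicing range; an end of range is not modelled here.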

Program Segmentation
• forward slice: many functions base their computations on the values of the input variables.
  – Given a variable and the slicing range, it is possible to determine all statements that can potentially be affected by that variable.
  – Similar to using dataflow techniques.
  – This process tends to be recursive in nature, since every variable on the left-hand side of an included statement is in turn used as a slicing variable.
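
Example (illustrative sketch, not part of the original slides): a forward slice over straight-line code can be computed by propagating a set of affected variables. The PROGRAM table below is hypothetical, and the sketch assumes each variable is assigned only once and ignores control dependences.

```python
# Hypothetical straight-line program: (line, variable defined, variables used).
PROGRAM = [
    (1, "rate", set()),               # rate = 0.13
    (2, "price", set()),              # price = read_price()
    (3, "tax", {"price", "rate"}),    # tax = price * rate
    (4, "label", set()),              # label = "receipt"
    (5, "total", {"price", "tax"}),   # total = price + tax
    (6, "msg", {"label"}),            # msg = label.upper()
]


def forward_slice(program, variable, start=1, end=None):
    """Return the lines in [start, end] that may be affected by variable."""
    affected = {variable}
    lines = []
    for line, defined, used in program:
        if line < start or (end is not None and line > end):
            continue
        if used & affected:
            lines.append(line)
            # The defined variable becomes a slicing variable in turn.
            affected.add(defined)
    return lines


rate_slice = forward_slice(PROGRAM, "rate")
```

Slicing on rate selects lines 3 and 5: tax depends on rate, and total depends on tax. Adding defined to the affected set is the recursive step described above.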

Program Segmentation
• backward slice: basically, the classical interpretation of a slice:
  – those statements that can affect the value of a variable (produce some result).
  – This process is also recursive in nature.
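
Example (illustrative sketch, not part of the original slides): for straight-line code, a backward slice can be computed by walking the statements in reverse and tracking which variables are still relevant. The PROGRAM table is hypothetical and control dependences are ignored.

```python
# Hypothetical straight-line program: (line, variable defined, variables used).
PROGRAM = [
    (1, "rate", set()),               # rate = 0.13
    (2, "price", set()),              # price = read_price()
    (3, "tax", {"price", "rate"}),    # tax = price * rate
    (4, "label", set()),              # label = "receipt"
    (5, "total", {"price", "tax"}),   # total = price + tax
    (6, "msg", {"label"}),            # msg = label.upper()
]


def backward_slice(program, variable):
    """Return the lines that can affect variable's value at the end of the program."""
    relevant = {variable}
    lines = []
    for line, defined, used in reversed(program):
        if defined in relevant:
            lines.append(line)
            # This definition satisfies the need for 'defined';
            # the variables it reads become relevant instead.
            relevant.discard(defined)
            relevant |= used
    return sorted(lines)


total_slice = backward_slice(PROGRAM, "total")
```

Slicing on total selects lines 1, 2, 3 and 5; lines 4 and 6 only concern label and msg and are left out.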

Program Segmentation
• event-based slices:
  – For a given input event to the system, obtain the program segment which can be executed based on the occurrence of the event.
  – A slice can be obtained for input or output events (for an output event, the program segment(s) that could potentially be executed to generate it).
  – This is often useful in object-oriented implementations, providing a list of the objects and methods involved.
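
Example (illustrative sketch, not part of the original slides): for an input event, one rough approximation of an event-based slice is the set of methods transitively reachable from the event's handlers. The HANDLERS table, the CALLS graph and every class and method name below are hypothetical.

```python
# Hypothetical event dispatch table: event -> handler methods.
HANDLERS = {
    "off_hook": ["Line.on_off_hook"],
    "ring": ["Ringer.start"],
}

# Hypothetical call graph between methods.
CALLS = {
    "Line.on_off_hook": ["Ringer.stop", "Tone.play_dial_tone"],
    "Ringer.start": ["Display.show_caller"],
    "Ringer.stop": [],
    "Tone.play_dial_tone": [],
    "Display.show_caller": [],
}


def event_slice(event):
    """Return, sorted, the methods that may execute when event occurs."""
    stack = list(HANDLERS.get(event, []))
    reached = set()
    while stack:
        method = stack.pop()
        if method in reached:
            continue
        reached.add(method)
        stack.extend(CALLS.get(method, []))
    return sorted(reached)


off_hook_methods = event_slice("off_hook")
```

The result lists the objects and methods involved in handling off_hook. A slice for an output event would walk the call graph in the opposite direction, from the method that emits the event back to its callers.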

Recognizing Plans
• We define a cliché as a frequently occurring pattern found in programs (e.g. an algorithm, some domain-specific pattern or data structure).
• We define a plan as a representation of a cliché
  – e.g. using flow graphs, source code templates, and sets of logical constraints.
  – Then an understanding problem may be to locate the clichés using plans.

Recognizing Plans
• The plans can be viewed as describing design elements using common implementation patterns.
  – Thus the program contains a design element when a portion of its code matches one of the implementation patterns.
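
Example (illustrative sketch, not part of the original slides): a plan can be encoded as a small set of structural constraints and matched against a parse tree. The plan below describes a hypothetical "accumulate in a loop" cliché (a variable initialised to a constant, then updated by an augmented assignment inside a for loop); the SOURCE string and all names are assumptions, and matching uses Python's standard ast module.

```python
import ast

# Hypothetical code to search for the cliché.
SOURCE = '''
def total_owed(invoices):
    total = 0
    for inv in invoices:
        total += inv
    return total

def first(items):
    return items[0]
'''


def match_accumulate_plan(func):
    """Return the accumulator name if func matches the plan, else None."""
    initialised = set()
    for stmt in func.body:
        # Constraint 1: a simple name is initialised to a constant.
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)
                and isinstance(stmt.value, ast.Constant)):
            initialised.add(stmt.targets[0].id)
        # Constraint 2: a later for loop updates that name in place.
        elif isinstance(stmt, ast.For):
            for inner in stmt.body:
                if (isinstance(inner, ast.AugAssign)
                        and isinstance(inner.target, ast.Name)
                        and inner.target.id in initialised):
                    return inner.target.id
    return None


def recognise(source):
    """Map each top-level function to its matched accumulator, or None."""
    tree = ast.parse(source)
    return {node.name: match_accumulate_plan(node)
            for node in tree.body if isinstance(node, ast.FunctionDef)}


matches = recognise(SOURCE)
```

Here total_owed matches the plan with accumulator total, while first does not match. A realistic plan library would hold many such templates and would also need to tolerate variations in how the same cliché is written.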

Program Understanding Strategies
• top-down (with plans):
  – begin with knowledge about the goals the program should achieve,
  – determine which plans can achieve these goals, and
  – attempt to associate these plans with the actual program code.
  – This process requires matching rules or constraints to determine how the code achieves the various subgoals within a plan, and difference rules to recognize how the code differs from what the plan expects.

  – This requires detailed advance knowledge of the goals of the program, which in many cases may not be obtainable.
  – Partial understanding is difficult, since a program fragment is only ‘understood’ when it is connected to a top-level program goal.

• bottom-up: starts at the code level, determines which plans might have this code as a component, and attempts to infer higher-level goals from these plans.
  – Continues until the programmer’s actual goals are recognized or the understander runs out of candidate plans to match against the goals.
  – Tends to suffer from a potential combinatorial explosion of possible paths, since each code segment could be part of a large number of plans.
  – This is possibly the greatest limitation of the approach, restricting the length and complexity of the programs it can be applied to.
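
Example (illustrative sketch, not part of the original slides): the combinatorial explosion can be made concrete with a simple counting model. Assume, purely for illustration, that each code segment independently matches some number of candidate plans; the number of joint interpretations is then the product of those counts.

```python
import math


def candidate_interpretations(plans_per_segment):
    """Number of ways to pick one candidate plan for every code segment."""
    return math.prod(plans_per_segment)


# Hypothetical figures: 10 and then 20 segments, each matching 3 candidate plans.
ten_segments = candidate_interpretations([3] * 10)
twenty_segments = candidate_interpretations([3] * 20)
```

Under this assumed model, 10 segments give 59049 combinations and 20 segments give 3486784401, so doubling the program length squares the size of the search space.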

• Automated program understanders have in the past avoided the size of the search space by either
  – restricting the top-down searches using a limited number of plans, or
  – performing bottom-up searches using a library containing a limited number of mostly domain-independent plans.
• But understanding real-world software requires a bottom-up search and a reasonably large library.
  – These programs are naturally described in terms of domain-specific objects and operations;
  – thus we need to recognize both the plans which carry out these operations and the plans which represent the objects being manipulated.