Web quality
Luciano Baresi
Politecnico di Milano – Dipartimento di Elettronica e Informazione
Piazza L. da Vinci, 32 – 20133, Milano (Italy)
baresi@elet.polimi.it
Web Quality - Buenos Aires (Argentina) July 21-25

Who am I?
Research positions
  Associate professor at Politecnico di Milano (Italy)
  Visiting positions at
    University of Oregon (USA)
    University of Paderborn (Germany)
Professional activities
  Co-chair of GT-VMT 01, ICECCS 02, UMICS 03
  PC member of several conferences (ICWE 04)
  Co-founder and senior partner of INQua.S
www.elet.polimi.it/~baresi

Politecnico di Milano
The oldest technical university in Italy
One of the oldest in Europe
5 schools
1,000 faculty members
25,000 students

Quality

A definition
A methodological approach to the analysis and management of all company processes
This activity aims at
  Reducing waste
  Improving products and services
  Improving processes
We all know what quality is

Different perspectives
Quality is quality, but different roles have different views. For example:
  Stakeholders: they “see” the business
  Users: they want to “use” the system
  Developers: they must develop and maintain the system

Process quality vs. product quality
Process quality
  We aim at introducing special-purpose characteristics to control
    Costs
    Time-to-delivery
    Quality
    Competitiveness
Product quality
  We aim at products with particular quality indicators, regardless of the process behind them
  They must be able to match explicit and implicit requests from customers

Quality & development process
[Figure: RUP]

Quality dimensions
  Usability
  Dependability/Reliability
  Correctness
  Robustness
  Scalability
  Performability
  Security
Internal vs. external properties

Web applications (at least their architectures)

Web applications vs. Web sites

Why Web applications?
[Figure: N-tier architecture. Clients reach the system over the Internet through a firewall; internal users connect through the internal network; servers shown: SMTP server, Web server, DB server]
The Web is heterogeneous by definition
Too many technologies (Applet, Servlet, JSP, ASP, XML, …)
Much more importance to presentation and communication
Distribution
…

Heterogeneity
Components
  HTML
  Scripting languages
  Databases
  Multimedia contents
Expertise
  System designers
  DB administrators
  Designers
  Programmers
  Testers
Development and maintenance are complex (more complex?)

Client/Server
Entities that are logically distinct, but linked by a network, and that work together to accomplish a given task
[Figure: client and server connected by Requests and Services]

Logical components
Presentation: user interactions
Data management: access, persistency, queries
Application logic: business-oriented computations
[Figure: layers Presentation, Application logic, Data management]
In some cases some of these elements can be absent
  Embedded systems have no presentation layers

Client and server responsibilities
[Figure: five configurations, A to E, showing how Presentation, Logic, and Data management are split between client and server across the network]

Client and server responsibilities
Distributed presentation
  The server owns all the knowledge
  The client only interacts with the server
  Example: HTML form (no control on the quality of data)
Remote presentation
  The client is fully in charge of the presentation
Distributed logic
  The logic is partly on the server and partly on the client
Remote data access
  Presentation and logic are on the client, which interacts with a server to access data (SQL interface)
Distributed database
  The functionality to manage data is partially on the client and partially on the server (e.g., Distributed Relational Database Architecture by IBM)

Fat client vs. fat server
Clients are fat if they implement part of the application logic (C, D, E)
Why fat clients
  The system can handle user inputs more quickly
  The system interacts with the user with a finer granularity
  Different clients can implement interfaces that are specific to different users
  The system is more scalable

Fat client vs. fat server
Why not fat clients
  Interactions with the server can become too frequent during complex computations
  Servers could better control data access if they controlled the computation
  We lose data encapsulation
    The client must know in detail how data are organized on the server
  System management and maintenance become more complex
    Both clients and server must be updated
    It would be easier if we could update only the server

Three-tier architectures
[Figure: client (presentation), intermediate component (application logic), server (data management)]
In the ideal model the intermediate component contains all the application logic
In many real cases the logic is also spread over client and server
This way we can distinguish between presentation and application logic
  Different distributed data sources
  Different client types

Another example
[Figure: three tiers (Client, Shared Application Services, Data Sources) with the labels Order Entry, Credit Check, Customer Service, Scheduling, Customer Information, Distribution Service, Inventory Manager, Customers, Distribution, Inventory, Credit Service]

From two-tier to n-tier
[Figure: a 2-tier architecture (client, server) compared with an n-tier architecture (client, intermediate components, server); labels: Presentation, Application logic, Data management]

Computational models
[Figure: taxonomy of computational models with the entries: host/terminal model; models based on file transfer; models based on distributed computation; client/server; models based on events; peer-to-peer; object-oriented]

What can we do?

Basically, we can
Measure
  We “understand” the quality of our applications by measuring it
  Several different measures
Analyze
  We “understand” the quality by studying the artifacts we have
  Interesting, but complex
Test
  We can “understand” the quality by trying some special executions
No magic solutions!

Design for testability

Testability
Direct control over source code (usually not feasible)
Definition of analysis models
[Figure: a source program (proc foo) and its analysis model; the property of interest P on the program is related by implication to a property P' on the model, and P' is checked algorithmically]

Important factors
Systems can be tested more easily if they
  Work decently (operability)
  Can be controlled (controllability)
  Can be observed (observability)
  Do not add useless complexity (simplicity)
  Are documented consistently and completely (understandability)
  Are well-known (suitability)
  Are stable (stability)

Five dimensions
Requirements
  We must consider the way we represent them
Design
  We should anticipate as many constraints as we can
Implementation
  Test oracles should be “added” during the implementation
Test
  We should identify test cases as soon as we can
Documentation
  Better testability if we have good documentation

Some very preliminary comments
Design projects as simply as possible
  No added (useless) complexity
Use/add contracts (assertions) to our components
Maximize the visibility of all products
Consider comments and documentation
Write “standard” code
Prefer formal (or semi-formal) notations to informal models
Is this true with the Web?
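The advice on contracts (assertions) can be sketched in a few lines. The `ShoppingCart` class and its rules are hypothetical and only serve as an illustration; a real component would define its own pre- and postconditions.

```python
class ShoppingCart:
    """Hypothetical component, used only to illustrate contracts."""

    def __init__(self):
        self.items = {}  # product name -> quantity

    def total_items(self):
        return sum(self.items.values())

    def add(self, product, quantity):
        # Precondition: callers must pass a positive quantity.
        assert quantity > 0, "quantity must be positive"
        old_total = self.total_items()
        self.items[product] = self.items.get(product, 0) + quantity
        # Postcondition: the total grew by exactly the added quantity.
        assert self.total_items() == old_total + quantity
```

The assertions make the intended behavior explicit and observable, so a violated assumption fails at the point where it occurs instead of surfacing later as a wrong page.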

noweb

Test and analysis

Why test and analysis
Software is never correct
  Regardless of the domain
  Regardless of the techniques we use
Any software must be verified/validated
Test and analysis are
  Important to control and assess the quality of products
  But they impact the process
  Usually expensive
  Difficult, but interesting
Good compromises

IEEE terminology
Error
  What causes the deviation between the product and the ideal program
  Errors do not map consistently onto faults
  For example: typos, cut & paste, wrong requirements
Fault
  Program elements that do not correspond to the expectations
  Faults do not respect locality and do not map consistently onto failures
  For example: the program has a multiply operator instead of a sum operator
Failure
  Behavior not consistent with the system specification
  For example: 4 + 3 = 12
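The three terms can be seen together in a tiny sketch built on the slide's own example. The function names are hypothetical.

```python
def add_prices(a, b):
    # Fault: "*" appears where "+" was intended.
    # The error is the programmer's slip that put it there.
    return a * b


def add_prices_fixed(a, b):
    # The behavior required by the specification.
    return a + b


# Failure: the observed result deviates from the specification.
print(add_prices(4, 3))  # 12, while the specification requires 7

# The same fault does not always produce a failure.
print(add_prices(2, 2))  # 4, which happens to match 2 + 2
```

The second call shows why faults and failures are not consistent with each other: the faulty operator is always in the code, but only some inputs expose it.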

Properties
Process-oriented (internal) properties
  Reusability
  Maintainability
  Modularity
External properties that can be verified
  Interoperability
  Timeliness
External properties that cannot be verified
  User-friendliness
  Usability
Dependability properties
  Correctness
  Robustness
  Safety
  Reliability

Dependability
[Figure: overlapping sets Reliable, Correct, Robust, and Safe, with these annotated regions]
Robust, but not safe: we can have catastrophic errors
Safe, but not correct: we can have “light” failures
Reliable, but not correct: we can seldom have failures
Correct, but not safe or robust: the specification is not enough

Validation and verification
[Figure: requirements, formal description, and system, linked by validation and verification]
Validation: building the right system
  Includes usability test, user feedback
Verification: building the system right
  Includes test, inspection, static analysis

Validation vs. verification
If we say that the page must display quickly, we cannot verify the property, but we can validate it
If we say that the page must display in 30 seconds, we can verify it
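Once the requirement states an explicit bound, the check can be automated. The sketch below is a minimal illustration, not a real page-load measurement: `load_page` is a hypothetical callable standing in for fetching and rendering the page.

```python
import time

MAX_SECONDS = 30.0  # the explicit bound stated in the requirement


def displays_in_time(load_page, limit=MAX_SECONDS):
    """Return True if the given callable finishes within `limit` seconds."""
    start = time.perf_counter()
    load_page()
    elapsed = time.perf_counter() - start
    return elapsed <= limit


def fake_page():
    # Stand-in for a real page load (assumption for this example).
    time.sleep(0.01)
```

A requirement such as "displays quickly" gives no value for `limit`, which is why it can only be validated with users and not verified mechanically.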

The real problem
[Figure: a property and a program are given to a decision procedure that answers yes/no]
Correctness properties are undecidable

What do they offer?
Sample the input space
Optimistic approximation (testing)
Pessimistic approximation (analysis, proof)
Perfect verification
Simplified properties
We must settle for some kind of inaccuracy to be able to deal with the problem

Impact of the software type
The software type and its characteristics impact test and analysis activities in different ways
Different emphasis on the same property
  Timeliness
  “Correctness”
Different properties
  Usability
  User-friendliness
New techniques
  Presentation
  Navigation

Principles
The basic principles are
  Sensitivity: it is better to fail every time than sometimes
  Redundancy: we should make our intentions explicit
  Partitioning: divide et impera (divide and conquer)
  Restriction: we should try to reduce the scope
  Feedback: we should use the experience to improve the process

Sensitivity
Better to fail every time than sometimes
Consistency helps
  A test selection criterion is better if any selected test gives the same results, that is, if the program fails with a given test, it should fail with all tests selected with that criterion
  For example, deadlock analysis at run-time is better if it does not depend on the machine, that is, if the program fails on a machine within a given execution, then it should fail on all machines within that execution

Redundancy
Make decisions explicit
Redundant control can increase the capability of capturing errors in advance or more efficiently
  Static type checking is redundant with respect to testing, but it can solve many problems in advance
  The validation of requirements is redundant with respect to the validation of the final product, but it can solve several problems in advance and more efficiently
  Testing and model checking are redundant, but they are used together in some cases to increase the confidence in the right behavior of the product

Partition
Divide et impera (divide and conquer)
Difficult problems can be treated by partitioning the input space
  Criteria for both functional and structural test selection identify meaningful partitions of the input space
  Verification techniques partition the input space by grouping data that are homogeneous with respect to the properties that we want to prove
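Partitioning the input space can be illustrated with a small functional test. The form check `is_valid_quantity` and its 1 to 99 range are assumptions made up for this example; the point is that one representative input is chosen from each partition.

```python
def is_valid_quantity(text):
    """Hypothetical form check: accept whole numbers from 1 to 99."""
    return text.isdecimal() and 1 <= int(text) <= 99


# One representative input (and its expected result) per partition.
partitions = {
    "empty": ("", False),
    "not a number": ("abc", False),
    "below range": ("0", False),
    "in range": ("42", True),
    "above range": ("100", False),
}

for name, (value, expected) in partitions.items():
    assert is_valid_quantity(value) == expected, name
```

Inputs within a partition are expected to behave alike, so five tests stand in for an input space that cannot be enumerated.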

Restriction
Simplify the problem
Clever restrictions can make problems that are difficult (and undecidable) simple and tractable
In some cases a weaker property can be easier to verify
  For example, we cannot demonstrate that pointers are used in the right way, but if we use Java we can impose it easily
In other cases, a stronger property can be easier to verify
  For example, in general we cannot demonstrate that we do not have type errors with languages that have a dynamic type system, but we can demonstrate it if the language is statically typed

Feedback
Fine-tune the development process
Learn from the experience
  Checklists are built on errors discovered in the past
  The way errors are classified can help define meaningful criteria to select test cases
  The mechanisms to revise the process are based on the fact that we must improve the process to improve the product

Test and analysis in a development process
Activities related to quality control and types of development processes
Degrees of freedom and compromises
How to balance budget, risks, and quality
Error analysis and feedback
Impact of the development process on test activities
Responsibilities of a test group

Course outline
  Usability
  Accessibility
  Functional correctness
  Testing and analysis techniques
  Robustness and scalability
  Performance
  Security
  Test process
  Tools
  Conclusions

Usability

Usability
It is the measure of how a software system satisfies the needs of its users
  Ease of use
  Efficacy and efficiency
  Ease of storing produced artifacts
  Low number of errors and ease of recovery
  Satisfaction while using the product
(Jakob Nielsen)

Some design principles
  Design of pages based on the device
  Correct visualization with different browsers
  Correct visualization regardless of the screen characteristics
  Light-to-load pages
  Ease of access to supplied contents
  Structure of contents to facilitate their reading
  Coherence in the page style
  Visibility of links and their meaning
  Consistency of the presentation style with respect to users
  Ease of navigation

Page design and device

Correct visualizations and browsers
The semantics of tags can change in different browsers
The design can
  Use the minimum common set, trying not to use the latest version of the language
  Use the latest version
    To stimulate users to update
    Many users may not be able to access some contents
We should test our applications with all well-known browsers
If we knew the users, we could better understand what we need
  Specific classes of users
  Skilled users
We can start developing the application with old versions of the languages and add the extensions later
All main contents must be published in a format that can be accessed by all users

Screen-independent design
800 x 600 is the standard resolution
  Many displays are already set to this resolution
  If we used higher resolutions, the visualization would need horizontal scroll bars
  Vertical bars are acceptable, but we should avoid horizontal ones
We could use relative dimensions (percentages)
  The visualization adapts to the current resolution and the current screen size
  But we must be sure that all visualizations are meaningful

Design light pages
Some studies have revealed that
  0.1 second is the limit to give the user the perception that the system is reacting
    We do not need any message, but we need to start displaying the result
  1 second is the limit within which human thought is not interrupted
    The user perceives the delay, but it is still acceptable
  10 seconds is the limit to keep the user's attention on the dialogue
    If delays are higher, the user changes his focus
If we deliver pages in more than 10 seconds, we lose the client
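The three limits can be turned into a simple classifier for measured response times. This is only a sketch: the function name and the category labels are wording chosen for this example, while the thresholds are the ones on the slide.

```python
def perceived_delay(seconds):
    """Classify a response time using the 0.1 s, 1 s, and 10 s limits."""
    if seconds <= 0.1:
        return "system feels reactive"
    if seconds <= 1.0:
        return "delay perceived, thought not interrupted"
    if seconds <= 10.0:
        return "attention still on the dialogue"
    return "user changes focus"


for t in (0.05, 0.8, 4.0, 12.0):
    print(t, "->", perceived_delay(t))
```

Applied to load-time logs, a classifier like this shows how many page deliveries fall beyond the 10-second limit.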

Possible problems and solutions
The speed can be influenced by the number and size of images
  Light and few images
Multimedia effects used only if they are really helpful
  They improve the way information is perceived
No pages should be loaded in more than 20 seconds
  We should check page loading in the different situations
  Maybe even with different cache settings
First screen at a glance
  We should reduce the time needed to load the first page
  This page (the first part) should provide as much information as possible (more text than images)

Neat and clear information
We should use some 50% less text than “standard” newspapers
  When we read on a screen, we are 25% slower
  Users do not love scrolling windows
Contents should be well-structured: titles, paragraphs, and itemized lists are useful to locate contents
Rule of the pyramid (taken from journalism)
  Each page should start with a short conclusion (summary) and then present all details

Hypertextual structure
We should use hypertext to split long text into pages
  Hypertext should not be used to break linear information
  Each piece of information should focus on a specific and well-defined argument
We should be able to identify the entities that are more relevant and then identify their subcomponents
We should then identify the relationships that exist between entities and between subcomponents within the same entity

Homogeneous visualization style
We should choose coherence and uniformity
  We should plan themes, information structures, and navigation paths that contribute to the uniformity of the site
We should define a visual schema that should be adopted in all pages
  Same fonts, colors, and rendering of contents, to communicate the adopted schema and facilitate comprehension by users
We should avoid changing the style
  Linear transitions and uniformity give the user the idea of being in the same site (known borders)
  Better ease of use and learning since users can find well-known models

An example (I)

An example (II)

Linear transitions
Same
  Background
  Fonts
  Grid to partition the page
  Positioning of the different page elements
  Use of empty spaces
Example: National Gallery of Art, Washington (www.nga.gov)

Facilitate access to contents
Ease the interaction
  Identification of links and their meaning
Link types
  Structural links: they define the structure of the information space and allow the user to move within it
  Association links: textual links that let the user deepen the knowledge of the text used as anchor
  Lists of alternative pointers: they can help users find what they need
Starting rhetoric
  The user must be able to identify the added value that he will find in the target page
Ending rhetoric
  The target page must clarify the ending context and give value to the source page

Describing textual links
We should not use long anchors
  They must only be pointers
  Too many words do not allow us to identify the meaning of the link (max 4 words)
We should use simple words that clearly identify the target
  Examples
    To know about my job click here
    Here is my job
We can add text to supply further information
  It can be used to differentiate similar links

Suitable presentation styles
The visual style must give the right impression that we want to communicate
We should choose the right style
For example
  http://www.nasa.gov
  http://kids.msfc.nasa.gov

Ease of navigation
Where am I? Where was I? Where can I go?
We should always define the page title
Breadcrumb links identify the path in the site followed to reach a given page
A link to the home page (or to the starting pages of the different sections) from each page helps the user understand where he is
We should be careful with the links defined so far
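A breadcrumb trail can be derived from the page's position in the site. The sketch below assumes, purely for illustration, that the site hierarchy mirrors the URL path and that labels can be derived from path segments; real sites usually take labels from page titles.

```python
def breadcrumbs(path, home_label="Home"):
    """Return (label, url) pairs from the home page down to the given page."""
    trail = [(home_label, "/")]
    url = ""
    for part in path.strip("/").split("/"):
        if not part:
            continue  # the home page itself has no extra segments
        url += "/" + part
        trail.append((part.replace("-", " ").title(), url))
    return trail


for label, url in breadcrumbs("/courses/web-quality/usability"):
    print(label, url)
```

Every entry but the last would be rendered as a link, which answers both "Where am I?" and "Where can I go?" on each page.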

Site organization
Each application must have a home page (HP)
  It provides a summary and links to some important pages
A Web site is the set of pages that are linked to the home page through links
  They should supply a homogeneous set of information

Linear structure
It guides the user along a path
Useful when the presentation implies that the user follows a predefined path
  Examples: lessons, book chapters, collections, etc.
[Figure: HP followed by a sequence of Argument and SubArgument pages]
Users navigate forward and backward by following a predefined path
Each page has a link to the initial page

Network structure
Typical in many simple applications (personal site)
Links to and from any page in the application
[Figure: Home Page (HP) with Link 1, Link 2, Link 3, ..., Link N]
We need a navigation bar to guide the user

Hierarchical structure
[Figure: tree with the Home Page at the root, then area pages, sub-area pages, and contents pages]
Areas identify and organize contents
In each page we can have links to the home page and to the other pages of the area

Usability test

A simple example
Information density is inversely proportional to the search time and to the capability of finding what we were searching for
A useful page should not contain an excessive number of non-homogeneous data
We have many pages, even professional ones, that are not useful according to this criterion

Usability test
Tests with sample users to study their interaction with the system
  Human participation is essential
Tests of the HTML code to verify that it complies with the W3C guidelines
  They can easily be performed automatically
They are complementary techniques

Usability test
Tests with sample users apply techniques to collect qualitative and quantitative data while users interact with the product, to stress particular features
We have two main approaches with sample users
  The use of formally defined tests to validate or invalidate particular hypotheses (not widely used)
  The use of iterative tests to gradually expose usability problems
The goals are
  Identify and correct usability problems before releasing the application
  Facilitate the creation of products that
    Are easy to learn and use
    Satisfy user needs
    Supply useful functionality to the set of target users
  Create a repository of tests to be used for future releases

Non-technical goals
Minimize the costs to support product maintenance
Increase the number of sold products
Acquire a competitive advantage with respect to competitors
  Usability is a key feature for many products
Minimize risks before the release

Types of usability tests
[Figure: development phases and the usability tests associated with them]
Development phases: analysis of user requirements, requirements specification, preliminary design, detailed design, implementation
Test types: comparison tests, explorative tests, evaluation tests, validation tests

Explorative tests (I)
When
  At the beginning of the development process
  When we know the user profile and how the system will be used, but we are working on the functional specifications
  Before working on the detailed project
Goals
  Understand the validity of the preliminary design
  Verify the conceptual idea that the user has of the product (high abstraction level)
  Verify the hypotheses on the user

Explorative tests (II)
Methodology
  We have product prototypes evaluated by representative users
  At the beginning, we can also use static interfaces, maybe even on whiteboards (story boards)
  We must interact closely with participants to
    Verify the efficacy of the concepts on which the preliminary project is based
    Help fill the gap left by not-yet-implemented functionality

Evaluation tests (I)
It is the most widely used usability test
When
  At the beginning or during the development cycle
  After the definition of the preliminary project
Goals
  We want to extend the results obtained with explorative tests to evaluate the usability of low-level operations
  While explorative tests work on the skeleton of the product, these tests consider all characteristics
  We do not want to evaluate how intuitive the product is, but how a user can complete realistic tasks, and maybe identify possible problems

Evaluation tests (II)
Methodology
  The user always works on tasks, instead of surfing around and making comments
  We do not want to understand mental processes; we consider the actual behavior
  We collect quantitative measures

Validation tests (I)
Also called verification tests
When
  Late in the development cycle
  It is used to certify the usability of the product
  Before releasing the product
Goals
  We want to compare the product against predefined standards or standards used by competitors
  We want our product to comply with standards before releasing it
  If it is not compliant, we want to understand why
These standards are defined at the beginning of the development cycle along with the usability goals

Validation tests (II)
Usability goals
  They are defined by thinking of
    Usability tests on previous versions
    Market analysis
    Interviews with users
  They are characterized by means of
    Criteria to measure performance (speed, user accuracy while working on a task)
    Criteria for user preferences

Validation tests (III)
Methodology
  Similar to evaluation tests
  Before testing, we need to identify the reference standards
  We must also define the tolerance we want to use to accept results (e.g., % of failures)
  We propose specific tasks to participants
  They do not interact with the test monitor
  We collect quantitative data

Comparison tests (I)
When
  It is not associated with any specific step in the development process
  In the first phases it can be used to compare different alternatives through exploratory tests
  It can be used to assess the efficacy of a single component
  It can be used to compare the product with what competitors have developed
Goals
  It can be associated with any of the other testing methods
  It can be used to understand pros and cons of different projects
  It can be used to understand the effectiveness of different designs

Comparison tests (II)
Methodology
  We propose a side-by-side comparison between two or more alternatives
  For each alternative, we collect information and observations on performance and preferences
  The format we use depends on what we want to get
  In many cases we discover that some alternatives can be the winning ones
  The best results come when we compare projects that are radically different (instead of just similar)

The phases of usability test
  Definition of test plan
  Selection and recruiting of participants
  Preparation of test material
  Executing the test
  Debriefing
  Conclusions and recommendations

Definition of test plan
Objectives: we want to describe the reasons why we want to test the application
  For example: we have user feedback on some particular problems with the application
Test objectives: we must describe the test questions in a clear and neat way
  For example
    Is the on-line help easier through hot-keys or the mouse?
    Is the on-line help enough and self-contained?
    Is the navigation good enough to allow the user to always know where he is?
  “Is the product usable?” is wrong because it is too vague

Test plan: how to design tests
We must define the test procedure by identifying the steps that testers should follow and the “tools” they should use
We must decide the distribution of participants with respect to
  Experience, background, age, sex, ...
  Execution order of tasks
A reference guide is needed for all participants (test monitor and external observers)
  Different test monitors can manage the same test in equivalent ways

Test plan: final report
It must identify all elements that are significant to evaluate the application and that must be collected
For example, for each task or group of tasks
  Assigned time frame
  Percentage of participants that have completed their task successfully
  Percentage of participants that have made some mistakes
  Percentage of participants that have not been able to complete their task
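The percentages listed for the final report are easy to compute from raw observations. The sketch below assumes a hypothetical record format, one `(completed, mistakes)` pair per participant for a given task; the key names in the result are also made up for this example.

```python
def task_report(results):
    """Summarize one task from a list of (completed, mistakes) pairs."""
    total = len(results)
    if total == 0:
        raise ValueError("no participants recorded for this task")
    completed = sum(1 for done, _ in results if done)
    with_mistakes = sum(1 for _, mistakes in results if mistakes > 0)
    return {
        "completed_pct": 100.0 * completed / total,
        "made_mistakes_pct": 100.0 * with_mistakes / total,
        "not_completed_pct": 100.0 * (total - completed) / total,
    }


# Four participants: three finished, two of the four made mistakes.
observations = [(True, 0), (True, 2), (False, 3), (True, 0)]
print(task_report(observations))
```

Note that the "made mistakes" group overlaps with the other two: a participant can complete the task and still make mistakes along the way.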

Selection and recruiting of participants
Identification of classes of possible users
  For example: professors, expert students, novice students
Definition of the optimum sample (10-12 participants)
  Some participants from each class according to the frequencies in the real world
  For example
    3 professors, 1 professor of computer science
    5 students that are used to navigating the Internet
    2 novice students
Definition of the minimal sample (4-5 participants)
  Identify the minimum number for each class
    1 professor, better if not expert
    2 expert students
    1 novice student

Preparation of test material
Screening questionnaire to select participants
Questionnaire to study their background
Tools to collect data
  What data
    Performance: measures on the behavior of the application (e.g., number of errors, mean time to answer)
    Preferences: opinions on the product (e.g., preferences between two possible versions)
  How to classify them
Declarations to comply with privacy rules
Pre-test
Post-test

Test execution. Introduction: presentation of the system and of the test; filling in of a questionnaire to get to know the participants. Identification of goals: let the participants know what we want from the test; we want to evaluate the system, not the participant; we should be careful with cameras and microphones. Execution: each participant must have a list of tasks that must be completed in a given time frame; the test monitor oversees the process, but does not influence it.

Debriefing. We interview participants on what they did during the test. It is useful to understand the reasons for the discovered problems and how to solve them. We should neither judge participants nor try to score them. We must encourage participants to share their comments.

Conclusions and recommendations. We collect and summarize data (mean values and accuracy). We analyze data to: identify those tests that did not match quality criteria; identify the origin/nature of the problems; discover new problems. We classify priorities: Criticality = Severity + Probability. Severity (to be defined with the development team): unusable, hard, mild, not important.
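
The priority rule above can be sketched in a few lines of Python. This is only an illustration: the numeric scales (4 to 1 for severity, 1 to 4 for probability) and the sample problems are assumptions, since the slides only name the four severity levels.

```python
# Hypothetical numeric scales; the slides only name the severity levels.
SEVERITY = {"unusable": 4, "hard": 3, "mild": 2, "not important": 1}


def criticality(severity: str, probability: int) -> int:
    """Criticality = Severity + Probability (probability on an assumed 1-4 scale)."""
    if not 1 <= probability <= 4:
        raise ValueError("probability must be between 1 and 4")
    return SEVERITY[severity] + probability


def rank(problems):
    """Sort (name, severity, probability) tuples from most to least critical."""
    return sorted(problems, key=lambda p: criticality(p[1], p[2]), reverse=True)


# Invented sample problems, used only to exercise the ranking.
problems = [
    ("unclear link labels", "mild", 4),
    ("checkout form loses data", "unusable", 3),
    ("small font in footer", "not important", 1),
]
ranked = rank(problems)
```

Ranking by the sum lets a frequent mild problem outrank a rare severe one, which is why the severity scale should be agreed with the development team first.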

Analysis of results. After executing the tests, the test monitor analyzes the results based on questionnaires and discussions with participants. He produces a report to identify problems and limits of the system (as to usability). He can define new requirements. Whoever is responsible for the system can decide that the system must be reworked. The test monitor decides whether tests must be re-executed, and which ones.

Accessibility

W3C WAI. The WAI (Web Accessibility Initiative) defines 14 steps that must be followed to make a Web application accessible. Each step is associated with a priority level based on the impact it has on accessibility. According to the priority, we have 3 certification levels (A, AA, AAA). A: the application complies with all MUST steps. AA: the application complies with all MUST and SHOULD steps. AAA: the application complies with all MUST, SHOULD, and MAY steps.

Priorities. [Priority 1] A Web content developer must satisfy this checkpoint. Otherwise, one or more groups will find it impossible to access information in the document. Satisfying this checkpoint is a basic requirement for some groups to be able to use Web documents. [Priority 2] A Web content developer should satisfy this checkpoint. Otherwise, one or more groups will find it difficult to access information in the document. Satisfying this checkpoint will remove significant barriers to accessing Web documents. [Priority 3] A Web content developer may address this checkpoint. Otherwise, one or more groups will find it somewhat difficult to access information in the document. Satisfying this checkpoint will improve access to Web documents.

Example. In general (Priority 1):
1.1 Provide a text equivalent for every non-text element (e.g., via "alt", "longdesc", or in element content). This includes: images, graphical representations of text (including symbols), image map regions, animations (e.g., animated GIFs), applets and programmatic objects, ascii art, frames, scripts, images used as list bullets, spacers, graphical buttons, sounds (played with or without user interaction), standalone audio files, audio tracks of video, and video.
2.1 Ensure that all information conveyed with color is also available without color, for example from context or markup.
4.1 Clearly identify changes in the natural language of a document's text and any text equivalents (e.g., captions).
6.1 Organize documents so they may be read without style sheets. For example, when an HTML document is rendered without associated style sheets, it must still be possible to read the document.
6.2 Ensure that equivalents for dynamic content are updated when the dynamic content changes.
7.1 Until user agents allow users to control flickering, avoid causing the screen to flicker.
14.1 Use the clearest and simplest language appropriate for a site's content.
And if you use images and image maps (Priority 1):
1.2 Provide redundant text links for each active region of a server-side image map.
9.1 Provide client-side image maps instead of server-side image maps except where the regions cannot be defined with an available geometric shape.

Functional correctness: traditional testing

Granularity levels. Acceptance testing: the software behavior is compared with end user requirements. System testing: the software behavior is compared with the requirements specifications. Integration testing: checking the behavior of module cooperation. Unit testing: checking the behavior of single modules. Regression testing: checking the behavior of new releases.

The test case generation problem. How to generate test data. Partition testing: divide the program into (quasi-)equivalence classes. The classes can be chosen in different ways: random; functional (black box), based on specifications; structural (white box), based on code; fault-based, based on classes of faults.

White vs black box. Black box: it depends on the specification notation; it scales up (different techniques at different granularity levels); it cannot reveal faults that depend on the code structure (the same specification can be implemented with different modules). White box: it is based on control or data flow coverage; it does not scale up (mostly applicable at unit and integration testing level); it cannot reveal missing path errors (parts of the specification that are not implemented).

Specification-based Testing. From formal specifications: can be automated. Examples: test case generation from algebraic specifications, finite state automata (UML class diagrams), grammars. From semi-formal specifications: partitions can be easily identified; can be partially automated.

Test-case Generation from Informal Specifications (Natural Language). It cannot be automated. Some structure (e.g., organization standards) can help. Guidelines to increase the confidence level and reduce discretionality: at least one test case for each of: subsets of “valid” homogeneous data; “non valid” (combinations of) data; boundary data; specific data (treated independently, error prone, ...).

Fault-based Testing. Identify a set of program locations (related to specific faults). Generate alternate programs by seeding faults in the original program in the identified locations. Generate test cases to estimate adequacy in detecting real faults from adequacy in detecting seeded faults.
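
A minimal sketch of the idea in Python, assuming a tiny program and three hand-seeded faulty variants (all hypothetical, not taken from the slides). A test set is scored by the fraction of seeded faults it detects, using the original program as the reference.

```python
def absdiff(a, b):
    """Original program: absolute difference of two integers."""
    return a - b if a >= b else b - a


# Hand-seeded faulty variants of absdiff (hypothetical examples).
def mutant_wrong_branch(a, b):
    return a - b if a > b else a - b       # else branch not swapped


def mutant_off_by_one(a, b):
    return a - b if a >= b else b - a + 1  # extra +1 in the else branch


def mutant_no_branch(a, b):
    return a - b                           # condition removed


MUTANTS = [mutant_wrong_branch, mutant_off_by_one, mutant_no_branch]


def mutation_score(tests):
    """Fraction of seeded faults revealed by at least one test input."""
    killed = sum(
        any(mutant(a, b) != absdiff(a, b) for a, b in tests)
        for mutant in MUTANTS
    )
    return killed / len(MUTANTS)


weak_tests = [(5, 3), (2, 2)]          # never exercise the a < b case
strong_tests = weak_tests + [(3, 5)]   # adds an input with a < b
```

Here `weak_tests` detects none of the seeded faults, while adding a single input with a < b detects all three; the score is only an estimate of how well the tests would detect real faults.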

Partition Testing. Basic idea: divide the program input space into (quasi-)equivalence classes. It is the underlying idea of specification-based, structural, and fault-based testing.

The Category-Partition Method.
STEP 1: Analyze the specification. Identify individual functional units that can be tested separately. For each unit identify: parameters and their characteristics; environment and its characteristics. Classify them into categories.
STEP 2: Partition the categories into choices.
STEP 3: Determine constraints among the choices.
STEP 4: Write tests and documentation.

The Category-Partition Method: an example*
Command: find
Syntax: find <pattern> <file>
Function: The find command is used to locate one or more instances of a given pattern in a file. All lines in the file that contain the pattern are written to standard output. A line containing the pattern is written only once, regardless of the number of times the pattern occurs in it. The pattern is any sequence of characters whose length does not exceed the maximum length of a line in the file. To include a blank in the pattern, the entire pattern must be enclosed in quotes ("). To include a quotation mark in the pattern, two quotes in a row ("") must be used.
* From Ostrand, Balcer, The Category-Partition Method for Specifying and Generating Functional Tests

Step A - analyze the specification: identify categories. find is an individual function that can be tested separately. Parameters: pattern, file. Characteristics (pattern), explicit (immediately derivable from specs): pattern length; pattern enclosed in quotes; pattern contains blanks; pattern contains enclosed quotes. Characteristics (pattern), implicit (“hidden” in specs): quoted patterns with/without blanks; several successive quotes included in the pattern; ...

Step B - partition categories.
Parameters:
Pattern size: empty; single character; many characters; longer than any line in the file.
Quoting: pattern is quoted; pattern is not quoted; pattern is improperly quoted.
Embedded blanks: none; several.
Embedded quotes: none; several.
File name: ...
Environment:
Number of occurrences of pattern in a file: none; several.
Pattern occurrences on target line: ...

Step C: Determine Constraints.
Parameters:
Pattern size: empty; single character; many characters; longer than any line in the file.
Quoting: pattern is quoted; pattern is not quoted; pattern is improperly quoted; ...
Constraint annotations attached to these choices: [property Empty] [property NonEmpty] [single] [property Quoted] [if NonEmpty] [error]
Environment:
Number of occurrences of pattern in a file: none [if NonEmpty] [single]; [if NonEmpty] [property Match]; ...
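
The effect of the constraints can be illustrated with a small Python sketch that generates test frames from categories and choices. The assignment of annotations to individual choices, the choices "one" and "many", and the rule that a category with no selectable choice is marked "n/a" are all assumptions made for this sketch; the slide does not fix them.

```python
# Categories and choices from the find example; the constraint annotations
# are a simplified, assumed reading of the slide.
CATEGORIES = {
    "pattern size": [
        {"name": "empty", "props": {"Empty"}},
        {"name": "single character", "props": {"NonEmpty"}},
        {"name": "many characters", "props": {"NonEmpty"}},
        {"name": "longer than any line", "error": True},
    ],
    "quoting": [
        {"name": "quoted", "props": {"Quoted"}},
        {"name": "not quoted", "requires": {"NonEmpty"}},
        {"name": "improperly quoted", "error": True},
    ],
    "occurrences in file": [
        {"name": "none", "requires": {"NonEmpty"}, "single": True},
        {"name": "one", "requires": {"NonEmpty"}, "props": {"Match"}},
        {"name": "many", "requires": {"NonEmpty"}, "props": {"Match"}},
    ],
}


def frames(categories):
    """Return (normal_frames, special_frames) for the given categories."""
    # [error] and [single] choices get one frame each and are never combined.
    special = [
        {cat: choice["name"]}
        for cat, choices in categories.items()
        for choice in choices
        if choice.get("error") or choice.get("single")
    ]
    normal = []

    def extend(remaining, frame, props):
        if not remaining:
            normal.append(frame)
            return
        cat, rest = remaining[0], remaining[1:]
        usable = [
            c for c in categories[cat]
            if not (c.get("error") or c.get("single"))
            and c.get("requires", set()) <= props
        ]
        if not usable:  # assumed rule: no selectable choice, mark as n/a
            extend(rest, {**frame, cat: "n/a"}, props)
        for c in usable:
            extend(rest, {**frame, cat: c["name"]}, props | c.get("props", set()))

    extend(list(categories), {}, set())
    return normal, special


normal_frames, special_frames = frames(CATEGORIES)
```

Without constraints these three categories would yield 4 x 3 x 3 = 36 combinations; with the assumed annotations the sketch produces 9 combined frames plus 3 single-choice frames.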

Some Considerations on the Category Partition Method. It is a practical implementation of general principles: partition testing; boundary testing; erroneous conditions. Other approaches have similar goals, but different procedures: condition tables; cause-effect graphs; equivalence partitioning.

Structural Coverage Testing. (In)adequacy criteria: if significant parts of the program structure are not tested, testing is surely inadequate. Control flow coverage criteria: statement (node, basic block) coverage; branch (edge) coverage; condition coverage; path coverage. Data flow (syntactic dependency) coverage. An attempted compromise between the impossible and the inadequate.

Statement Coverage

    int select(int A[], int N, int X)
    {
        int i = 0;
        while (i < N && A[i] < X) {
            if (A[i] < 0)
                A[i] = -A[i];
            i++;
        }
        return 1;
    }

[Figure: control flow graph of select, with nodes i=0, i<N and A[i]<X, A[i]<0, A[i] = -A[i], i++ and True/False edges]
One test datum (N=1, A[0]=-7, X=9) is enough to guarantee statement coverage of function select. Faults in handling positive values of A[i] would not be revealed.

Branch Coverage (same select function and control flow graph as on the Statement Coverage slide). We must add a test datum (N=1, A[0]=7, X=9) to cover branch False of the if statement. Faults in handling positive values of A[i] would be revealed. Faults in exiting the loop with condition A[i]<X would not be revealed.
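
The difference between the two criteria can be checked with a small instrumented Python port of select. The `trace` parameter and the `branch_coverage` helper are additions for this sketch and are not part of the original function.

```python
def select(a, n, x, trace):
    """Python port of select(); `trace` collects (decision, outcome) pairs."""
    i = 0
    while True:
        cond = i < n and a[i] < x
        trace.add(("while", cond))
        if not cond:
            break
        negative = a[i] < 0
        trace.add(("if", negative))
        if negative:
            a[i] = -a[i]
        i += 1
    return 1


ALL_BRANCHES = {("while", True), ("while", False), ("if", True), ("if", False)}


def branch_coverage(tests):
    """Fraction of branch outcomes exercised by a list of (A, N, X) test data."""
    trace = set()
    for a, n, x in tests:
        select(list(a), n, x, trace)  # copy: select modifies the array
    return len(trace & ALL_BRANCHES) / len(ALL_BRANCHES)


statement_suite = [([-7], 1, 9)]                 # executes every statement
branch_suite = statement_suite + [([7], 1, 9)]   # adds the False branch of the if
```

The single test datum that achieves statement coverage exercises three of the four branch outcomes; the added datum covers the remaining one.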

Condition Coverage (same select function and control flow graph as on the Statement Coverage slide). Both conditions (i<N), (A[i]<X) must be false and true for different tests. In this case, we must add tests that cause the while loop to exit for a value greater than X. Faults that arise after several iterations of the loop would not be revealed.

Path Coverage (same select function and control flow graph as on the Statement Coverage slide). The loop must be iterated a given number of times. PROBLEM: uncontrolled growth of test sets. We need to select a significant subset of test cases.

Data Flow Coverage

    int select(int A[], int N, int X)    DEF={A, N, X}
    {
        int i = 0;                       DEF={i}
        while (i < N and A[i] < X)       USE={i, N, A, X}
        {
            if (A[i] < 0)                USE={A, i}
                A[i] = -A[i];            USE={A, i}  DEF={A}
            i++;                         USE={i}  DEF={i}
        }
        return(1);
    }

Exercise Def-Use paths: select paths based on their effects on the variables, rather than on the number of iterations of loops.

The Infeasibility Problem. Syntactically indicated behaviors (paths, data flows, etc.) are often impossible: infeasible control flow, data flow, and data states. Adequacy criteria are typically impossible to satisfy. Unsatisfactory approaches: manual justification for omitting each impossible test case (esp. for more demanding criteria); adequacy “scores” based on coverage, for example 95% statement coverage, 80% def-use coverage.

Regression Testing. Testing a new version (release): how can we minimize effort using the results of testing previous versions? On a previous release: save scaffolding (drivers, stubs, oracles); record test cases (<inputs, outputs>). On the new release: keep track of changes; evaluate the impact of changes.

Create Scaffolding. DRIVER: initialization of non-local variables; initialization of parameters; activation of the unit. PROGRAM UNIT: the unit under test. STUB: “templates” of modules used by the unit (functions called by the unit); “templates” of any other entity used by the unit. ORACLE: checks the correspondence between the produced and the expected result.
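
The three scaffolding roles can be sketched in Python as follows. The unit under test (`price_with_tax`), the tax service it depends on, and the test data are hypothetical and serve only to show where driver, stub, and oracle fit.

```python
def price_with_tax(amount, tax_service):
    """Unit under test (hypothetical): depends on an external tax service."""
    return round(amount * (1 + tax_service.rate_for("IT")), 2)


class TaxServiceStub:
    """Stub: a template of the module used by the unit, returning a canned value."""

    def rate_for(self, country):
        return 0.22


def oracle(produced, expected, tolerance=0.005):
    """Oracle: checks the produced result against the expected one."""
    return abs(produced - expected) <= tolerance


def driver(test_cases):
    """Driver: initializes parameters, activates the unit, and asks the oracle."""
    stub = TaxServiceStub()
    return [
        oracle(price_with_tax(amount, stub), expected)
        for amount, expected in test_cases
    ]


# Recorded <input, expected output> pairs, reusable on the next release.
results = driver([(100.0, 122.0), (0.0, 0.0), (19.99, 24.39)])
```

Keeping the recorded pairs and the scaffolding together is what makes them reusable for regression testing of later releases.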

Problems and Tradeoffs. [Figure: effort in test execution and regression testing plotted against effort in developing drivers/stubs] Poorly designed drivers/stubs: low effort in development, high effort in test execution and regression testing. Well designed drivers/stubs: high effort in development, low effort in test execution and regression testing.

What is an oracle? An “inspector” of executions: do test executions produce acceptable results? An oracle can be: a human being; a machine; a former version of the same program; another program; ...

What is a good oracle? [Figure: a screen full of raw output data] Testing large, complex applications may require millions of test runs. The size of the outputs to be inspected exceeds human capabilities: human eyes are slow and unreliable examiners, even of a small number of outputs. AUTOMATED ORACLES ARE ESSENTIAL!

How can we build acceptable oracles? There is NO universal recipe. Different solutions for different: application domains (GUI, protocols, ...); development environments (no specifications, informal specifications, ...); development phases (system testing, regression testing, ...).

Oracles from Design. Example: UML design notations. Message sequence charts: a UML message sequence chart indicates a test case and its expected outcome, which can be interpreted by a driver and an oracle. This is typical of “scenario-based” oracles: scenarios combine a test case with a special oracle. Statechart (finite state acceptor): a UML finite state machine describes all permissible behaviors of a module; the oracle can be used with large numbers of automatically generated test cases.
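
A finite state acceptor used as an oracle can be sketched in a few lines of Python. The login-session machine below is a hypothetical example; in practice the transition table would be derived from the design model.

```python
# Hypothetical finite state machine for a login session.
TRANSITIONS = {
    ("logged_out", "login"): "logged_in",
    ("logged_in", "browse"): "logged_in",
    ("logged_in", "logout"): "logged_out",
}


def accepts(events, start="logged_out"):
    """Oracle: True if the event sequence is a behavior permitted by the machine."""
    state = start
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            return False  # the observed behavior is not allowed by the design
        state = TRANSITIONS[key]
    return True
```

Because the oracle only replays observed events against the table, it can judge any number of automatically generated executions without a hand-written expected output for each one.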

Oracles from Code Documentation. Parnas’ tabular annotations precisely describe the functional behavior of the unit. The table can be evaluated with respect to the produced outputs to check their correctness.
DISPLAY 1*
Display 1 Specification: Find(x, A, j, present)

    R0(,) = ((1≤n) and [for all i (1≤i≤n): ‘A[i] ≤ ‘A[i+1]])

                      ∃i [(1≤i≤n) and (‘A[i]=‘x)] = true  |  = false
        j’ |          ‘A[j]=‘x                            |  true
        present’ =    true                                |  false
        and NC(x, A)

Display 1 Program: procedure find (...) ... end {find}
Display 1 Specifications of the Invoked Programs: ...
* From: Parnas, Madey, Iglewski, Precise Documentation of Well-Structured Programs, IEEE-TSE Vol. 20 N. 12, Dec 1994

Harness vs. Embedded Assertions. [Figure: a test harness (driver, oracle, stubs) around the unit or subsystem, compared with an oracle placed inside the unit or subsystem] Embedded assertions act as oracles within the unit under test.

Assertions as Oracles

    /*
     * Alphabetic sort of an array of strings
     */
    void sort(char *words[], int nwords)
    {
        ...
        assert(is_sorted(words, nwords));
        return;
    }
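
The slide leaves is_sorted undefined. A possible Python rendering of the same idea, with the check spelled out, is sketched below; the function names are chosen for this sketch, and the second assertion (same elements in, same elements out) is an addition that the slide does not show.

```python
from collections import Counter


def is_sorted(words):
    """True if every element is less than or equal to its successor."""
    return all(a <= b for a, b in zip(words, words[1:]))


def sort_words(words):
    """Alphabetic sort; the embedded assertions act as an oracle on every call."""
    result = sorted(words)
    assert is_sorted(result), "postcondition violated: output is not sorted"
    assert Counter(result) == Counter(words), "postcondition violated: elements changed"
    return result
```

Checking order alone would accept a function that returns an empty list, which is why the sketch also compares the elements.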

Another example: HttpUnit. The main class is WebConversation. It is a browser that interacts with a server. With this class we can build several interactions:

    WebConversation wc = new WebConversation();
    WebResponse resp = wc.getResponse("http://httpunit.sourceforge.net/doc/Cookbook.html");
    WebLink link = resp.getLinkWith("response");

An example: Check

An example: Check! Check password

Analysis

Software Inspection: Low tech but effective. Fagan Code Inspections: one of many “walk-through” and inspection techniques, and among the most successful. More formal and well-defined than “structured walkthroughs” etc. It has been extended to designs, requirements, etc. with similar organizing principles. A completely manual technique for finding and correcting errors.

Software Inspection Roles. Moderator: typically borrowed from another project; chairs the meeting, chooses participants, controls the process. Readers, Testers: read the code to the group, look for flaws. Author: passive participant; answers questions when asked.

Software Inspection Process. Planning: the moderator checks entry criteria, chooses participants, schedules the meeting. Overview: provide background education, assign roles. Preparation. Inspection (see ahead). Rework. Follow-up (and possible re-inspection).

In the Meeting. Goal: find as many faults as possible; max 2 hour sessions per day; approx. 150 source lines/hour. Approach: line-by-line paraphrasing; reconstruct the intent of the code from the source; may also “hand test”. Find and log defects, but don’t fix them. The moderator is responsible for staying on track.

Checklists: NASA example. About 2.5 pages for C code, 4 for FORTRAN. Divided into: Functionality, Data Usage, Control, Linkage, Computation, Maintenance, Clarity. Examples: Does each module have a single function? Does the code match the Detailed Design? Are all constant names upper case? Are pointers not typecast (except assignment of NULL)? Are nested “INCLUDE” files avoided? Are non-standard usages isolated in subroutines and well documented? Are there sufficient comments to understand the code?

Inspection Automation. Although inspection is a manual technique, many kinds of automated support are possible: automate trivial checks (e.g., formatting); reference (checklists, standards with examples); focus (highlight, selection) on relevant parts; annotation and communication; process guidance and (partial) enforcement, e.g., InspeQ will not allow check-off until all relevant parts of a document have been observed.

Why does inspection work? The evidence says it is cost-effective. Why? Detailed, formal process, with record keeping. Check-lists; self-improving process. Social aspects of the process, esp. for the author. Consideration of the whole input space. Applies to incomplete programs. Limitations: scale (inherently a unit-level technique); non-incremental (what about evolution?).

Data flow analysis

    function absdiff (a, b: integer) return integer is
        tmp: integer;
    begin
        if (a < b) then
            tmp := a;
            a := b;
            b := tmp;
        end if;
        return ( a - b );
    end absdiff;

[Figure: flow graph of absdiff, with nodes if (a < b), tmp := a; a := b; b := tmp, and absdiff := a - b]

Classic data flow analyses to find program errors. Uninitialized variable: a “may” result from the classic “avail” analysis, but a conservative analysis can be annoying; a “must” version is also possible (how?). Dead assignment (no possible use): classic “live variables” analysis; in FORTRAN, Awk, BASIC, PERL, etc., it usually indicates a misspelled variable; it is less useful in languages requiring declarations.
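
A dead-assignment check based on live variables can be sketched for straight-line code in Python. The program representation (one `(defined variable, used variables)` pair per statement) and the misspelled `totl` are assumptions for this sketch; a real analysis would also handle branches and loops.

```python
def dead_assignments(statements, live_out=frozenset()):
    """Backward live-variable pass over straight-line code.

    Each statement is a pair (defined_variable, used_variables).
    Returns the indices of assignments whose value can never be used.
    """
    live = set(live_out)
    dead = []
    for index in range(len(statements) - 1, -1, -1):
        target, uses = statements[index]
        if target not in live:
            dead.append(index)   # value assigned here is never read later
        live.discard(target)     # the assignment kills the old value
        live.update(uses)        # conservatively keep its operands live
    return sorted(dead)


# tmp := a; a := b; b := tmp; result := a - b; totl := result
program = [
    ("tmp", {"a"}),
    ("a", {"b"}),
    ("b", {"tmp"}),
    ("result", {"a", "b"}),
    ("totl", {"result"}),  # hypothetical misspelling: never read afterwards
]
dead = dead_assignments(program, live_out={"result"})
```

Only the last assignment is reported, which matches the observation that in languages without declarations a dead assignment often points to a misspelled variable.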

Precision & Safety. An analysis is conservative (safe) if it doesn’t miss errors. An analysis is precise to the extent that it doesn’t report spurious errors. Static flow analysis considers all (syntactic) program paths; it can be conservative or precise, but not both. An overly conservative, imprecise analysis may be useless. A well-defined but overly strict property may be preferable to spurious error reports.

Analysis of Models: State-Space Exploration. Concurrency (multi-threading, distributed programming, ...) makes testing harder: it introduces non-determinism; time- and load-dependent bugs escape extensive testing. Finite-state models can be exhaustively verified. [Figure: three steps, Extract (finite-state models from each process), Combine, Check]

Automated Finite-State Verification. Example tool: SPIN (one of many); see G. Holzmann, “The model checker SPIN.” IEEE TSE 23(5), May 1997. It verifies a simple program-like design model: a high-level design of process interaction, ignoring other aspects of computation (e.g., functional behavior). It is used for protocols, OS scheduling, ... It is useful despite limited capacity; best for verifying high-level design before coding. Domain-specific analysis: limited “proof” of simple but critical properties in a limited domain.

State explosion problem. The size of the composite state graph is the product of the individual state graphs. This is OK for a simple two-party protocol, but impossibly expensive for systems with many processes. Brute force state enumeration is limited to a few processes. State explosion is one face of a (provably) hard problem. The same fundamental limits appear in different form for non-enumerative analyses.
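
The growth can be made concrete with a short Python sketch. The three local states per process are an assumption, and the product is an upper bound, since some combinations may be unreachable in a real system.

```python
from itertools import product


def composite_states(process_states):
    """Enumerate the composite state space as the product of the local state sets."""
    return list(product(*process_states))


def composite_size(n_processes, local_states=3):
    """Upper bound on the number of composite states for n identical processes."""
    return local_states ** n_processes


# Hypothetical local states of one protocol party.
party = ["idle", "sending", "waiting"]
two_party = composite_states([party, party])
```

Two parties give 9 composite states, which is easy to enumerate; twenty such processes would give 3^20, over three billion.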

What is static analysis good for? It is not a replacement for testing: focused, (mostly) automated analysis for limited classes of faults. It is more thorough than testing (within its scope): conservative analyses are tantamount to formal verification. It also augments testing, e.g., dependence analysis for data flow testing.

Load and performance test

Robustness and scalability. Robustness: the capability of behaving “decently” even in cases not explicitly considered in the requirements definition document. If we consider Web applications, the number of users is the key factor. Scalability: the capability of serving a given load with a predefined QoS, and of adapting to the evolution of the load.

An example load problem. Yahoo was attacked by sending thousands of emails and the servers collapsed.

Approaches. We can analyze the architecture to discover problems in its components or in the way they interact. We can verify their design and interactions. We can produce models. We can test the application (load test).

Load test. We simulate a given load on the server by means of virtual users. These users behave like real users and test the capability of the system to support the load. We must have the application before being able to work on load testing. With this test, we can: study the HW/SW configuration needed to support a given load; discover the maximum load we can offer with the current configuration.

How. Manual tests: we use employees during the week-ends; it is really expensive and we cannot simulate exceptional situations; we would need millions of users for Web applications. Automatic tests (Mercury, Rational, Empirix): we do not need too many employees; we can perform what-if simulations; we can better collect statistics. Available tools record the activities done by real users, create the scripts that virtual users should execute, and measure the mean response time. Scripts can be parametric and configurable, and can use a DB to store and retrieve data and generate coherent use cases.

Astra load test (Mercury). It can emulate thousands of users and provides a graphical monitor to identify and isolate problems. It allows users to record browsing/interaction sessions. It allows users to fill forms in a parametric way.

Astra load test. The test monitor allows us to: define the number of virtual users; identify the tests that we want to execute and that are already stored in the right format; choose the machines on which we want to execute the test. The monitor also displays the results, even during the execution of the tests.

Performance

Performance. We should always consider our clients before designing the application: its weight should basically depend on them. We should carefully plan performance testing at the beginning of our project. Our requirements should clearly state what we want. We must clearly state pass and fail conditions. A transaction that takes too long can be seen as an error. What is the meaning of “too long”?

What should we test? Where? We must identify those cases that we think can be critical. Trivial problems become important on the server because of the degree of parallelism. Different applications have different criticalities. In client-server like architectures we should always identify the bottleneck: Is the server slow? Is the available bandwidth limited on the client side? Are CPU and memory not sufficient? Usually, we mainly consider problems on the server side. Before starting we must clearly know the HW/SW architecture.

Three levels. Ad hoc performance testing: testers validate whether the application answers with a reasonable delay; they may record failures, but they do not locate them. Observational testing: we take the first measurements, but with watches and similar tools. Measured testing: we use objective measures and specific tools.

What do we measure? Measures on the server: megacycles (MCs); memory footprint. Measures that consider the network: time to last byte (TTLB); user-perceived response time. Other measures: bytes over the wire (BoW).

Megacycles (MCs). CPU cost = (CPU usage * number of CPUs * CPU speed in MHz) / requests per second. This measure defines the computation cost on the CPU. If I know this cost and the number of users, I can foresee the number of CPUs I need. When I add a new CPU, it gives an improvement of at most 0.8 because of the hardware it shares (bus, RAM, ...).

Example. I have two bi-processor Web servers that work at 400 MHz. Their usage is equal to 60%. They must serve 30 requests per second. The values of the measures are:
Available MCs = ((1 + 0.8) + (1 + 0.8)) * 400 = 1440
Used MCs = 1440 * 0.60 = 864
MCs used for each request = 864 / 30 = 28.8
Requests that we can serve per second = 1440 / 28.8 = 50
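
The same arithmetic can be written as a small Python sketch, which makes it easy to repeat the estimate for other configurations. The function and parameter names are chosen for this sketch; the 0.8 factor for each additional CPU is the one stated in the slides.

```python
def available_megacycles(servers, cpus_per_server, mhz, extra_cpu_gain=0.8):
    """Capacity in MCs: the first CPU of a server counts 1, each further CPU 0.8."""
    per_server = 1 + extra_cpu_gain * (cpus_per_server - 1)
    return servers * per_server * mhz


def cost_per_request(available, usage, requests_per_second):
    """MCs consumed by one request at the observed CPU usage."""
    return available * usage / requests_per_second


available = available_megacycles(servers=2, cpus_per_server=2, mhz=400)
per_request = cost_per_request(available, usage=0.60, requests_per_second=30)
max_requests_per_second = available / per_request
```

With the values of the example this gives about 1440 available MCs, about 28.8 MCs per request, and a ceiling of about 50 requests per second (up to floating-point rounding).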

Other measures. Memory footprint: it is used to study the maximum usage of memory. It is not interesting if we have only static pages and images. It is interesting when we have dynamic pages, accesses to DBs, and code that uses memory (Java and C#). Time to last byte (TTLB): it measures the time from the instant at which the request leaves the client until the last byte of the response arrives. It does not consider the time used by the client to process what it receives (to execute a script, render a page). If this “execution” is complex and time-consuming, the user can experience significant delays even if the TTLB is low.

Other measures. User-perceived response time: it is the time needed to fully load a page on the client and have it ready to interact with the user. It is usually greater than TTLB, since we must consider the overhead necessary after the last byte. Bytes over the wire (BoW): it counts the number of bytes moved between client and server. It distinguishes between: first request (the cache is empty); further requests (the cache contains some data and we must transmit only what changes, i.e., dynamic data).

Model-based validation. It can provide quick feedback during the design. It allows for the tuning of resources. [Figure: model of a request (Req) travelling from the client through the Internet and the internal LAN to the internal Web server, and of the page travelling back]

Performance testing. We need testing tools, for example E-Test (Empirix). We create a script that corresponds to a given browsing/interaction session, in order to measure the performance of the server with respect to the number of users and to verify the performance of each single component and identify bottlenecks. Typical measures: Kb/sec; pages/sec; transactions/sec; CPU usage by the DB; memory usage by the DB.

Testing services through the Web. We can use some applications that allow us to test our applications without installing anything on our machines, for example www.netmechanic.com. They work on the load time of each page and on the “weight” of each page or component.
Example report. Load time by modem speed: 38.71 seconds, height/width problems.

    Size   Object  URL
    78698  HTML    http://www.tiscali.it/
    6743   IMG     http://www.tiscali.it/magazine/Italy/spettacoli/mytv/foto/gino_maggio_2.jpg
    5591   IMG     http://www.tiscali.it/magazine/Italy/eventi/Media/Foto/2002/aprile/19/junior_spazio.jpg
    5246   IMG     http://www.tiscali.it/magazine/Italy/eventi/Media/Foto/2002/maggio/2/foto_epoca.jpg
    4179   SCRIPT  http://www.tiscali.it/inc/hp/cookie.js

Image optimization. Original: JPEG, size 6743 bytes, H: 130, W: 150, DL time (28.8): 0:02. Optimized: type JPEG, quality 30, size 4170 bytes, DL time (28.8): 0:01, savings 38%.

    Modem speed    Download time
    14.4k          75.42 seconds
    28.8k          38.71 seconds
    56k            20.68 seconds
    ISDN (128k)    10.26 seconds
    T1 (1.44 MB)   2.73 seconds

Suggestions in the report:
• You may want to break this Web page into several smaller pages.
• Click on an image's file size to reduce it with GIFBot.
• Adding HEIGHT and WIDTH attributes to your images will help browsers display your page sooner.
• Adding WIDTH attributes to your TABLE tags will help browsers display your page sooner.
• Your page's overall rating was lowered one level because of HTML problems.
• Correct the HTML problems to get our highest rating.

Security test ??

Process

Software Qualities and Process
• Qualities cannot be added after development
  - Quality results from a set of inter-dependent activities
  - Analysis and testing are crucial but far from sufficient
• Testing is not a phase, but a lifestyle
  - Testing and analysis activities occur from early in requirements engineering through delivery and subsequent evolution
  - Quality depends on every part of the software process
• An essential feature of software processes is that software test and analysis are thoroughly integrated and not an afterthought

The Quality Process
• Quality process: set of activities and responsibilities
  - focused primarily on ensuring adequate dependability
  - also concerned with project schedule or with product usability
• The quality process provides a framework for
  - selecting and arranging activities
  - considering interactions and trade-offs with other important goals

Interactions and tradeoffs
Example: high dependability vs. time to market
• Mass market products:
  - Better to achieve a reasonably high degree of dependability on a tight schedule than to achieve ultra-high dependability on a much longer schedule
• Critical medical devices:
  - Better to achieve ultra-high dependability on a much longer schedule than a reasonably high degree of dependability on a tight schedule

Properties of the Quality Process
• Completeness: appropriate activities are planned to detect each important class of faults
• Timeliness: faults are detected at a point of high leverage (as early as possible)
• Cost-effectiveness: activities are chosen depending on cost and effectiveness
  - cost must be considered over the whole development cycle and product life
  - the dominant factor is likely to be the cost of repeating an activity through many change cycles
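The point about repeated activities can be made concrete with a small cost comparison. This is a minimal sketch in Python, assuming a simple linear cost model (a one-off setup cost plus a fixed cost per change cycle); the model, the function names, and all numbers are hypothetical illustrations and do not come from the slides.

```python
import math


def total_cost(setup, per_cycle, cycles):
    """Cost of an activity repeated over a number of change cycles."""
    return setup + per_cycle * cycles


def break_even_cycles(setup_a, per_cycle_a, setup_b, per_cycle_b):
    """Smallest number of cycles at which option B costs no more than option A.

    Returns None if B never catches up with A.
    """
    if per_cycle_b >= per_cycle_a:
        return 0 if setup_b <= setup_a else None
    # setup_b + per_cycle_b * n <= setup_a + per_cycle_a * n, solved for n
    n = (setup_b - setup_a) / (per_cycle_a - per_cycle_b)
    return max(0, math.ceil(n))


if __name__ == "__main__":
    # Hypothetical person-hours: manual regression run (A) vs. automated suite (B).
    manual = (0, 4)        # no setup, 4 hours per cycle
    automated = (40, 0.5)  # 40 hours to build, 0.5 hours per cycle
    n = break_even_cycles(*manual, *automated)
    print(n, total_cost(*manual, n), total_cost(*automated, n))
```

Under these assumed numbers the more expensive setup pays off only after a dozen change cycles, which is why the number of expected cycles dominates the choice.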

Planning and Monitoring
• The quality process
  - balances several activities across the whole development process
  - selects and arranges them to be as cost-effective as possible
  - improves early visibility
• Quality goals can be achieved only through careful planning
• Planning is integral to the quality process

Process Visibility
• A process is visible to the extent that one can answer the question “How does our progress compare to our plan?”
  - Are we on schedule? How far ahead or behind?
• The quality process has not achieved adequate visibility if one cannot gain strong confidence in the quality of the software system before it reaches final testing
  - quality activities are usually placed as early as possible
  - design test cases at the earliest opportunity (not “just in time”)
  - use analysis techniques on software artifacts produced before actual code

A&T (Analysis and Test) Strategy
Identifies company- or project-wide standards that must be satisfied:
• procedures required for obtaining quality certificates
• techniques and tools that must be used
• documents that must be produced

Quality plan
A comprehensive description of the quality process that includes:
• objectives and scope of quality activities
• documents and other items that must be available
• items to be tested
• features to be tested and not to be tested
• analysis and test activities
• staff involved in quality
• constraints
• pass and fail criteria
• schedule
• deliverables
• hardware and software requirements
• risks and contingencies

Improving the Process
• Long-lasting errors are common
• It is important to structure the process for
  - identifying the most critical persistent faults
  - tracking them to frequent errors
  - adjusting the development and quality processes to eliminate errors
• Feedback mechanisms are the main ingredient of the quality process for identifying and removing errors

Organizational factors
• Different teams for development and quality?
  - separate development and quality teams are common in large organizations
  - indistinguishable roles are postulated by some methodologies (extreme programming)
• Different roles for development and quality?
  - test designer is a specific role in many organizations
  - mobility of people and roles, by rotating engineers over development and testing tasks among different projects, is a possible option

Example of Allocation of Responsibilities
Allocating tasks and responsibilities is a complex job. We can allocate:
• Unit testing to the development team (requires detailed knowledge of the code)
  - but the quality team may control the results (structural coverage)
• Integration, system and acceptance testing to the quality team
  - but the development team may produce scaffolding and oracles
• Inspection and walk-through to mixed teams
• Regression testing to quality and maintenance teams
• Process improvement related activities to external specialists interacting with all teams

Product vs. process improvement
• Product improvement:
  - Fault is detected (by inspection, testing, user report, ...)
  - Fault is diagnosed and repaired
• Process improvement:
  - Faults are detected (and maybe repaired)
  - Fault record is analyzed to tune the process

Fault analysis
• What are the faults?
  - Categorize by kind (memory leak, interface error, misfeature, etc.)
  - And by severity
• When did they occur? And when were they found?
  - Coding? Design? Requirements?
• Why did they occur?
  - Look for “root causes”
• How could they be prevented?

Categorizing faults
• There is no “right” categorization
  - May depend on design style, implementation language, process and documents, ...
  - Should probably be revised occasionally
• Goal is enough precision for “Pareto” analysis (80/20 rule), considering severity and cost
• Categorization needn’t be perfect or painful, but keeping records is essential

Fault severity
• Typical breakdown (of failures):
  - Critical: product is unusable
  - Severe: product feature cannot be used; no workaround
  - Moderate: product feature can be used only with a workaround (loss of efficiency, reliability, or significant loss of convenience)
  - Cosmetic or minor inconvenience
• Cost may be distinct from severity

80/20 rule (a.k.a. Pareto analysis)
• Identify one or two “dominant” fault categories
  - Considering severity, cost, and frequency
  - Further problem analysis is limited to these
• Categories may “level” over time
  - A good time for rethinking the categories
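A Pareto analysis over a fault record can be sketched in a few lines. This is a minimal sketch in Python, assuming each recorded fault is a (category, severity) pair; the numeric severity weights, the function name `pareto`, and the sample record are hypothetical, since the slides name the severity levels but assign no numbers to them.

```python
from collections import Counter

# Hypothetical weights: the severity levels come from the slides, the numbers do not.
SEVERITY_WEIGHT = {"critical": 8, "severe": 4, "moderate": 2, "cosmetic": 1}


def pareto(faults, threshold=0.8):
    """Return the top categories whose combined weighted share reaches the threshold."""
    totals = Counter()
    for category, severity in faults:
        totals[category] += SEVERITY_WEIGHT[severity]
    grand_total = sum(totals.values())
    dominant, covered = [], 0
    for category, weight in totals.most_common():
        if covered / grand_total >= threshold:
            break
        dominant.append((category, weight))
        covered += weight
    return dominant


# Hypothetical fault record, for illustration only.
example_faults = [
    ("interface error", "severe"),
    ("interface error", "critical"),
    ("interface error", "moderate"),
    ("memory leak", "critical"),
    ("memory leak", "severe"),
    ("misfeature", "cosmetic"),
    ("misfeature", "moderate"),
    ("message typo", "cosmetic"),
]

if __name__ == "__main__":
    print(pareto(example_faults))
```

With this sample record, two of the four categories account for more than 80% of the weighted total, so further root-cause analysis would be limited to those two. Weighting by cost instead of severity only requires a different weight table.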

Test Documentation
• Must be an organization standard
• Depends on organization (size, turnover, ...) and type of software (criticality, average life, complexity, number of versions, ...)
• It must include at least:
  - test suite documentation
  - test case documentation

Test Documentation (cont.)
• Documentation of test suites:
  - software tested
  - version
  - goal
  - overall results
  - author
• Documentation of test cases:
  - goal
  - “environment” (driver, stub, oracle)
  - input
  - expected output
  - actual output
  - result
  - observations
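The documentation fields listed above map naturally onto a pair of record types. This is a minimal sketch in Python, assuming plain string fields; the class names `CaseRecord` and `SuiteRecord` are hypothetical, and deriving the result by comparing expected and actual output is a simplifying assumption (a real oracle may be more elaborate).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CaseRecord:
    """Documentation of one test case, one field per item on the slide."""
    goal: str
    environment: str          # driver, stub, oracle
    test_input: str
    expected_output: str
    actual_output: str = ""
    observations: str = ""

    @property
    def result(self) -> str:
        # Assumption: the case passes when actual and expected outputs are equal.
        return "pass" if self.actual_output == self.expected_output else "fail"


@dataclass
class SuiteRecord:
    """Documentation of a test suite."""
    software_tested: str
    version: str
    goal: str
    author: str
    cases: List[CaseRecord] = field(default_factory=list)

    def overall_results(self) -> Dict[str, int]:
        passed = sum(1 for case in self.cases if case.result == "pass")
        return {"passed": passed, "failed": len(self.cases) - passed}


if __name__ == "__main__":
    # Hypothetical suite, for illustration only.
    suite = SuiteRecord("cart service", "1.2", "checkout regression", "test designer")
    suite.cases.append(CaseRecord("empty cart total", "stubbed DB", "[]", "0", "0"))
    suite.cases.append(CaseRecord("negative quantity", "stubbed DB", "[-1]", "error", "0",
                                  observations="input not validated"))
    print(suite.overall_results())
```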

Tools

Tools
• Test process management: Validator/Req, Test Director, Rational Robot, Mercury WinRunner, Segue SilkTest, Compuware QA Run
• Usability: WebSAT, Bobby, Page Valet, W3C HTML Validator Service, HTML Authoring Service, Xenu, Alert LinkRunner, HTML Link Validator, HTML Validator 5.0, Web Link Validator

Tools
• Performance: Web Performance Trainer, Httpload, Webserver Stress Tool, Http/s Load, OpenSTA, JMeter
• Coverage: Logiscope TestChecker, Deep Cover, JCover, LDRA TestBed, Attol Coverage (Rational)
• Scaffolding: ATTOL Unit Test (now Rational), Cantata++, JUnit, Cactus, HttpUnit
• Memory leaks: Purify (Rational), Sentinel
• Static analysis: Logiscope Audit, LDRA TestBed


Thank you!
Luciano Baresi
DEI - Politecnico di Milano
Piazza L. da Vinci, 32 - 20133 Milano (Italy)
tel: 02 2399 3638
email: baresi@elet.polimi.it
www.elet.polimi.it/~baresi

Homework
• Study and analyze one of the topics presented during these days
  - Some particular methodologies
  - Some interesting examples
  - …
• Imagine you are a test manager and try to identify the activities that should be carried out to test a Web application
  - For example, www.amazon.com
  - Consider what, how, and when
  - Try to estimate costs in terms of person-months
• Constraints
  - 10 days
  - 5 pages (at most)