
www.site.uottawa.ca/~elsaddik
SEG 3210 User Interface Design & Implementation
Prof. Dr.-Ing. Abdulmotaleb El Saddik
University of Ottawa (SITE 5-037)
(613) 562-5800 x 6277
elsaddik@site.uottawa.ca
abed@mcrlab.uottawa.ca
http://www.site.uottawa.ca/~elsaddik/
Unit B-Introduction (c) elsaddik

Unit B: UI Evaluation
1. Objectives of User Interface Evaluation
2. Where Evaluation Fits in the Development Process
3. A Preliminary Case Study: Hotel Reservations
4. Overview of Interface Evaluation Methods
5. Details: Heuristic Evaluation
6. Malfunction Analysis
7. Details: Videotaped Evaluation
8. Details: Experiments
9. Details: Usability Engineering
10. Details: Cognitive Walkthroughs
11. Key Points to Review

1. Objectives of User Interface Evaluation
Key objective of both UI design and evaluation:
• Minimize malfunctions
Key reason for focusing on evaluation:
• Without it, the designer would be working “blindfolded”
• Designers wouldn’t really know whether they are solving customers’ problems in the most productive way

1. Objectives of User Interface Evaluation
Questions answered by various evaluation techniques:
1. What is the user’s real task?
  • Prevent later malfunctions by doing evaluation as part of requirements analysis
  • Present and work with a UI to help formulate the requirements
  • Inappropriate tasks/requirements are a major source of malfunctions
2. What problems do or might users experience with the UI?
  • Directly find malfunctions
3. Which of several alternative UI’s is better?
  • Pick the version that leads to fewer malfunctions
4. Has the UI met usability targets?
  • Ensure that malfunction counts are sufficiently low
5. Does the UI conform to standards?
  • Leverage of collective wisdom to reduce malfunctions

1. Objectives of User Interface Evaluation
But, in order for evaluation to give feedback to designers, we must understand why a malfunction occurs.
Malfunction analysis:
• Determine why a malfunction occurs
• Determine how to eliminate malfunctions
We will discuss this while working through this unit.

2. Where Evaluation Fits in the Development Process
Throughout the lifecycle!
• During rough sketching or prototyping
• During iterative design
The more evaluation the better
• Especially when users are involved
Formative evaluation:
• When designing and maintaining software that we are developing
Summative evaluation:
• When judging a finished product developed by someone else

3. A Preliminary Case Study: Hotel Reservations
UI Evaluation performed for Forte Travelodge
Performed in a special usability lab
Aims:
• Identify and eliminate malfunctions
  • Hence make system easier to use
  • Avoid business difficulties caused by these malfunctions
• Develop improved training material and documentation
  • Avert potential malfunctions by teaching users how to avoid them
Setup of IBM usability lab:
• Resembles TV studio
  • Microphones and video equipment
  • One-way mirror
• Technicians, observers sit on one side
• Users sit on other side in realistic environment
  • User environment resembles reception desk
  • Non-threatening

3. A Preliminary Case Study: Hotel Reservations
Aspects of system to be evaluated:
• How quickly can a booking be made?
  • (while operator is on telephone)
• Is each screen productive to use?
• Are help and error messages effective?
• Can non-computer-literate operators use the system?
• Is complexity minimized?
• Are training and documentation effective?

3. A Preliminary Case Study: Hotel Reservations
Procedure:
• 15 common task scenarios developed:
  • Among others: basic registration, cancellation, request for specific room, extension of existing stay, etc.
• Four days of testing with multiple users performing various sets of tasks
• Users were told evaluation is of system, not them
• All actions were recorded
• Debriefing sessions held
• Videos then analyzed for malfunctions
  • 62 identified
• Priorities:
  • Navigation speed needs improvement
  • Screen titles and formats need tuning
  • Hard to refer to documentation
  • Physical difficulties with telephone headsets and furniture

3. A Preliminary Case Study: Hotel Reservations
Results:
• Higher productivity of booking staff
  • tasks completed more quickly
  • guest requirements better met
• Training costs kept low
• Morale kept high
• More customers booked by phone
  • 14500 --> 27000 per week

4. Overview of Interface Evaluation Methods
• Three types of methods
  • Passive evaluation
  • Active evaluation
  • Predictive evaluation / usability inspections
• All types of methods useful for optimal results
  • Used in parallel
• All attempt to prevent malfunctions
• Before trying methods, do pilot studies first

4. Passive evaluation
• Usage of software is monitored
• Performed while prototyping, in alpha test and later
• Does not actively seek malfunctions
  • only finds them when they happen to occur
  • infrequent (but possibly severe) malfunctions may not be found
• Generally requires realistic use of a system
  • Users become frustrated with malfunctions
Gathering Information:
a) Problem report monitoring:
  • Users should have an easy way to register their frustration / suggestions
  • Best if integrated with software

4. Passive evaluation: Gathering Information
b) Automatic software logs
  • Can gather much data about usage
    • command frequency
    • error frequency and pre-error patterns
    • undone operations (a sign of malfunctions)
  • Privacy is a concern
  • System must be designed for testability (DFT)
  • Logs can be taken of:
    • just keystrokes, mouse clicks
    • full details of interaction
      • The latter make accurate playback easier
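The kinds of data listed for automatic logs can be sketched in a few lines of Python. This is only an illustration: the log format (timestamp, event kind, command name), the event kinds and the sample entries are all assumptions made up for the example, not part of the course material.

```python
from collections import Counter

# Hypothetical log format: (timestamp_seconds, event_kind, command_name).
# event_kind is one of "command", "error", "undo" (names invented for this sketch).
LOG = [
    (0.0, "command", "open"),
    (2.1, "command", "paste"),
    (2.9, "error", "paste"),
    (4.0, "command", "paste"),
    (5.5, "undo", "paste"),
    (7.2, "command", "save"),
]

def summarize(log):
    """Return command frequency, error frequency, undo count and pre-error commands."""
    commands = Counter(name for _, kind, name in log if kind == "command")
    errors = Counter(name for _, kind, name in log if kind == "error")
    undos = sum(1 for _, kind, _ in log if kind == "undo")
    # Pre-error pattern (simplified): the command issued just before each error.
    pre_error = Counter()
    last_command = None
    for _, kind, name in log:
        if kind == "command":
            last_command = name
        elif kind == "error" and last_command is not None:
            pre_error[last_command] += 1
    return commands, errors, undos, pre_error

commands, errors, undos, pre_error = summarize(LOG)
print(commands.most_common())
print(dict(errors), undos, dict(pre_error))
```

A real logger would also have to address the privacy concern noted above, for example by not recording typed text.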

4. Passive evaluation: Gathering Information
c) Questionnaires / surveys
  • Useful to obtain statistical data from large numbers of users
  • Proper statistical means are needed to analyze results
  • Gathers subjective data about importance of malfunction
    • automated logs omit importance
    • less frequent malfunctions may be more important
    • users can prioritize needed improvements
  • Limit on number of questions
  • Very hard to phrase questions well
  • Questions can be closed- or open-ended

4. Active evaluation
• Actively study specific activities performed by users
• Performed when prototyping and later
Gathering Information:
d) Experiments & usability engineering
  • Prove hypotheses about measurable attributes of one or more UI’s
    • e.g. speed/learning/accuracy/frustration
    • In usability engineering test against preset targets
  • Can be expensive
  • Knowledge of statistics needed
  • Hard to control for all variables
e) Observation sessions
  • Also called ‘interpretive evaluation’
  • Simple observation or cooperative evaluation
  • Described in detail later

4. Predictive evaluation
• Studies of system by experts rather than users
• Performed when UI is specified and later
  • useful even before prototype developed
• Can eliminate many malfunctions before users ever see software
• Also called ‘usability inspections’
Gathering Information:
f) Heuristic evaluation
  • Based on a UI design principle document
  • Analyze whether each guideline is adhered to in the context of the task and users
  • Can also look at adherence to standards
g) Cognitive walkthroughs
  • Step-by-step analysis of:
    • steps in task being performed
    • goals users form to perform these tasks
    • how system leads user through tasks

Summary of evaluation techniques
Technique: When to use
a) Problem reporting: Always
b) Automatic logs: In any moderately complex system and whenever there are large numbers of commands
c) Questionnaires: Whenever there is a large number of users
d) Experiments & Usability Engineering: In special cases where it is hard to choose between alternatives, or when fine tuning
e) Observation sessions: Almost always, especially when user has to interact with a client while using the system
f) Heuristic evaluation: Always
g) Cognitive Walkthrough: When usability must be optimized

Comparison of key questions to evaluation techniques
Techniques (columns): A Prob, B Log, C Q?, D Exp, E Obs, F Heu, G Wlk
Key questions (rows):
• What is task?
• What are current malfunctions?
• Which UI is better?
• Has UI met targets?
• Does UI conform to standards?
Ratings, in the order extracted from the slide (column alignment lost): ++ + + + + ++ ++ ++ + + + ++ + 0 + ++ ++
Legend: ++ : Very good technique; + : OK technique; 0 : possible technique

5. Details: Heuristic Evaluation
• A type of predictive evaluation
  • Use HCI experts as reviewers instead of users
• Benefits of predictive evaluation:
  • The experts know what problems to look for
  • Can be done before system is built
  • Experts give prescriptive feedback
• Important points about predictive evaluation:
  • Reviewers should be independent of designers
  • Reviewers should have experience in both the application domain and HCI
  • Include several experts to avoid bias
  • Experts must know classes of users
  • Beware: Novices can do some very bizarre things that experts may not anticipate

5. Details: Heuristic Evaluation
• Planning for heuristic evaluation
  • Based on UI guidelines (heuristics about what is best)
  • Multiple passes needed
    • one pass to look for each kind of problem
    • passes to follow different routes through screens and dialogues (i.e. different tasks)
  • Sessions of 1-2 hours are good
  • 1 expert evaluator finds only 33% of problems
  • 5 evaluators needed to find 75% of problems
  • 15 more to find about 99%
• Example heuristics for heuristic evaluation (many more to be covered later in course)
  • Use simple and natural dialogue
  • Speak the user’s language
  • Minimize memory load
  • Be consistent
  • Provide feedback
  • Provide clearly marked exits
  • Provide shortcuts
  • Provide good error messages
  • Prevent errors

6. Malfunction Analysis
A disciplined approach to analyzing malfunctions
• Provides feedback into the redesign process
1. Play protocol, searching for malfunctions
2. Answer four distinct questions:
  • Q1. How is the malfunction manifested?
    • What do you notice and who noticed it?
  • Q2. At what stage in the interaction is it occurring?
    • Goal forming, action decision, action execution, interpretation of results
  • Q3. At what level of the user interface is it occurring?
    • Physical element level to task level
  • Q4. Why is it occurring?
    • What is its root cause?
3. List and prioritize possible cures

Q1. How is the malfunction manifested?
a) Malfunctions detected by the system (easiest to detect)
  • omission of an argument
  • incorrect date format
  Cure:
  • Better prompts, consistency, visible examples, more forgiving of alternatives
b) Malfunctions detected by the user during operation
  • taking wrong path in menu hierarchy
  • not finding required help
  • not being able to perform a certain action
  • not being able to tell which state system is in
  Cure:
  • Improve functionality, feedback, clarity, simplicity

Q1. How is the malfunction manifested?
c) Malfunctions undetected (until later)
  • output produced is wrong due to wrong inputs
  • unnecessary work performed
  Cure:
  • Improve feedback indicating consequences of input; simplify
d) Inefficiencies
  • excessive response time
  • excessive think time
  • unnecessarily long command sequences
  • unnecessary repetitions
  • complex operations that require use of reference
  Cure:
  • Simplify, speed system up

Q2. At What Stage in the Interaction Does the Malfunction Occur?
a) When the user decides on next goal (forms an intent to do an inappropriate thing)
  • decides to empty a field because user thinks it is unimportant (when it is important)
  • decides to charge default exchange rate (when should obtain current exchange rate)
  Cure:
  • Lead user through task better; better feedback; better training
b) When the user specifies the action (action does not match the goal)
  • deletes the record instead of emptying a field
  • charges reciprocal of exchange rate
  Cure:
  • Improve clarity, feedback, prompts, conceptual model

Q2. At What Stage in the Interaction Does the Malfunction Occur?
c) When the system executes the action
  • Defects in functionality
  Cure:
  • Fix functionality in normal way
d) When the user interprets the resulting system state
  • thinks bank account has been debited when it has not
  • thinks system has ‘hung’ when it has not
  • thinks some data must be entered when it is the default
  • cannot understand resulting error message
  Cure:
  • Better feedback, better conceptual model

Q3. At Which Level Does the Malfunction Occur?
a) Task level (task and goals not supported)
  • What the user wants to do cannot be done by the system
  • Functionality is not provided
  Cure:
  • Add functionality
b) Conceptual level (user has wrong mental model; does not understand intended conceptual model)
  • thinks that money is being deducted from bank account when it is being charged to a credit card
  • thinks that dragging a file to the desktop means it is no longer on the disk
  • thinks that dragging a disk to the trash can icon deletes disk contents
  Cure:
  • Make conceptual model clearer; improve metaphors

Q3. At Which Level Does the Malfunction Occur?
c) Interaction style level (system-wide problem)
  • does not know how to pull down a menu
  • scrolls a page instead of a line
  • goes to next screen instead of scrolling
  • retypes command after an error instead of editing it
  Cure:
  • Make operation of the interface more intuitive and consistent
d) Interaction element level (specific detail inappropriate)
  • selects wrong button because label is misinterpreted
  • specifies invalid command syntax
  • specifies wrong code for option
  Cure:
  • More attention to details of the interface, simplification

Q3. At Which Level Does the Malfunction Occur?
e) Physical element level (physical execution incorrect)
  • presses wrong key accidentally
  • clicks on wrong pixel in image
  • out-types machine (actions lost)
  • types ahead when system is computing; keystrokes later applied to wrong action
  Cure:
  • Defenses to protect user from consequences; better hardware design; fix bugs in code

Q4. Why Does the Malfunction Occur?
Lack of (on the part of the user):
• Motivation:
  • Poor job satisfaction
• Attention:
  • User is pre-occupied with other things.
• Input information processing:
  • No feedback provided to tell user what is going on
  • or cues provided by the system are not recognized
  • or cues are misinterpreted
  Cures: Clearer, more consistent feedback
• Discrimination:
  • user is unable to tell certain things apart
  • e.g. red/green colour discrimination
  • e.g. two icons that are similar
  Cures: Improved expression of information

Q4. Why Does the Malfunction Occur?
• Physical coordination:
  • e.g. wrong item selected because of difficulty positioning cursor with mouse.
  Cures: Alternate interaction mechanisms, better feedback
• Recall:
  • User did not remember command, syntax, etc.
  Cures: Better mnemonics, online help, quick lookup mechanisms, command completion
• Knowledge / lack of learning:
  • User does not have business or software knowledge to make right choice.

Q4. Why Does the Malfunction Occur?
Learning difficulties that cause malfunctions:
• Learning is difficult
  • users get frustrated
  • learning takes time; can be hard to apply
• Learners make ad-hoc interpretations
  • they may not recognize their problem
  • they may falsely think they have a problem
• Learners generalize from what they know
  • they assume computers work like manual methods
  • they assume consistency
• Learners have trouble following directions
  • they often ignore them even if they see them
  • they do not easily understand them

Q4. Why Does the Malfunction Occur?
• Problems and features interact
  • they do not see that one problem can cause another
• Prerequisites and side-effects confuse learners
• Help facilities do not always help
  • they do not know what to ask for
  • too much detail is often provided
• Other causes of malfunctions:
  • Excessive resource demands
  • External events (e.g. noise)
  • Misleading or inadequate training
  • Unrealistic task definitions
  • Intrinsic human variability
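One way to keep the answers to Q1 through Q4 organized while reviewing a session is a small record type, sketched below in Python. The enumeration values follow the categories in this unit, but the class names, the field names, the way each example malfunction is classified and the ranking rule (count how many malfunctions each cure would address) are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum

class Manifestation(Enum):  # Q1: how is the malfunction manifested?
    DETECTED_BY_SYSTEM = "detected by the system"
    DETECTED_BY_USER = "detected by the user during operation"
    UNDETECTED = "undetected until later"
    INEFFICIENCY = "inefficiency"

class Stage(Enum):  # Q2: at what stage in the interaction?
    GOAL_FORMING = "goal forming"
    ACTION_DECISION = "action decision"
    ACTION_EXECUTION = "action execution"
    INTERPRETATION = "interpretation of results"

class Level(Enum):  # Q3: at what level of the user interface?
    TASK = "task"
    CONCEPTUAL = "conceptual"
    INTERACTION_STYLE = "interaction style"
    INTERACTION_ELEMENT = "interaction element"
    PHYSICAL_ELEMENT = "physical element"

@dataclass(frozen=True)
class Malfunction:
    description: str
    manifestation: Manifestation
    stage: Stage
    level: Level
    root_cause: str  # Q4: why is it occurring?
    cure: str        # candidate cure proposed by the analyst

def prioritize(malfunctions):
    """Rank candidate cures by the number of observed malfunctions they address."""
    return Counter(m.cure for m in malfunctions).most_common()

# Classifications below are illustrative guesses, not taken from the slides.
observed = [
    Malfunction("deletes the record instead of emptying a field",
                Manifestation.UNDETECTED, Stage.ACTION_DECISION,
                Level.INTERACTION_ELEMENT, "recall", "clearer prompts"),
    Malfunction("thinks system has hung when it has not",
                Manifestation.DETECTED_BY_USER, Stage.INTERPRETATION,
                Level.CONCEPTUAL, "input information processing", "better feedback"),
    Malfunction("cannot tell which state system is in",
                Manifestation.DETECTED_BY_USER, Stage.INTERPRETATION,
                Level.INTERACTION_STYLE, "input information processing", "better feedback"),
]
print(prioritize(observed))
```

Counting is only one possible priority rule; severity or cost of the cure could be weighed in as well.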

7. Details: Videotaped Evaluation
A software engineer studies users who are actively using the user interface
• To observe what problems they have
• Rather than to measure numbers
• The sessions are videotaped
• Can be done in user’s environment
Activities of the user:
• Performs pre-defined tasks
  • With or without detailed instructions on how to perform them
• Preferably talks to herself as if alone in a room
  • Yields ‘think-aloud protocol’
• This process is called ‘co-operative’ evaluation when the software engineer and user talk to each other

7. Details: Videotaped Evaluation
The importance of video:
• Without it, ‘you see what you want to see’
  • You interpret what you see based on your mental model
• In the ‘heat of the moment’ you miss many things
• Minor details (e.g. body language) captured
• You can repeatedly analyze, looking for different problems
Tips for using video:
• Several cameras are useful
• Software is available to help analyse video by dividing into segments and labeling the segments
• Evaluation can be time consuming so plan it carefully

Steps for videotaped evaluation
1. Select 6 to 8 representative users per user class
  • E.g. client, salesperson, manager, accounts receivable
2. Invite them to individual sessions
  • Sessions should last 30-90 minutes
  • Schedule 4-6 per day
3. If system involves user's clients in the interaction:
  • Have users bring important clients
  • or have staff pretend to be clients
4. Select facilitators/observers and notetakers
5. Prepare tasks:
  • Select the most commonly used tasks plus a few less important tasks
  • Write task instructions for users
  • Estimate the time it will take to complete each task plus extra time for discussion
6. Prepare notebook or form for organizing notes

Steps for videotaped evaluation
7. Set up and test equipment
  • Hardware on which to run system
  • Audio or video recorder
  • Software logs
8. Do a dry run (pilot study)!
9. At the start of an observation session
  • Explain:
    • nature of project
    • anticipated user contributions
    • why user's views are important
    • focus is on evaluating the user interface, not evaluating the user
    • all notes, logs, etc., are confidential
    • user can withdraw at any time
    • usage of devices
    • relax!
  • Sign informed consent form:
    • very important

Steps for videotaped evaluation
10. Start user verbalizing as they perform each task (thinking aloud)
  • For co-operative evaluation, software engineer also verbalizes
  • Appropriate questions to be posed by the observing software engineer:
Question: Malfunction if
• What do you want to do?: They do not know; the system cannot do what they want.
• What do you think would happen if...?: They do not know; they give wrong answer.
• What do you think the system has done?: They do not know; they give wrong answer.
• What do you think this information is telling you?: They do not know; they give wrong answer.
• Why did the system do that?: They do not know; they give wrong answer.
• What were you expecting to happen?: They had no expectation; they were expecting something else.

Steps for videotaped evaluation
11. Hold a wrap-up interview (de-briefing)
  • What were the most significant problems?
  • What was most difficult to learn?
  • Etc.
12. Analyze the videotape to find malfunctions
Lab exercise:
• Videotaped evaluation of a software product

8. Details: Experiments (Details in Experimentation)
1. Pick a set of subjects (users)
  • A good mix to avoid biases
  • A sufficient number to get statistical significance (so that random happenings do not affect results)
2. Pick variables to test
  • Independent: Manipulated to produce different conditions
    • Should not have too many
    • They should not affect each other too much
    • Make sure there are no hidden variables
  • Dependent: Measured value affected by independent
3. Develop a hypothesis
  • A prediction of the outcome
  • Aim of experiment is to show this is correct
  • E.g. Some change in an independent variable causes some change in a dependent variable

8. Details: Experiments (Details in Experimentation)
4. Design experiments to test hypotheses
  • Create a null (inverse) hypothesis
    • i.e. a change in independent variable causes no change in dependent variable
  • Disprove null hypothesis!
  • Experiment design is difficult.
5. Conduct experiments
6. Statistically analyze results to draw conclusions
  • e.g. using ‘t-tests’
  • conclusions will be correct within a margin of error 19 times out of 20
7. Decide what action to take based on conclusions
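As a rough illustration of step 6, the sketch below computes a t statistic for two independent groups using only the Python standard library. The slides say only "t-tests"; Welch's unequal-variance form is chosen here as an assumption, and the task times are invented numbers, not experimental data. "19 times out of 20" corresponds to the usual 5% significance level.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the difference in means
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Invented task completion times in seconds for two hypothetical UI variants.
ui_a = [12.0, 14.0, 13.0, 15.0, 16.0]
ui_b = [10.0, 11.0, 9.0, 12.0, 8.0]

t, df = welch_t(ui_a, ui_b)
# Two-tailed 5% critical value for 8 degrees of freedom, from a standard t table.
T_CRITICAL_DF8 = 2.306
reject_null = abs(t) > T_CRITICAL_DF8
print(t, df, reject_null)
```

With these made-up numbers the null hypothesis (no difference between the two variants) would be rejected; with real data the critical value has to be looked up for the degrees of freedom actually obtained.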

Two major types of UI experimentation
• Traditional
  • Micro level
    • (e.g. testing which colour is best for a certain icon)
  • Only done when usage will be very high or you are a university researcher!
• Usability engineering
  • Tests significant part of system
  • Relaxes ‘scientific constraints’
    • Because we cannot control all variables
  • Used to prove hypotheses that certain usability goals have been met
  • More later
You should understand experimentation
• Much UI research is experimental
• You have to interpret and apply others’ results

Example Experiment: Text Selection Schemes
Early GUI research at Xerox on the Star Workstation
• Traditional experiments
• Results were used to develop Macintosh
Goal of study:
• Evaluate how to select text using the mouse
Steps:
1. Subjects
  • Six groups of four
  • In each group, only two are experienced in mouse usage

Example Experiment: Text Selection Schemes
2. Variables
  • Independent:
    • Selection schemes:
      • 6 strategically chosen patterns involving
        --> Which mouse button (if any) could be double/triple/quad clicked to select character/word/sentence
        --> Which mouse button could be dragged through text
        --> Which mouse button could adjust the start/end of a selection
  • Dependent:
    • Selection time
    • Selection errors
3. Hypothesis
  • Some scheme is better than all others

Example Experiment: Text Selection Schemes
4. Detailed experiment design
  • Null hypothesis: No difference in schemes
  • Assign a selection scheme to each group
  • Train the group in their scheme
  • Measure task time and errors
    • Each subject repeated 6 times
    • A total of 24 tests per scheme
5. Conduct Experiment
6. Analysis
  • F-test used - scheme F found to be significantly better
    • Point and draw through with left mouse
    • Adjust with middle mouse
7. Action
  • Try another combination similar to scheme F
  • Left mouse can be double-clicked
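The F-test in step 6 compares variation between groups with variation within groups. The sketch below shows a one-way analysis-of-variance F statistic in Python; whether the original study used exactly this form is an assumption, and the selection times are invented placeholders with three small groups rather than the six groups and 24 tests per scheme of the actual experiment.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual measurements around their own group mean.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Invented selection times in seconds for three hypothetical schemes.
times = {
    "scheme A": [4.0, 5.0, 6.0],
    "scheme B": [5.0, 6.0, 7.0],
    "scheme F": [1.0, 2.0, 3.0],
}
f, df_between, df_within = one_way_anova_f(list(times.values()))
print(f, df_between, df_within)
```

A large F suggests the null hypothesis (no difference in schemes) is unlikely; the value is compared against an F table for the two degrees of freedom. The F-test alone does not say which scheme is best, so the group means still have to be inspected.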

Questions to ask when reviewing experiments
Not all published experiments are done well!
• Were users adequately prepared?
• Were tasks complex enough to allow adequate evaluation?
• Did the task become boring to the users?
• Although effects are found to be statistically significant, does that matter?
  • Maybe not if a particular task is rarely performed
• Are there any other possible interpretations?
  • Maybe users have learned to do better at task B because they did task A first!
• Are dependent variables consistent?
  • e.g. users may prefer slower method
• Can results be generalized?
  • Maybe selection results also apply to graphics, maybe not

9. Details: Usability Engineering
“A process whereby the usability of a product is specified quantitatively, and in advance. Then as the product is built it can be demonstrated that it does or does not reach the required levels of usability”
Partly engineering:
• Design-evaluate-redesign
Partly science:
• Experimentation methodology
  • Although not with full controls

Usability Engineering Steps
1. Pick ‘benchmark’ tasks
  • Simple tasks that can be repeated and performance measured
2. Pick usability metrics
3. Set planned levels of usability
4. Design initial interface using usability guidelines
5. Analyze impact of design(s) using experiments
  • i.e. a new batch of users is measured running the benchmark tasks
6. If goals are achieved, stop
7. Incorporate user derived feedback in design
8. Go back to step 5
Major problem with usability engineering:
• Benchmark tasks are rarely performed in a truly natural environment
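The design-evaluate-redesign loop in steps 3 to 8 can be sketched as a check of measured values against planned levels. Everything specific below is hypothetical: the metric names, the target numbers and the per-iteration results are invented for illustration, and each list entry merely stands in for one round of benchmark measurements.

```python
# Planned levels of usability (step 3). Names and numbers are invented.
# "max" means the measured value must not exceed the level; "min" the opposite.
TARGETS = {
    "time_to_complete_s": ("max", 90.0),
    "undo_count": ("max", 2),
    "success_ratio": ("min", 0.9),
}

def unmet_targets(measured, targets=TARGETS):
    """Return the names of the metrics that miss their planned level."""
    unmet = []
    for name, (kind, level) in targets.items():
        value = measured[name]
        ok = value <= level if kind == "max" else value >= level
        if not ok:
            unmet.append(name)
    return unmet

# Each entry stands for the benchmark results of one design iteration (step 5).
iterations = [
    {"time_to_complete_s": 120.0, "undo_count": 4, "success_ratio": 0.80},
    {"time_to_complete_s": 95.0, "undo_count": 2, "success_ratio": 0.92},
    {"time_to_complete_s": 85.0, "undo_count": 1, "success_ratio": 0.95},
]

def first_passing_iteration(results):
    """Return the 1-based iteration at which all targets are met, or None."""
    for i, measured in enumerate(results, start=1):
        if not unmet_targets(measured):
            return i  # step 6: goals achieved, stop
        # steps 7-8: incorporate user feedback, redesign, then measure again
    return None

print(unmet_targets(iterations[1]), first_passing_iteration(iterations))
```

In practice each iteration requires a new batch of users, so the number of passes through the loop is a real cost.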

Some suitable usability metrics
   • Time to complete task
   • Percentage of task completed per unit time
   • Ratio of successes to failures
   • Percentage of time spent dealing with errors
   • Percentage of competing products that have better speed measures than our product
   • Number of repetitions of failed commands
   • Percentage of available commands used
   • Number of times user had to undo an action
   • Number of unnoticed errors
   • Number of times user did not use the expected method to accomplish the task
   • Think time required for task
      • i.e. ignoring system response time
      • a good UI should lead user through system with minimal ‘cognitive load’
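Several of these metrics can be derived from a timestamped log of one benchmark session. The following is a rough sketch under an assumed log format: the `Event` record, its `kind` values, and the sample session are hypothetical and do not come from any real logging tool.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One logged user action (hypothetical log format)."""
    start: float            # seconds since the task began
    end: float
    kind: str               # "command", "undo", or "error_recovery"
    succeeded: bool = True


def usability_metrics(events):
    """Compute four of the listed metrics for one benchmark session.

    Assumes a non-empty session with a positive total duration.
    """
    task_time = max(e.end for e in events) - min(e.start for e in events)
    error_time = sum(e.end - e.start for e in events if e.kind == "error_recovery")
    commands = [e for e in events if e.kind == "command"]
    successes = sum(1 for e in commands if e.succeeded)
    failures = len(commands) - successes
    return {
        "time_to_complete_s": task_time,
        # With no failures the ratio is undefined; report infinity.
        "success_failure_ratio": successes / failures if failures else float("inf"),
        "pct_time_on_errors": 100.0 * error_time / task_time,
        "undo_count": sum(1 for e in events if e.kind == "undo"),
    }


# Made-up session for illustration.
session = [
    Event(0.0, 10.0, "command"),
    Event(10.0, 15.0, "command", succeeded=False),
    Event(15.0, 25.0, "error_recovery"),
    Event(25.0, 27.0, "undo"),
    Event(27.0, 40.0, "command"),
]
print(usability_metrics(session))
```

Metrics such as think time or unnoticed errors cannot be read from a log like this alone; they need observation or a record of system response times.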

10. Details: Cognitive Walkthroughs
A form of predictive evaluation
Detailed reviews based on psychological theory, focusing on:
   • Goals a new user must form to execute a task
   • How well the system leads the user to form those goals
      • i.e. how well the system supports the user
   • The method is highly structured
      • Forms are provided to guide the evaluator
   • More time-consuming than ordinary heuristic evaluation
   • Less time-consuming than experiments

Cognitive Walkthrough Steps
1. Choose a task to evaluate
2. Describe the task exactly
   a) First describe the task in one sentence
      • Use simple language
      • The wording should be from a first-time user’s point of view
      e.g. Record a newly-received item in inventory.
   b) Describe the initial state of the system
      e.g. Main menu is displayed
   c) List the atomic actions needed to correctly perform the task, e.g.
      1. Click on ‘add to inventory’ in the menu.
      2. If you don’t know the part number, hit ‘return’ to look up the part number, then go to action 4.
      3. Type the part number into the ‘part number’ field
      4. Press tab
      5. Type the number of items in the ‘Number’ field
      6. Hit <return> or click on ‘Add’.
      7. If the system prints out a bar-code sticker, affix it to the new item.
   d) Describe classes of users who may perform the task
      e.g. Receiver: knows about inventory, but not yet about the system

Cognitive Walkthrough Steps
   e) Describe the ‘Goal Structure’ (or task structure) users would likely have in their minds before starting the task
      • High-level and system independent
      • Indent subgoals/subtasks
      • Note: if there are actions for which the user has no goals, the system must stimulate the user to think of these goals by the time they must perform the task
      • If different classes of user may have different goal structures, list these too.
      e.g.
      Record a received item in inventory
         Start the inventory program
         Enter the item

Cognitive Walkthrough Steps
3. For each action specified in step 2c, do the following (I to IV):
I. Write down the goal structure that the user would need to have in order to perform the action correctly
   e.g. For action 4
   Record a received item in inventory
      • Record the number of items
         • Press tab
         • Enter the number
      • Cause the system to process the transaction

Cognitive Walkthrough Steps
II. Verify that the user will have the correct goal structure
   • given their initial goals
   • given the system’s response to the previous action
   Estimate the percentage of users who might have each of the following possible problems:
   • Failure to add goals
      • e.g. For action 2
      • The system must make it clearly visible that pressing return with nothing entered will invoke a lookup mechanism
   • Failure to drop goals
      • e.g. The user may have a goal to notify the person who ordered the parts
      • This would not be needed if the system performs this automatically
   • Addition of spurious goals
      • e.g. There may be a field marked ‘Description’
      • However, this only needs to be filled in if the type of item is not in the database

Cognitive Walkthrough Steps
   • No-progress impasse
      • e.g. After adding an item, the system might just clear the screen ready for another entry.
      • The user may think the transaction failed (i.e. goal not achieved)
   • Premature loss of goals
      • e.g. The user enters an item and hits “return”
      • A message ‘transaction accepted’ is printed (meaning the transaction has been started)
      • The user powers off the computer thinking the goal is reached
      • The system never got around to printing the label

Cognitive Walkthrough Steps
III. Verify that the actions match the goals
   Possible problems:
   • Correct action doesn’t match goal
      • e.g. User wants to delete an item that was stolen. Correct action is to select ‘add to inventory’ and specify a negative number
      • System does not help user match the goal to the action
   • Incorrect actions match goals
      • e.g. User wants to add a new type of item to inventory (for which no items have yet been received)
      • Upon seeing ‘add to inventory’, user selects this incorrect menu item
IV. Verify that the user can physically perform the action
   Possible problems:
   • Physical difficulties
      • e.g. recognizing an icon, holding down shift-ctrl-alt-a to perform a command
   • Time-outs
      • i.e. running out of time: the system gives up
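Steps I to IV amount to filling in one form per atomic action. The sketch below shows one possible shape for such a form as a data record. It is an illustration only: the class `ActionReview`, the problem-type identifiers, the percentages, and the flagging threshold are all assumptions, not the forms used by the method.

```python
from dataclasses import dataclass, field

# Problem types taken from steps II to IV; the identifiers are made up.
PROBLEM_TYPES = (
    "failure_to_add_goals",
    "failure_to_drop_goals",
    "spurious_goals",
    "no_progress_impasse",
    "premature_loss_of_goals",
    "correct_action_does_not_match_goal",
    "incorrect_action_matches_goal",
    "physical_difficulty",
    "time_out",
)


@dataclass
class ActionReview:
    """Evaluator's notes for one atomic action (steps I to IV)."""
    action: str
    goal_structure: list                      # step I, outermost goal first
    # Steps II to IV: estimated percentage of users hitting each problem.
    problem_pct: dict = field(default_factory=dict)

    def __post_init__(self):
        unknown = set(self.problem_pct) - set(PROBLEM_TYPES)
        if unknown:
            raise ValueError(f"unknown problem types: {sorted(unknown)}")

    def flagged(self, threshold_pct):
        """Return the problem types whose estimate reaches the threshold."""
        return sorted(p for p, pct in self.problem_pct.items() if pct >= threshold_pct)


# Example for action 2 of the inventory task; the percentages are invented.
review = ActionReview(
    action="Hit 'return' with nothing entered to look up the part number",
    goal_structure=["Record a received item in inventory", "Find the part number"],
    problem_pct={"failure_to_add_goals": 40, "physical_difficulty": 5},
)
print(review.flagged(threshold_pct=25))
```

Keeping the estimates per action makes it easy to sort the walkthrough results and decide which actions to redesign first.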

11. Key Points to Review
   • Objective of evaluation: Minimize malfunctions
   • Key questions:
      • What is real task? Problems? Which is better?
      • Met targets? Is it standard?
   • Evaluate throughout lifecycle!
   • Formative vs. summative evaluation
   • Pilot studies important
   • Use all techniques in a balanced approach
   • Use cost-benefit analysis to see if an expensive technique will pay off
   • Passive methods
      • Problem reporting
      • Software logging
      • Questionnaires/surveys

11. Key Points to Review
   • Active methods
      • Traditional experiments
         • Investigate a single UI element
         • Pick subjects
         • Independent and dependent variables
         • Hypotheses
         • Experimental designs:
            • independent subject
            • matched subject (control for differences among subjects)
            • repeated measures (reuse subjects)
      • Usability Engineering
         • Test realistic ‘benchmark’ task
         • Set targets for usability metrics
         • Evaluate-redesign-evaluate until targets met
         • Partly engineering, partly science
      • Observation sessions (Videotaped Evaluation)
         • Study active use on realistic tasks
         • Think-aloud protocol on video
         • Co-Operative Evaluation involves dialogue

11. Key Points to Review
   • Predictive evaluation: involve experts
      • Heuristic Evaluation: based on guidelines
      • Cognitive Walkthroughs: goals and actions
         • Describe task, actions, users, goal structure
         • For each action, verify that users:
            • ... add and drop goals as needed
            • ... don’t add unneeded goals
            • ... can tell when a goal is reached
            • ... don’t drop needed goals
            • ... can see what action to take
            • ... are not misled into taking wrong action
            • ... have no physical difficulties with action
   • Malfunction analysis
      • How manifested?
         • Detected by: system, user
         • Undetected, inefficiencies
      • What stage in interaction? When user...
         • Decides on goal?
         • Executes action? Specifies action? Interprets result?
      • Level? (physical to task)
      • Why occurring?
         • User lacks: e.g. motivation, input, recall