CS 160: Lecture 23
Professor John Canny
Fall 2001
Nov 27, 2001

Review
• Types of help?
• Adaptive help - user modeling - what are some ways of representing knowledge about the user?
• What is a mixed-initiative interface?

Multimodal Interfaces
• Multimodal refers to interfaces that support non-GUI interaction.
• Speech and pen input are two common examples, and they are complementary.

Speech+pen Interfaces
• Speech is the preferred medium for subject-verb-object expression.
• Writing or gesture provides locative information (pointing etc.).

Speech+pen Interfaces
• Speech+pen for visual-spatial tasks (compared to speech only):
  * 10% faster.
  * 36% fewer task-critical errors.
  * Shorter and simpler linguistic constructions.
  * 90-100% user preference to interact this way.

Multimodal advantages
• Advantages for error recovery:
  * Users intuitively pick the mode that is less error-prone.
  * Language is often simplified.
  * Users intuitively switch modes after an error, so the same problem is not repeated.

Multimodal advantages
• Other situations where mode choice helps:
  * Users with disability.
  * People with a strong accent or a cold.
  * People with RSI.
  * Young children or non-literate users.

Multimodal advantages
• For collaborative work, multimodal interfaces can communicate a lot more than text:
  * Speech contains prosodic information.
  * Gesture communicates emotion.
  * Writing has several expressive dimensions.

Multimodal challenges
• Using multimodal input generally requires advanced recognition methods:
  * For each mode.
  * For combining redundant information.
  * For combining non-redundant information: “open this file (pointing)”
• Information is combined at two levels:
  * Feature level (early fusion).
  * Semantic level (late fusion).

Early fusion
• Early fusion applies to combinations like speech+lip movement. It is difficult because of:
  * The need for multimodal (MM) training data.
  * The need to closely synchronize the data streams.
  * Computational and training costs.
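
The synchronization requirement for early fusion can be illustrated with a minimal sketch. It assumes each stream is a time-sorted list of (timestamp, feature list) pairs and that pairing each audio frame with the most recent lip frame is acceptable; the function name and data layout are hypothetical and do not come from the lecture.

```python
from bisect import bisect_right


def align_and_concat(audio_frames, lip_frames):
    """Feature-level (early) fusion sketch.

    Both inputs are lists of (timestamp_seconds, feature_list), sorted by time.
    Each audio frame is paired with the latest lip frame at or before it, and
    the two feature lists are concatenated into one joint feature vector.
    """
    lip_times = [t for t, _ in lip_frames]
    fused = []
    for t, audio_feat in audio_frames:
        i = bisect_right(lip_times, t) - 1  # index of latest lip frame <= t
        if i < 0:
            continue  # no lip frame observed yet, so skip this audio frame
        fused.append((t, audio_feat + lip_frames[i][1]))
    return fused


# Hypothetical data: audio at roughly 100 Hz, video at roughly 30 Hz.
audio = [(0.00, [1.0]), (0.01, [2.0]), (0.04, [3.0])]
lips = [(0.0, [9.0]), (0.033, [8.0])]
fused = align_and_concat(audio, lips)
```

A single recognizer would then be trained on the joint vectors, which is why multimodal training data and tight time alignment are both needed.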

Late fusion
• Late fusion is appropriate for combinations of complementary information, like pen+speech.
  * Recognizers are trained and used separately.
  * Unimodal recognizers are available off-the-shelf.
  * It's still important to accurately time-stamp all inputs: typical delays are known between e.g. gesture and speech.
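
A minimal sketch of semantic-level (late) fusion for a command such as “open this file (pointing)” might look like the following. The event classes, the `MAX_LAG` value of 1.5 seconds, and the rule of choosing the gesture nearest the utterance midpoint are all assumptions made for illustration, not details given in the lecture.

```python
from dataclasses import dataclass


@dataclass
class SpeechEvent:
    start: float  # seconds
    end: float
    text: str     # output of a separately trained speech recognizer


@dataclass
class GestureEvent:
    time: float   # seconds
    target: str   # object the pen or finger pointed at

MAX_LAG = 1.5  # assumed tolerance between gesture and speech, in seconds


def resolve_deictic(speech, gestures, max_lag=MAX_LAG):
    """Return the target of the gesture closest in time to the utterance."""
    candidates = [
        g for g in gestures
        if speech.start - max_lag <= g.time <= speech.end + max_lag
    ]
    if not candidates:
        return None
    midpoint = (speech.start + speech.end) / 2
    return min(candidates, key=lambda g: abs(g.time - midpoint)).target


def fuse(speech, gestures):
    """Combine recognized speech with a pointing gesture into one command."""
    words = speech.text.lower().split()
    command = {"action": words[0], "object": None}
    if "this" in words:  # deictic term that needs a gesture to bind it
        command["object"] = resolve_deictic(speech, gestures)
    return command


speech = SpeechEvent(10.0, 11.0, "open this file")
gestures = [GestureEvent(3.0, "a.txt"), GestureEvent(10.4, "report.doc")]
command = fuse(speech, gestures)
```

Because the two recognizers run independently, the time stamps are the only link between them, which is why accurate time-stamping matters even for late fusion.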

Contrast between MM and GUIs
• GUI interfaces often restrict input to single non-overlapping events, while MM interfaces handle all inputs at once.
• GUI events are unambiguous; MM inputs are based on recognition and require a probabilistic approach.
• MM interfaces are often distributed on a network.
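
One way to picture the probabilistic approach is to combine ranked hypothesis lists from two recognizers and keep the most probable compatible pair. The sketch below assumes the two recognizers are statistically independent so that their probabilities can be multiplied; the hypothesis labels, the probabilities, and the compatibility table are invented for illustration.

```python
def best_joint(speech_nbest, gesture_nbest, compatible):
    """Pick the highest-scoring compatible (speech, gesture) pair.

    Each n-best list holds (hypothesis, probability) pairs. Scores are
    multiplied, which assumes the two recognizers err independently.
    """
    best = None
    for s, ps in speech_nbest:
        for g, pg in gesture_nbest:
            if not compatible(s, g):
                continue
            score = ps * pg
            if best is None or score > best[2]:
                best = (s, g, score)
    return best


# Hypothetical recognizer outputs.
speech_nbest = [("delete", 0.5), ("select", 0.4)]
gesture_nbest = [("point", 0.6), ("cross_out", 0.4)]

# Hypothetical semantic constraint: which gesture can accompany which verb.
ALLOWED = {("delete", "cross_out"), ("select", "point")}

best = best_joint(speech_nbest, gesture_nbest, lambda s, g: (s, g) in ALLOWED)
```

Here the top speech hypothesis ("delete") loses to "select" once the gesture evidence is taken into account, something a GUI event model never has to consider because its events are unambiguous.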

Agent architectures
• Allow parts of an MM system to be written separately, in the most appropriate language, and integrated easily.
• OAA: Open-Agent Architecture (Cohen et al.) supports MM interfaces.
• Blackboards and message queues are often used to simplify inter-agent communication.
  * Jini, Javaspaces, Tspaces, JXTA, JMS, MSMQ...
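
The blackboard idea can be sketched in a few lines. This is an in-process toy under stated assumptions, not the API of OAA or of any of the products listed above: the `Blackboard` and `FusionAgent` classes and the topic names are hypothetical, and a real system would post across a network.

```python
from collections import defaultdict


class Blackboard:
    """Shared space where agents post entries and subscribe to topics."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.entries = []  # full history of (topic, payload) posts

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def post(self, topic, payload):
        self.entries.append((topic, payload))
        for callback in list(self._subscribers[topic]):
            callback(payload)


class FusionAgent:
    """Waits for one speech and one gesture entry, then posts a command."""

    def __init__(self, board):
        self.board = board
        self.speech = None
        self.gesture = None
        board.subscribe("speech", self.on_speech)
        board.subscribe("gesture", self.on_gesture)

    def on_speech(self, text):
        self.speech = text
        self._try_fuse()

    def on_gesture(self, target):
        self.gesture = target
        self._try_fuse()

    def _try_fuse(self):
        if self.speech is None or self.gesture is None:
            return
        command = {"utterance": self.speech, "target": self.gesture}
        self.speech = self.gesture = None  # reset before notifying others
        self.board.post("command", command)


board = Blackboard()
agent = FusionAgent(board)
received = []
board.subscribe("command", received.append)

# The recognizer agents know only the blackboard, not each other.
board.post("gesture", "report.doc")
board.post("speech", "open this file")
```

The recognizers and the fusion agent never call each other directly, which is what lets each part be written separately and swapped out.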

Administrative
• Final project presentations are next week (Dec 4 and 6).
• Presentations go by group number. Groups 1-5 on Tuesday, groups 6-10 on Thursday.
• Final reports are due on Friday the 7th.

Symbolic/statistical approaches
• Allow symbolic operations like unification (binding of terms like “this”) together with probabilistic reasoning (possible interpretations of “this”).
• The MTC (Members-Teams-Committee) system is an example:
  * Members are recognizers.
  * Teams cluster data from recognizers.
  * The committee weights results from various teams.
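
The three-layer structure can be illustrated with a rough sketch. It assumes that each team forms a weighted average of member scores and that the committee takes a weighted sum over teams; these combination rules, the weights, and all names below are assumptions for illustration and are not claimed to match the published MTC algorithm.

```python
def team_estimate(member_scores, weights):
    """One team's view: weighted average of the members' scores."""
    total = sum(weights.values())
    return sum(weights[m] * member_scores[m] for m in weights) / total


def committee_rank(candidates, teams, team_weights):
    """Rank candidate interpretations by the committee's weighted score.

    candidates:   {interpretation: {member_name: score}}
    teams:        {team_name: {member_name: weight}}
    team_weights: {team_name: weight given by the committee}
    """
    ranked = []
    for interpretation, member_scores in candidates.items():
        score = sum(
            team_weights[name] * team_estimate(member_scores, weights)
            for name, weights in teams.items()
        )
        ranked.append((score, interpretation))
    ranked.sort(reverse=True)
    return [interpretation for _, interpretation in ranked]


# Hypothetical scores from two members (recognizers) for two readings of “this”.
candidates = {
    "open report.doc": {"speech": 0.8, "gesture": 0.6},
    "open notes.txt": {"speech": 0.7, "gesture": 0.2},
}
teams = {
    "speech_heavy": {"speech": 3, "gesture": 1},
    "balanced": {"speech": 1, "gesture": 1},
}
team_weights = {"speech_heavy": 0.5, "balanced": 0.5}

ranking = committee_rank(candidates, teams, team_weights)
```

The symbolic step (deciding which objects “this” could bind to) produces the candidate list; the statistical layers then choose among those candidates.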

MTC architecture

MM systems
• Designers' Outpost (Berkeley)

MM systems: Quickset (OGI)

Crossweaver (Berkeley)