LING 138/238 SYMBSYS 138 Intro to Computer Speech and Language Processing Lecture 4: October 7, 2004 Dan Jurafsky 1/10/2022 LING 138/238 Autumn 2004 1
Week 2: Dialogue and Conversational Agents
• Speech Acts and Dialogue Acts
• VoiceXML, continued
• More on design of dialogue agents
• Evaluation of dialogue agents

Review
• Finite-state dialogue management
• Frame-based dialogue management
• Semantic grammars
• ASR
• System, User, and Mixed-initiative
• VoiceXML
• Explicit and implicit confirmation
• Grounding

We want more complex dialogue
• We saw finite-state and frame-based dialogues
• They could only handle simple dialogues
• In particular, neither could handle unexpected questions from the user
• In fact, from what we’ve seen so far, it is not even clear how to tell that the user has just asked us a question!

Speech Acts
• Austin (1962): An utterance is a kind of action
• Clear case: performatives
– I name this ship the Titanic
– I second that motion
– I bet you five dollars it will snow tomorrow
• Performative verbs (name, second)
• Austin’s idea: not just these verbs

Each utterance is 3 acts
• Locutionary act: the utterance of a sentence with a particular meaning
• Illocutionary act: the act of asking, answering, promising, etc., in uttering a sentence
• Perlocutionary act: the (often intentional) production of certain effects upon the thoughts, feelings, or actions of the addressee in uttering a sentence

Illocutionary and perlocutionary
• “You can’t do that!”
• Illocutionary force:
– Protesting
• Perlocutionary force:
– Intent to annoy addressee
– Intent to stop addressee from doing something

The 3 levels of act revisited
Utterance | Locutionary Force | Illocutionary Force | Perlocutionary Force
Can I have the rest of your sandwich? | Question | Request | You give me sandwich
I want the rest of your sandwich | Declarative | Request | You give me sandwich
Give me your sandwich! | Imperative | Request | You give me sandwich

Illocutionary Acts
• What are they?

5 classes of speech acts: Searle (1975)
• Assertives: committing the speaker to something’s being the case (suggesting, putting forward, swearing, boasting, concluding)
• Directives: attempts by the speaker to get the addressee to do something (asking, ordering, requesting, inviting, advising, begging)
• Commissives: committing the speaker to some future course of action (promising, planning, vowing, betting, opposing)
• Expressives: expressing the psychological state of the speaker about a state of affairs (thanking, apologizing, welcoming, deploring)
• Declarations: bringing about a different state of the world via the utterance (I resign; You’re fired)

Dialogue acts
• An act with (internal) structure related specifically to its dialogue function
• Incorporates ideas of grounding
• Incorporates other dialogue and conversational functions that Austin and Searle didn’t seem interested in

Verbmobil Dialogue Acts
THANK | thanks
GREET | Hello Dan
INTRODUCE | It’s me again
BYE | All right, bye
REQUEST-COMMENT | How does that look?
SUGGEST | June 13th through 17th
REJECT | No, Friday I’m booked all day
ACCEPT | Saturday sounds fine
REQUEST-SUGGEST | What is a good day of the week for you?
INIT | I wanted to make an appointment with you
GIVE_REASON | Because I have meetings all afternoon
FEEDBACK | Okay
DELIBERATE | Let me check my calendar here
CONFIRM | Okay, that would be wonderful
CLARIFY | Okay, do you mean Tuesday the 23rd?

Verbmobil Dialogue (figure omitted)

DAMSL: forward looking func.
STATEMENT | a claim made by the speaker
INFO-REQUEST | a question by the speaker
CHECK | a question for confirming information
INFLUENCE-ON-ADDRESSEE | (=Searle's directives)
  OPEN-OPTION | a weak suggestion or listing of options
  ACTION-DIRECTIVE | an actual command
INFLUENCE-ON-SPEAKER | (=Austin's commissives)
  OFFER | speaker offers to do something
  COMMIT | speaker is committed to doing something
CONVENTIONAL | other
  OPENING | greetings
  CLOSING | farewells
  THANKING | thanking and responding to thanks

DAMSL: backward looking func.
AGREEMENT | speaker's response to previous proposal
  ACCEPT | accepting the proposal
  ACCEPT-PART | accepting some part of the proposal
  MAYBE | neither accepting nor rejecting the proposal
  REJECT-PART | rejecting some part of the proposal
  REJECT | rejecting the proposal
  HOLD | putting off response, usually via subdialogue
ANSWER | answering a question
UNDERSTANDING | whether speaker understood previous
  SIGNAL-NON-UNDER. | speaker didn't understand
  SIGNAL-UNDER. | speaker did understand
    ACK | demonstrated via continuer or assessment
    REPEAT-REPHRASE | demonstrated via repetition or reformulation
    COMPLETION | demonstrated via collaborative completion

Automatic Interpretation of Dialogue Acts
• How do we automatically identify dialogue acts?
• Given an utterance:
– Decide whether it is a QUESTION, STATEMENT, SUGGEST, or ACK
• Perhaps we can just look at the form of the utterance to decide?

Can we just use the surface syntactic form?
• YES-NO-Qs have auxiliary-before-subject syntax:
– Will breakfast be served on USAir 1557?
• STATEMENTs have declarative syntax:
– I don’t care about lunch
• COMMANDs have imperative syntax:
– Show me flights from Milwaukee to Orlando on Thursday night

Surface form != speech act type
Utterance | Locutionary Force | Illocutionary Force
Can I have the rest of your sandwich? | Question | Request
I want the rest of your sandwich | Declarative | Request
Give me your sandwich! | Imperative | Request

Dialogue act disambiguation is hard!
• Who’s on First: Abbott and Costello routine

Dialogue act ambiguity
• Who’s on first?
– INFO-REQUEST
– or
– STATEMENT

Dialogue Act ambiguity
• Can you give me a list of the flights from Atlanta to Boston?
– This looks like an INFO-REQUEST.
– If so, the answer is: YES.
– But really it’s a DIRECTIVE or REQUEST, a polite form of:
– Please give me a list of the flights…
• What looks like a QUESTION can be a REQUEST

Dialogue Act ambiguity
• Similarly, what looks like a STATEMENT can be a QUESTION:
Us | OPEN-OPTION | I was wanting to make some arrangements for a trip that I’m going to be taking uh to LA uh beginning of the week after next
Ag | HOLD | OK uh let me pull up your profile and I’ll be right with you here. [pause]
Ag | CHECK | And you said you wanted to travel next week?
Us | ACCEPT | Uh yes.

Indirect speech acts
• Utterances which use a surface statement to ask a question
• Utterances which use a surface question to issue a request

DA interpretation as statistical classification
• Lots of clues in each sentence that can tell us which DA it is:
• Words and Collocations:
– Please or would you: good cue for REQUEST
– Are you: good cue for INFO-REQUEST
• Prosody:
– Rising pitch is a good cue for INFO-REQUEST
– Loudness/stress can help distinguish yeah/AGREEMENT from yeah/BACKCHANNEL
• Conversational Structure
– Yeah following a proposal is probably AGREEMENT; yeah following an INFORM is probably a BACKCHANNEL

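The lexical and structural cues above can be sketched as a tiny rule-based guesser. This is only an illustration under stated assumptions: the cue table `CUES`, the function name `guess_dialogue_act`, and the `PROPOSAL` label for the previous act are invented here, and a real system would learn weighted cues from labeled data and add prosodic features.

```python
# Hypothetical cue table: each dialogue act maps to lowercase phrases that hint at it.
CUES = {
    "REQUEST": ["please", "would you"],
    "INFO-REQUEST": ["are you"],
}

def guess_dialogue_act(utterance, previous_act=None):
    """Guess a dialogue act from word cues and the previous speaker's act."""
    text = utterance.lower().strip()
    # Conversational structure: "yeah" after a proposal is probably agreement,
    # while "yeah" after an inform is probably a backchannel.
    if text.rstrip(".!") == "yeah":
        if previous_act == "PROPOSAL":
            return "AGREEMENT"
        if previous_act == "INFORM":
            return "BACKCHANNEL"
    # Words and collocations: first cue phrase found wins.
    for act, phrases in CUES.items():
        if any(phrase in text for phrase in phrases):
            return act
    return "STATEMENT"  # fallback when no cue fires
```

Substring matching is deliberately crude (it would also fire on "pleased"); it stands in for the word and N-gram detectors a trained classifier would use.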
Example: CHECKs
• Tag questions:
– And it’s gonna take us also an hour to load boxcars, right?
– Right
• Declarative questions with rising intonation
– And you said you want to travel next week?
• Fragment questions
– Um, curve round slightly to your right
– To my right?
– yes

Building a “CHECK”-detector
• Checks:
– Most often have declarative sentence structure
– Most likely to have rising intonation
– Often have a following question tag (“right?”)
– Often are realized as fragments
– Often have the word “you”, often begin with “so” or “oh”

How to build a CHECK detector
• First build detectors for various features
– Parsers can tell you if it has declarative structure or not.
– Word or N-gram detectors for specific words/phrases.
– Speech software for extracting frequency (pitch) and energy (loudness) for the utterance.
• Then either: Hand-written rules
– “If it has three of the above 5 features, it’s a CHECK”
• or Supervised machine learning
– Create a training set, label each sentence CHECK or NOT
– Run “feature extraction” software as above
– Train a classifier (regression, decision tree, Naïve Bayes, maximum entropy, SVM, etc.) to predict class

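A minimal sketch of the hand-written-rule option, assuming the parser and pitch tracker have already run and their outputs arrive as booleans. The class `Utterance`, the tag-word list, and the "three words or fewer" fragment test are assumptions made for illustration, not part of the lecture.

```python
from dataclasses import dataclass

TAG_WORDS = {"right", "okay", "huh"}  # assumed list of question tags

@dataclass
class Utterance:
    words: list          # lowercase tokens
    declarative: bool    # would come from a parser
    rising_pitch: bool   # would come from pitch-extraction software

def check_features(u):
    """Return the five CHECK cues as a list of booleans."""
    has_tag = len(u.words) > 1 and u.words[-1] in TAG_WORDS
    is_fragment = len(u.words) <= 3  # crude stand-in for a fragment detector
    has_cue_word = "you" in u.words or (bool(u.words) and u.words[0] in {"so", "oh"})
    return [u.declarative, u.rising_pitch, has_tag, is_fragment, has_cue_word]

def is_check(u, threshold=3):
    """Hand-written rule: CHECK if at least `threshold` of the 5 cues fire."""
    return sum(check_features(u)) >= threshold
```

For the supervised alternative, the same feature list would become the input vector to a classifier trained on utterances labeled CHECK or NOT.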
Prosodic Decision Tree for making S/QY/QW/QD decision (figure omitted)

Review: VoiceXML
• Voice eXtensible Markup Language
• An XML-based dialogue design language
• Makes use of ASR and TTS
• Deals well with simple, frame-based mixed initiative dialogue.
• Most common in commercial world (too limited for research systems)
• But useful to get a handle on the concepts.

Review: sample vxml doc
<form>
  <field name="transporttype">
    <prompt> Please choose airline, hotel, or rental car. </prompt>
    <grammar type="application/x-nuance-gsl">
      [airline hotel (rental car)]
    </grammar>
  </field>
  <block>
    <prompt> You have chosen <value expr="transporttype"/>. </prompt>
  </block>
</form>

Review: a mixed initiative VXML doc
• Mixed initiative: user might answer a different question
• So VoiceXML interpreter can’t just evaluate each field of form in order
• User might answer field 2 when system asked field 1
• So need grammar which can handle all sorts of input:
– Field 1
– Field 2
– Field 1 and field 2
– etc.

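The control idea can be sketched outside VoiceXML: the system keeps a frame of fields, accepts whatever fields the user's answer happens to fill, and prompts only for what is still missing. The names `PROMPTS`, `update`, and `next_prompt` are hypothetical, and the parsed user input is assumed to arrive as a dictionary from some grammar.

```python
# Fields in the order the system would ask about them, with their prompts.
PROMPTS = {
    "origin": "Which city do you want to leave from?",
    "dest": "And which city do you want to go to?",
}

def update(frame, parsed):
    """Copy every recognized field from the user's parsed answer into the frame."""
    new_frame = dict(frame)
    for field_name, value in parsed.items():
        if field_name in PROMPTS:
            new_frame[field_name] = value
    return new_frame

def next_prompt(frame):
    """Return the prompt for the first unfilled field, or None when the form is complete."""
    for field_name, prompt in PROMPTS.items():
        if frame.get(field_name) is None:
            return prompt
    return None
```

If the user answers the destination when asked for the origin, `update` still records it, and `next_prompt` keeps asking only for the origin.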
VXML Nuance-style grammars
• Rewrite rules
– Wantsentence -> I want to (fly|go)
• Nuance VXML format is:
– () for concatenation, [] for disjunction
– Each rule has a name:
  Wantsentence (I want to [fly go])
  Airports [(san francisco) denver]

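To make the two operators concrete, here is a sketch that translates a rule body using only () and [] into a Python regular expression. It is not the Nuance implementation; `tokenize`, `parse`, and `compile_rule` are names invented for this example, and optional items (`?`), rule references, and return values are not handled.

```python
import re

def tokenize(expr):
    # Brackets become their own tokens; anything else is a whitespace-delimited word.
    return re.findall(r"[()\[\]]|[^\s()\[\]]+", expr)

def parse(tokens, pos=0):
    """Parse one expression starting at tokens[pos]; return (regex_source, next_pos)."""
    tok = tokens[pos]
    if tok in ("(", "["):
        closer = ")" if tok == "(" else "]"
        parts = []
        pos += 1
        while tokens[pos] != closer:
            part, pos = parse(tokens, pos)
            parts.append(part)
        # () concatenates its parts in order; [] offers them as alternatives.
        joiner = r"\s+" if tok == "(" else "|"
        return "(?:" + joiner.join(parts) + ")", pos + 1
    return re.escape(tok), pos + 1

def compile_rule(expr):
    """Compile a rule body such as "(i want to [fly go])" into a regex object."""
    source, _ = parse(tokenize(expr.lower()))
    return re.compile(source)
```

Usage: `compile_rule("(I want to [fly go])").fullmatch("i want to fly")` returns a match object, while the same call on "i want to swim" returns `None`.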
Mixed-init VXML example (3)
<noinput> I'm sorry, I didn't hear you. <reprompt/> </noinput>
<nomatch> I'm sorry, I didn't understand that. <reprompt/> </nomatch>
<form>
  <grammar type="application/x-nuance-gsl">
    <![CDATA[

Grammar
Flight ( ?[
          (i [wanna (want to)] [fly go])
          (i'd like to [fly go])
          ([(i wanna)(i'd like a)] flight)
         ]
         [
          ( [from leaving departing] City:x) {<origin $x>}
          ( [(?going to)(arriving in)] City:x) {<dest $x>}
          ( [from leaving departing] City:x
            [(?going to)(arriving in)] City:y) {<origin $x> <dest $y>}
         ]
         ?please
       )

Grammar
City [
  [(san francisco) (s f o)] {return( "san francisco, california")}
  [(denver) (d e n)] {return( "denver, colorado")}
  [(seattle) (s t x)] {return( "seattle, washington")}
]
    ]]>
  </grammar>

An example of a frame
• Show me morning flights from Boston to SF on Tuesday.
SHOW:
  FLIGHTS:
    ORIGIN:
      CITY: Boston
      DATE: Tuesday
      TIME: morning
    DEST:
      CITY: San Francisco

How to generate this semantics?
• Many methods, as we will see in week 9
• Simplest: semantic grammars
– LIST -> show me | I want | can I see | …
– DEPARTTIME -> (after|around|before) HOUR | morning | afternoon | evening
– HOUR -> one|two|three…|twelve (am|pm)
– FLIGHTS -> (a) flight|flights
– ORIGIN -> from CITY
– DESTINATION -> to CITY
– CITY -> Boston | San Francisco | Denver | Washington

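As a rough sketch of how such rules yield a frame, the ORIGIN, DESTINATION, and part of the DEPARTTIME rules above can be approximated with regular expressions that fill slots directly. The dictionary `PATTERNS` and the function `fill_frame` are illustrative names, and a real semantic grammar would be run by a parser rather than by independent pattern searches.

```python
import re

# CITY -> Boston | San Francisco | Denver | Washington
CITY = r"(boston|san francisco|denver|washington)"

# One pattern per slot; group 1 captures the slot value.
PATTERNS = {
    "ORIGIN": re.compile(r"\bfrom " + CITY),        # ORIGIN -> from CITY
    "DESTINATION": re.compile(r"\bto " + CITY),     # DESTINATION -> to CITY
    "DEPARTTIME": re.compile(r"\b(morning|afternoon|evening)\b"),
}

def fill_frame(sentence):
    """Return a dict of the slots whose pattern is found in the sentence."""
    text = sentence.lower()
    frame = {}
    for slot, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            frame[slot] = match.group(1)
    return frame
```

The date ("on Tuesday") is left unfilled here because the rule list above gives no rule for it.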
Semantics for a sentence
LIST | Show me
FLIGHTS | flights
ORIGIN | from Boston
DESTINATION | to San Francisco
DEPARTDATE | on Tuesday
DEPARTTIME | morning

Mixed Init dialogue (cont)
  <initial name="init">
    <prompt> Welcome to the air travel consultant. What are your travel plans? </prompt>
  </initial>
  <field name="origin">
    <prompt> Which city do you want to leave from? </prompt>
    <filled>
      <prompt> OK, from <value expr="origin"/> </prompt>
    </filled>
  </field>

Mixed init dialogue continued
  <field name="dest">
    <prompt> And which city do you want to go to? </prompt>
    <filled>
      <prompt> OK, to <value expr="dest"/> </prompt>
    </filled>
  </field>
  <block>
    <prompt> OK, I have you are departing from <value expr="origin"/> to <value expr="dest"/>. </prompt>
    send the info to book a flight...
  </block>
</form>

Dialogue system Evaluation
• Whenever we design a new algorithm or build a new application, we need to evaluate it
• How to evaluate a dialogue system?
• What constitutes success or failure for a dialogue system?

Task Completion Success
• % of subtasks completed
• Correctness of each question/answer/error msg
• Correctness of total solution

Task Completion Cost
• Completion time in turns/seconds
• Number of queries
• Turn correction ratio: number of system or user turns used solely to correct errors, divided by total number of turns
• Inappropriateness (verbose, ambiguous) of system’s questions, answers, error messages

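The turn correction ratio defined above is a simple fraction. A minimal sketch, assuming each turn in a logged dialogue has already been hand-labeled as correction-only or not (the function name and input format are illustrative):

```python
def turn_correction_ratio(correction_flags):
    """correction_flags: one boolean per turn, True if the turn was used
    solely to correct an error. Returns corrections / total turns."""
    if not correction_flags:
        raise ValueError("dialogue has no turns")
    return sum(correction_flags) / len(correction_flags)
```

For a five-turn dialogue in which two turns only repaired errors, the ratio is 2/5.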
User Satisfaction
• Were answers provided quickly enough?
• Did the system understand your requests the first time?
• Do you think a person unfamiliar with computers could use the system easily?

User-centered dialogue system design
1. Early focus on users and task:
– interviews, study of human-human task, etc.
2. Build prototypes:
– Wizard of Oz systems
3. Iterative Design:
– iterative design cycle with embedded user testing

On the way to more powerful dialogue systems
• Grounding
– Performing grounding
– Recognizing user’s grounding
• Dialogue Acts
– Using correct dialogue acts
– Recognizing user’s dialogue acts
• Intention
– Recognizing user’s intentions

Conversational Implicature
• A: And, what day in May did you want to travel?
• C: OK, uh, I need to be there for a meeting that’s from the 12th to the 15th.
• Note that the client did not answer the question.
• Meaning of client’s sentence:
– Meeting
  Start-of-meeting: 12th
  End-of-meeting: 15th
– Doesn’t say anything about flying!
• What is it that licenses the agent to infer that the client is mentioning this meeting so as to inform the agent of the travel dates?

Conversational Implicature (2)
• A: … there’s 3 non-stops today.
• This would still be true if there were 7 non-stops today.
• But no, the agent means: 3 and only 3.
• How can the client infer that the agent means:
– only 3

Grice: conversational implicature
• Implicature means a particular class of licensed inferences.
• Grice (1975) proposed that what enables hearers to draw correct inferences is:
• Cooperative Principle
– This is a tacit agreement by speakers and listeners to cooperate in communication

4 Gricean Maxims
• Relevance: Be relevant
• Quantity: Do not make your contribution more or less informative than required
• Quality: Try to make your contribution one that is true (don’t say things that are false or for which you lack adequate evidence)
• Manner: Avoid ambiguity and obscurity; be brief and orderly

Relevance
• A: Is Regina here?
• B: Her car is outside.
• Implication: yes
– Hearer thinks: why would he mention the car? It must be relevant. How could it be relevant? It could be, since if her car is here she is probably here.
• Client: I need to be there for a meeting that’s from the 12th to the 15th
– Hearer thinks: Speaker is following maxims, would only have mentioned the meeting if it was relevant. How could the meeting be relevant? If the client meant me to understand that he had to depart in time for the meeting.

Quantity
• A: How much money do you have on you?
• B: I have 5 dollars
– Implication: not 6 dollars
• Similarly, 3 non-stops can’t mean 7 non-stops
– Hearer thinks: if speaker meant 7 non-stops she would have said 7 non-stops
• A: Did you do the reading for today’s class?
• B: I intended to
– Implication: No
– B’s answer would be true if B intended to do the reading AND did the reading, but would then violate the maxim of Quantity

Planning-based Conversational Agents
• How to do the kind of Gricean inference that could solve the problems we’ve discussed?
• Researchers who work on this use sophisticated AI models of planning and reasoning.
• Involves planning, plus various extensions to logic to create logic for Belief, Desire, Intention.
• These are called BDI models (belief, desire, intention)

BDI Logic
• B(S, P) = “speaker S believes proposition P”
• KNOW(S, P) = P and B(S, P)
• KNOWIF(S, P) = “S knows whether P” = KNOW(S, P) or KNOW(S, not P)
• W(S, P) = “S wants P to be true”, where P is a state or the execution of some action
• W(S, ACT(H)) = S wants H to do ACT

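The definitions of KNOW and KNOWIF can be checked against a toy model. This sketch assumes propositions are plain strings, negation is the tuple `("not", P)`, and any proposition missing from the fact set is false (a closed-world assumption that the lecture does not make); the class `Model` and helper `neg` are invented for illustration.

```python
def neg(p):
    """Represent 'not P' as a tagged tuple."""
    return ("not", p)

class Model:
    def __init__(self, facts, beliefs):
        self.facts = set(facts)  # propositions that are actually true
        self.beliefs = {agent: set(props) for agent, props in beliefs.items()}

    def holds(self, p):
        # Closed world (assumption): anything not listed as a fact is false.
        if isinstance(p, tuple) and p[0] == "not":
            return not self.holds(p[1])
        return p in self.facts

    def B(self, s, p):
        """S believes P."""
        return p in self.beliefs.get(s, set())

    def KNOW(self, s, p):
        """KNOW(S, P) = P and B(S, P)."""
        return self.holds(p) and self.B(s, p)

    def KNOWIF(self, s, p):
        """KNOWIF(S, P) = KNOW(S, P) or KNOW(S, not P)."""
        return self.KNOW(s, p) or self.KNOW(s, neg(p))
```

A belief that happens to be false never counts as knowledge here, which is exactly what the "P and B(S, P)" definition requires.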
How to represent actions
• Preconditions:
– Conditions that must already be true in order to successfully perform the action
• Effects:
– Conditions that become true as a result of successfully performing the action
• Body:
– A set of partially ordered goal states that must be achieved in performing the action

How to represent the action of going to the beach
• GOTOBEACH(P, B)
• Constraints: Person(P) & Beach(B) & Car(C)
• Precondition: Know(P, location(B)) & Have(P, C) & working(C) & Want(P, AtBeach(P, B)) & …
• Effect: AtBeach(P, B)
• Body: Drive(P, C)

How to represent the action of booking a flight
• BOOK-FLIGHT(A, C, F)
• Constraints: Agent(A) & Flight(F) & Client(C)
• Precondition: Know(A, dep-date(F)) & Know(A, dep-time(F)) & Know(A, origin(F)) & Has-Seats(F) & W(C, BOOK(A, C, F)) & …
• Effect: Flight-Booked(A, C, F)
• Body: Make-Reservation(A, F, C)

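An action schema of this shape maps naturally onto a small data structure. The sketch below treats propositions as opaque strings and keeps only preconditions and effects (constraints and body are omitted); `Action`, `applicable`, and `apply_action` are hypothetical names, and effects only add facts, which is a simplification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # propositions that must already hold
    effects: frozenset        # propositions that become true afterwards

def applicable(action, state):
    """An action can be performed when all its preconditions are in the state."""
    return action.preconditions <= state

def apply_action(action, state):
    """Return the new state after performing the action (effects are only added)."""
    if not applicable(action, state):
        raise ValueError("preconditions of " + action.name + " not satisfied")
    return set(state) | action.effects

# The listed preconditions of BOOK-FLIGHT; the elided "…" ones are left out.
BOOK_FLIGHT = Action(
    name="BOOK-FLIGHT(A, C, F)",
    preconditions=frozenset({
        "Know(A, dep-date(F))",
        "Know(A, dep-time(F))",
        "Know(A, origin(F))",
        "Has-Seats(F)",
        "W(C, BOOK(A, C, F))",
    }),
    effects=frozenset({"Flight-Booked(A, C, F)"}),
)
```

A planner can then work backwards from a goal such as Flight-Booked to the preconditions the agent still has to establish, for example by asking the client for the departure date.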
Speech acts
• INFORM(S, H, P)
• Constraints: Speaker(S) & Hearer(H) & Proposition(P)
• Precondition: Know(S, P) & W(S, INFORM(S, H, P))
• Effect: Know(H, P)
• Body: B(H, W(S, Know(H, P)))

Speech acts
• REQUEST-INFORM(A, C, I)
• Constraints: Agent(A) & Client(C)
• Precondition: Know(C, I)
• Effect: Know(A, I)
• Body: B(C, W(A, Know(A, I)))

How a plan-based conversational agent works
While conversation is not finished
  If user has completed a turn
    Then interpret user’s utterance
  If system has obligations
    Then address obligations
  Else if system has turn
    Then if system has intended conversation acts
      Then call generator to produce utterances
    Else if some material is ungrounded
      Then address grounding situation
    Else if high-level goals are unsatisfied
      Then address goals
    Else release turn or attempt to end conversation
  Else if no one has turn or long pause
    Then take turn

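One pass through the loop body above can be sketched as a function that inspects the agent's state and returns the steps to perform. The `DialogueState` fields and the step labels are assumptions chosen to mirror the slide; interpretation, generation, and planning themselves are not implemented.

```python
from dataclasses import dataclass, field

@dataclass
class DialogueState:
    obligations: list = field(default_factory=list)
    system_has_turn: bool = False
    intended_acts: list = field(default_factory=list)
    ungrounded: list = field(default_factory=list)
    unsatisfied_goals: list = field(default_factory=list)

def choose_steps(state, user_finished_turn=False, nobody_has_turn=False, long_pause=False):
    """Return the steps for one iteration of the while-loop, in order."""
    steps = []
    # First test is independent of the rest of the chain.
    if user_finished_turn:
        steps.append("interpret-user-utterance")
    # Priority order: obligations, then (if holding the turn) intended acts,
    # grounding, goals; otherwise consider taking the turn.
    if state.obligations:
        steps.append("address-obligations")
    elif state.system_has_turn:
        if state.intended_acts:
            steps.append("generate-utterances")
        elif state.ungrounded:
            steps.append("address-grounding")
        elif state.unsatisfied_goals:
            steps.append("address-goals")
        else:
            steps.append("release-turn-or-end-conversation")
    elif nobody_has_turn or long_pause:
        steps.append("take-turn")
    return steps
```

The ordering encodes the agent's priorities: answering what it owes the user comes before pursuing its own goals.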
Plan-based agent data
• Queue of conversation acts it needs to generate, based on:
– Grounding: need to ground previous utterance
– Dialogue obligations: answer questions, perform commands
– Goals: agent must reason about its own goals

A made-up example
• C: I want to go to Pittsburgh in May
• System current state:
– Discourse obligations: NONE
– Turn holder: system
– Intended speech acts: NONE
– Unacknowledged speech acts: INFORM-1
– Discourse goals: get-travel-goal, create-travel-plan

A made-up example
• System decides to add 2 conversation acts to queue:
– Acknowledge user’s inform act
– Ask next travel-goal question of user
• How?
– Given goal “get-travel-goal”
– Request-info action scheme tells system that asking the user something is one way of finding out.

A made-up example
• System current state:
– Discourse obligations: NONE
– Turn holder: system
– Intended speech acts: REQUEST-INFORM-1, ACKNOWLEDGE-1
– Unacknowledged speech acts: INFORM-1
– Discourse goals: get-travel-goal, create-travel-plan
• This would be combined by a clever generator:
– And, what day in May did you want to travel?

A made-up example
• C: … I don’t think there’s many options for non-stop.
• Assume DA interpreter correctly interprets this as REQUEST-INFORM-3
– Discourse obligations: address(REQUEST-INFORM-3)
– Turn holder: system
– Intended speech acts: NONE
– Unacknowledged speech acts: REQUEST-INFORM-3
– Discourse goals: get-travel-goal, create-travel-plan
• Manager would address discourse goal by calling planner to find out how many non-stop flights there are. Also needs to ground.

A made-up example
• C: … I don’t think there’s many options for non-stop.
• Since this was in the form of an indirect request, we can do an ACKNOWLEDGEMENT (if a direct request, we would do ANSWER-YES). Also need to answer the question:
• Right. There’s three non-stops today.

Summary
• 3 kinds of conversational agents
– Finite-state: VoiceXML
– Form-based: VoiceXML
– Planning: Only in the research lab
• Dialogue Phenomena
– Grounding
– Dialogue Acts
– Implicature
• Next Week: change in schedule: Part of Speech Tagging.
