FROM SENTENCE STRUCTURE TO IMMEDIATE DISCOURSE STRUCTURE ANNOTATION

FROM SENTENCE STRUCTURE TO “IMMEDIATE” DISCOURSE STRUCTURE: ANNOTATION OF DISCOURSE CONNECTIVES AND THEIR ARGUMENTS
Aravind K. Joshi, University of Pennsylvania, Philadelphia, PA, USA
IIT, Powai, Mumbai, December 30, 2005

Outline • Introduction • Transition from sentence to immediate discourse • Dependencies in discourse structure • Penn Discourse Treebank (PDTB) • Some properties of discourse connectives • Some examples from PDTB • Some aspects of annotation guidelines • Semantics of discourse connectives • Assigning roles to the arguments • Attributions of arguments and connectives • Summary

Transition from sentence to immediate discourse • How much information can be packaged in a sentence? • When does a transition from a sentence to discourse happen? • Are there any general principles? • Beyond some conventions of style are there any linguistic principles to this transition?

Transition from sentence to immediate discourse • Sentences are made up of clauses • Clause: Predicate (Verb) Arguments, Adjuncts • Dependency structure • Connectives • Composition operations • Extend dependency structures to discourse • Extend the same composition operations to discourse • Extend the sentence level parser to discourse

Transition from sentence to immediate discourse • At the sentence level • Predicates have as their arguments -- NPs and clauses -- Clauses • Discourse connectives can be treated as higher order predicates taking only clauses as their arguments

Sentence Structure and Discourse Structure • At the sentence level • Structural composition and associated semantic composition • Anaphoric links • Other inferences • At the discourse level • Structural composition and associated semantic composition • Anaphoric links • Other inferences • Conventionally, work on discourse structure does not consider, and therefore does not allow, such a decomposition

Dependencies in discourse structure • Discourse connectives as predicates taking clausal arguments • The dependencies between a predicate and its arguments can be stretched • Nested dependencies: On the one hand, Fred likes beans. Not only does he eat them for dinner. But he also eats them for breakfast and snacks. On the other hand, he’s allergic to them.

Dependencies in discourse structure • Dependencies can be stretched by nesting • Crossed dependencies do not seem to be possible • Is this cross-linguistically valid? • Apparent crossing dependencies are resolved by treating one argument of a discourse connective as anaphoric • Webber, Joshi, Stone, and Knott. 2003. Anaphora and discourse structure. Computational Linguistics, 29: 545-587.

Crossed dependencies • True crossed dependencies do not seem to be possible
Nested (acceptable): On the one hand, Fred likes beans. Not only does he eat them for dinner. But he also eats them for breakfast and snacks. On the other hand, he’s allergic to them.
Crossed (unacceptable): * On the one hand, Fred likes beans. Not only does he eat them for dinner. On the other hand, he’s allergic to them. But he also eats them for breakfast and snacks.
In this sense, discourse structure may be simpler than sentence structure, even cross-linguistically?
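The contrast between the nested and the starred ordering can be made concrete with a small check for interleaving arcs. This is only a sketch under assumptions that are not in the slides: sentences are numbered 1 to 4 in order of appearance, each paired connective ("on the one hand ... on the other hand", "not only ... but also") is encoded as an arc between the two sentences it links, and the function names are hypothetical.

```python
from itertools import combinations


def crosses(a, b):
    """Return True if arcs a and b interleave, i.e. i1 < i2 < j1 < j2."""
    (i1, j1), (i2, j2) = sorted([tuple(sorted(a)), tuple(sorted(b))])
    return i1 < i2 < j1 < j2


def has_crossing(arcs):
    """Return True if any pair of arcs in the list interleaves."""
    return any(crosses(a, b) for a, b in combinations(arcs, 2))


# Acceptable order: "on the one hand" (1) pairs with "on the other hand" (4);
# "not only" (2) pairs with "but also" (3), nested inside the first arc.
nested = [(1, 4), (2, 3)]

# Starred order: "on the other hand" is now sentence 3, "but also" sentence 4.
crossed = [(1, 3), (2, 4)]

print(has_crossing(nested))   # False
print(has_crossing(crossed))  # True
```

Arcs that merely share an endpoint are not counted as crossing under this definition; the "because"/"then" case discussed in the slides is a different configuration and is handled there by treating one argument as anaphoric.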

Dependencies in discourse structure
(a) John loves Barolo.
(b) So he ordered three cases of the ’97.
(c) But he had to cancel the order
(d) Because then he discovered he was broke.
• “because” gets its arguments from (c) and (d)
• “then” gets its arguments from (b) and (d), thus crossing the connection between (c) and (d) associated with “because”
• Apparent crossing dependency: treat the argument from (b) for “then” as anaphoric

Penn Discourse Treebank: PDTB • Annotate discourse connectives and their argument structure for the Penn Treebank corpus (PDTB) • Independent of the specifics of the discourse lexicalized TAG (DLTAG) • People: Aravind Joshi, Eleni Miltsakaki, Rashmi Prasad, annotators • Collaborator: Bonnie Webber (Edinburgh University)

PDTB • Discourse connectives such as -- and, or, but, because, since, while, when, however, instead, although, also, for example, then, so that, insofar as, nonetheless, …, empty connectives -- Subordinate conjunctions, coordinate conjunctions, adverbial connectives, implicit connectives -- Discourse connectives take clauses as their arguments and express relations between clauses, i.e., relations between propositions, events, situations, … associated with the clauses

• Towards computing a class of inferences associated with discourse connectives, hence relevant to complex NLP tasks – IE, MT, QA … • Towards discourse structure - discourse understanding Research Strategy • Not shallow vs deep syntactic processing • Not shallow vs deep semantic processing But • Deeper and deeper shallow processing

Some properties of discourse connectives • Discourse connectives have argument structure (analogous to verbs and their argument structure) as in the Propbank. However, there are crucial differences • arity of connectives is fixed, they are binary (some apparent exceptions) • One argument is in the same sentence in which the connective appears. The other argument may or may not be in the same sentence. It can be in the preceding or following discourse • Harder to annotate the extent of an argument • one of the arguments can be anaphoric • Very little is known about the semantics of discourse connectives

What is being annotated? • Relation: connective, explicit or implicit • Arguments: Arg 1, Arg 2 • Attributions of arguments • Attribution of relation • Sense of the connective • Supplementary material
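One way to picture a single annotation is as a record with one slot per item in the list above. The sketch below is illustrative only: the class and field names are hypothetical and are not the PDTB file format. The sample values come from the "Factory orders" example in the attribution slides, where the relation and Arg 1 carry writer attribution (WA) and Arg 2 carries speaker attribution (SA).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ConnectiveAnnotation:
    """Hypothetical record for one annotated connective (not the PDTB format)."""
    connective: str                      # explicit connective, or the one supplied for an empty slot
    explicit: bool                       # True for explicit, False for implicit/empty
    arg1: str
    arg2: str
    relation_attribution: str = "WA"     # WA = writer attribution, SA = speaker attribution
    arg1_attribution: str = "WA"
    arg2_attribution: str = "WA"
    sense: Optional[str] = None          # left empty when no sense has been assigned
    supplements: Dict[str, str] = field(default_factory=dict)

    def arguments(self) -> Tuple[str, str]:
        """Connectives are binary, so there are always exactly two arguments."""
        return (self.arg1, self.arg2)


record = ConnectiveAnnotation(
    connective="while",
    explicit=True,
    arg1="Factory orders and construction outlays were largely flat in September",
    arg2="manufacturing shrank further in October",
    arg2_attribution="SA",
)

print(len(record.arguments()))   # 2
print(record.arg2_attribution)   # SA
```

Keeping the two arguments as fixed fields rather than a list mirrors the fixed arity of connectives; supplementary material is kept apart so that it never counts as a third argument.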

Some Examples from PDTB • Subordinate: because [The federal government suspended sales of U.S. savings bonds] because [Congress hasn’t lifted the ceiling on government debt.] • Both arguments are in the same sentence • Subordinate: although Although [started in 1965], [Wedtech didn’t really get rolling until 1975] (when Mr. Neuberger discovered the Federal Government’s Section 8(A) minority business program). • Both arguments are in the same sentence; one argument has possible supplementary material in ( )

Adverbial: however [Both Newsweek and U.S. News have been gaining circulation in recent years without heavy use of electronic giveaways to subscribers, such as telephones or watches.] However, [none of the big three weeklies recorded circulation gains recently.] • The two arguments are in different sentences

Adverbial: for example [The computers were crude by today’s standards.] [Apple II owners, for example, had to use their television sets as screens and stored data on audiocassettes.] • An argument can be a discontiguous string • Problems with aligning arguments with Penn Treebank constituents

Discourse adverbials as anaphors: Instead John wanted to eat a pear. Instead he ate an apple. John will not eat fruit. Instead, he eats only candy bars and potato chips. John ate an apple. # Instead he wanted a pear. Antecedent of instead: salient but unchosen or unrealized alternative -- anaphoric argument of instead Licensing environment: modal context, negation, …

Adverbial: still [Some senior advisors argue that with further fights over a capital-gains tax cut and a budget-reduction bill Mr. Bush already has enough pending confrontations with Congress. They prefer to put off the line-item veto until at least next year.] Still, [Mr. Bush and some other aides are strongly drawn to the idea of trying out a line-item veto.] • ARG 1: Some senior … Congress. They prefer … next year • ARG 2: Mr. Bush … a line-item veto • ARG 1 has two sentences

Adverbial: also [On the Big Board, Crawford & Co., Atlanta, (CFD) begins trading today.] Crawford evaluates health care plans, manages medical and disability aspects of worker’s compensation injuries and is involved in claims adjustments for insurance companies. Also, [beginning trading today on the Big Board are El Paso Refinery Limited Partnership, El Paso, Texas, (ELP) and Franklin Multi-Income Trust, San Mateo, Calif., (FMI).] • The unbracketed sentence after the left argument of “also” (beginning “Crawford evaluates”, shown in green on the original slide) can be regarded as a kind of adjunct of the left argument • Discourse connectives have a fixed arity (2).

Empty connective: EMPTY [El Paso owns and operates a petroleum refinery.] EMPTY = whereas [Franklin is a closed-end management investment company.] • “whereas” is the connective that one annotator thought best described the relation expressed by the empty connective • Analogous to the empty relation in a noun-noun compound at the sentence level

Empty connective Individuals close to the situation believe Ford officials will seek a meeting this week with Sir John to outline their proposal for a full bid. <CONSEQUENTLY> Any discussion with Ford could postpone the Jaguar-GM deal, headed for completion within the next two weeks.

Empty connectives But now the companies are getting into trouble because they undertook a record expansion program while they were raising prices sharply. <CONSEQUENTLY/AS A RESULT> Third-quarter profits fell at several companies. Disagreement on selected connective but agreement over class

Empty connectives British government restrictions prevent any single shareholder from going beyond 15% before the end of 1990 without government permission. <BECAUSE/HOWEVER> The British government, which owned Jaguar until 1984, still holds a controlling “golden share” in the company. Disagreement over the connective and also over the classes they belong to

Attributions of arguments and relations Advocates said the 90-cent-an-hour rise, to $4.25 an hour by April 1991, is too small for the working poor, while opponents argued that the increase will still hurt small businesses and cost many thousands of jobs. • Relation: connective “while” • Arg 1: Advocates said … poor • Arg 2: opponents … jobs • Attributions: Relation: WA (writer attribution), Arg 1: WA, Arg 2: WA

Attributions of arguments and relations Factory orders and construction outlays were largely flat in September, while purchasing agents said manufacturing shrank further in October. • Relation: connective “while” • Arg 1: Factory orders … September • Arg 2: manufacturing shrank … in October • Attributions: Relation: WA, Arg 1: WA, Arg 2: SA (speaker attribution)

How many discourse connectives in PTB? • Types: about 253 (Subordinating: 32, Coordinating: 4, Adverbial/Anaphoric: 217) • Tokens: about 23,620 (Subordinating: 7,011, Coordinating: 6,169, Adverbial/Anaphoric: 10,440) • Empty connectives: Tokens: about 20,000; Types: ?? • Total: Tokens: 43,620
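The per-class figures on this slide add up to the stated totals. The short check below simply redoes that arithmetic; the variable names are illustrative, and all counts are the approximate figures quoted on the slide.

```python
# Approximate counts as quoted on the slide.
explicit_types = {"Subordinating": 32, "Coordinating": 4, "Adverbial/Anaphoric": 217}
explicit_tokens = {"Subordinating": 7011, "Coordinating": 6169, "Adverbial/Anaphoric": 10440}
empty_tokens = 20000  # empty (implicit) connectives; their type count is left open on the slide

total_types = sum(explicit_types.values())
total_explicit_tokens = sum(explicit_tokens.values())
total_tokens = total_explicit_tokens + empty_tokens

print(total_types)            # 253
print(total_explicit_tokens)  # 23620
print(total_tokens)           # 43620
```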

Annotation Guidelines: some comments • What counts as a discourse connective? -- in general, discourse connectives convey a relation between states, events, situations, etc. • “as a result” is a discourse connective • But in “Strangely, conventional wisdom inside the Beltway regards these transfer payments as …”, “strangely” requires only a single state/event, which it classifies in the set of “strange” events. Hence, it is not a discourse connective • What counts as an argument?

Annotation Guidelines: some comments • How far does an argument extend? Although [started in 1965], [Wedtech didn’t really get rolling until 1975] (when Mr. Neuberger discovered the Federal Government’s Section 8 minority business program). • “Proper partial overlap” • ARG 1: Wedtech didn’t really … 1975 • ARG 2: started in 1965 • SUP 2: when Mr. Neuberger … program

Multiple annotations • In the standard annotation paradigm only one annotation is selected • At the discourse level multiple annotations cannot be completely avoided • Annotation 1: [Big Bear doesn’t care for disposable diapers,] which aren’t biodegradable. Yet [parents demand them.] • Annotation 2: Big Bear doesn’t care for disposable diapers, [which aren’t biodegradable.] Yet [parents demand them.]

Assigning roles to the arguments For verbs • In terms of general roles such as agent, theme, goal, instrument, … • In terms of word-specific roles: He wouldn’t accept anything of value from those he was writing about • REL: accept • Arg 0: acceptor • Arg 1: thing accepted • Arg 2: accepted-from • Prague Dependency Treebank (PDT) (1998, 2001), Framenet (2000, 2002), Propbank (2002, 2003)

Assigning “roles” to the arguments of a connective • In terms of general roles -- ??? • In terms of connective-specific “roles”

Roles of arguments of “if” (conditional) if (hypothetical) If John studies hard he will pass the examination REL: if (hypothetical) ARG 0: (Truth condition) circumstances which make ARG 1 true ARG 1: (Assertion) expresses assertion

Roles of arguments of “if” (relevance conditional) if (relevance) If you are thirsty, there is beer in the fridge REL: if (relevance conditional) ARG 0: (Relevance condition) circumstances in which ARG 1 is relevant ARG 1: (Assertion) expresses assertion

Roles of arguments of “if” (factual conditional) if (factual) If Bill is so unhappy here, he should leave REL: if (factual conditional) ARG 0: (Factual condition) someone other than the speaker believes that ARG 0 is true, and ARG 0 justifies ARG 1. ARG 1: (Conditional assertion) expresses assertion

Some possible new senses for if [It will be at their peril] if [Americans allow another happening like the degrading Bork confirmation circus] ARG 1: it will … peril ARG 2: Americans … circus ARG 1 makes reference to ARG 2 If here is not hypothetical conditional but it is just a way of making an assertion, much like hypothetical relevance conditional but not quite like it.

Some possible new senses for if [Don’t leave home without the American Express card] if [you’d really rather have a Buick.] “If” here is more like the hypothetical relevance conditional but not quite like it.

Some possible senses for while [Under Chapter 11, a company operates under protection from creditors’ lawsuits] while [it works out a plan to pay its debts.] ARG 1: Under … lawsuits ARG 2: it works … debts Con: while Sense: Temporal • [Some will likely be offered severance packages] while [others will be transferred to overseas operations.] Sense: Concessive

Some possible senses for while [Each company remains independent] while [working together to market and sell their products.] Sense: Temporal/Concessive • While [the insurance index fell 3.56 to 528.56,] [the Nasdaq bank index fell 5.00 to 432.61.] Sense: ? Comparison, but no real contrast

Some possible senses for since, when, … • Senses for since: Temporal, Causal, Temporal/Causal • Senses for when: Temporal, Causal, Temporal/Causal

Since • Temporal (T) – She hasn’t played any music since the earthquake hit. • Causal (C) – Since the budget measures cash flow, a new $1 direct loan is treated as a $1 expenditure. • Temporal/Causal (T/C) – … and domestic car sales have plunged 19% since the Big Three ended many of their programs Sept 30.

While • Temporal (T) – A nurse contracted the virus while injecting an AIDS patient • Concession (Con) – The basket product, while it has got off to a slow start, is being supported by some firms. • Opposition (Opp) – … one ex-player claims he received $4000 to $5000 for his season football tickets while others said theirs brought only a few hundred dollars.

When • Temporal – The San Francisco earthquake hit when resources in the field already were stretched • Temporal/Causal – When the Trinity Repertory Theatre named Anne Bogart as its artistic director last spring, the nation’s theatrical cognoscenti arched a collective eyebrow
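The sense labels listed for “since”, “while”, and “when” can be collected into a small lookup table, for example to sanity-check sense labels during annotation. This is a sketch only: the table name and helper function are hypothetical, the labels are the ones used on the preceding slides, and the table is not a complete inventory of PDTB senses.

```python
# Sense labels as they appear on the "Since", "While", and "When" slides.
# "when" follows the earlier summary slide, which also lists a Causal sense.
SENSES = {
    "since": {"Temporal", "Causal", "Temporal/Causal"},
    "while": {"Temporal", "Concession", "Opposition"},
    "when": {"Temporal", "Causal", "Temporal/Causal"},
}


def is_listed_sense(connective: str, sense: str) -> bool:
    """Return True if the sense is listed for the connective in this sketch table."""
    return sense in SENSES.get(connective.lower(), set())


print(is_listed_sense("Since", "Causal"))   # True
print(is_listed_sense("while", "Causal"))   # False
```

A connective that is missing from the table (for example “instead”) simply has no listed senses here; that reflects the limits of this sketch, not a claim about the connective.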

Summary • Expected date of release – April 2006 -- all explicit connectives (adjudicated) -- all implicit connectives but only about 50% adjudicated -- some annotation senses • All connectives, all senses, and some experimental results – December 2006

Summary • Boundary between sentence and discourse • Flexible • Discourse connectives sit at this boundary • Similarities and differences between sentence structure and local discourse structure • Properties of discourse connectives • Arguments of connectives, arity is 2 • Extent of the arguments and their semantics • Annotations of attributions -- Mismatch between syntax and discourse • Sense annotation: new opportunities • Multiple annotations: implications