Introduction to Computational Linguistics Eleni Miltsakaki AUTH Spring

Introduction to Computational Linguistics
Eleni Miltsakaki
AUTH, Spring 2006, Lecture 7

What’s the plan for today?
• Discourse models (cont’d)
  – Rhetorical Structure Theory: http://www.sfu.ca/rst/

What is RST?
• A descriptive theory of discourse organization, characterizing text mostly in terms of relations that hold between parts of text.

History
• RST was developed as part of a project on computer-based generation of text by Bill Mann, Sandy Thompson and Christian Matthiessen
• RST is based on studies of carefully written text from a variety of sources
• RST is intended to describe texts (not processes of producing or understanding them)
• RST gives an account of coherence in text

Elements of RST
• Relations
• Schema applications
• Structures

Relations
• Relations hold between two non-overlapping text spans
  – Nucleus-satellite relations (the spans are denoted N and S)
  – Multi-nuclear relations
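The two kinds of relation on this slide can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `Span`, `NucleusSatellite` and `MultiNuclear` are invented for this sketch, spans are represented as inclusive ranges of unit numbers, and the non-overlap requirement is enforced at construction time.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Span:
    """A text span as an inclusive range of unit numbers (assumed encoding)."""
    start: int
    end: int

    def overlaps(self, other: "Span") -> bool:
        # Two inclusive ranges overlap when each starts before the other ends.
        return self.start <= other.end and other.start <= self.end


@dataclass(frozen=True)
class NucleusSatellite:
    """An asymmetric relation between a nucleus (N) and a satellite (S)."""
    name: str
    nucleus: Span
    satellite: Span

    def __post_init__(self) -> None:
        if self.nucleus.overlaps(self.satellite):
            raise ValueError("nucleus and satellite must not overlap")


@dataclass(frozen=True)
class MultiNuclear:
    """A relation in which every span is a nucleus."""
    name: str
    nuclei: Tuple[Span, ...]

    def __post_init__(self) -> None:
        if len(self.nuclei) < 2:
            raise ValueError("a multi-nuclear relation needs at least two nuclei")
        for i, first in enumerate(self.nuclei):
            for second in self.nuclei[i + 1:]:
                if first.overlaps(second):
                    raise ValueError("nuclei must not overlap")


# Units 2-3 as satellite of unit 1.
evidence = NucleusSatellite("EVIDENCE", nucleus=Span(1, 1), satellite=Span(2, 3))
```

Keeping the two relation kinds as separate classes makes the asymmetry explicit: only `NucleusSatellite` has a satellite slot.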

Example

RST tree

Definition of relations
• Constraints on nucleus
• Constraints on satellite
• Constraints on the combination of nucleus and satellite
• The effect

RST schemas
• Schemas define the structural constituency arrangement of text.

RST schema applications
• Unordered spans: the schemas do not constrain the order of nucleus or satellites in the text span in which the schema is applied
• Optional relations: for multi-relation schemas, all individual relations are optional, but at least one of the relations must hold
• Repeated relations: a relation that is part of a schema can be applied any number of times in the application of that schema
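The three conventions above can be read as a simple validity check. The following is a minimal sketch, assuming a schema is represented only by the set of relation names it allows; the function name `is_valid_application` and the two-relation schema used in the example are illustrative assumptions, not definitions taken from the slides.

```python
from typing import FrozenSet, Sequence


def is_valid_application(schema_relations: FrozenSet[str],
                         applied: Sequence[str]) -> bool:
    """Check a schema application against the three conventions.

    - Unordered spans: only relation names are inspected, never positions.
    - Optional relations: no single relation is required, but at least one
      relation of the schema must hold.
    - Repeated relations: the same relation may occur any number of times.
    """
    if not applied:
        return False  # at least one relation must hold
    return all(name in schema_relations for name in applied)


# Illustrative multi-relation schema (assumed for this example).
schema = frozenset({"MOTIVATION", "ENABLEMENT"})

print(is_valid_application(schema, ["MOTIVATION"]))
print(is_valid_application(schema, ["ENABLEMENT", "MOTIVATION", "MOTIVATION"]))
print(is_valid_application(schema, []))
```

The first two calls satisfy the conventions (one optional relation omitted, one relation repeated); the third fails because no relation holds.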

Basic RST relations

Evidence
• Relation name: EVIDENCE
• Constraints on N: R(eader) might not believe N to a degree satisfactory to W(riter)
• Constraints on S: R believes S or will find it credible
• Constraints on the N+S combination: R’s comprehending S increases R’s belief of N
• The effect: R’s belief of N is increased
• Locus of the effect: N
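A relation definition of this shape can be stored as a plain record with one field per bullet. The sketch below is an assumption about representation only: the class `RelationDefinition`, its field names and the helper `describe` are invented here, while the field values are copied from the EVIDENCE definition above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationDefinition:
    """One RST relation definition, one field per part of the definition."""
    name: str
    constraints_on_n: str
    constraints_on_s: str
    constraints_on_combination: str
    effect: str
    locus_of_effect: str


EVIDENCE = RelationDefinition(
    name="EVIDENCE",
    constraints_on_n="R might not believe N to a degree satisfactory to W",
    constraints_on_s="R believes S or will find it credible",
    constraints_on_combination="R's comprehending S increases R's belief of N",
    effect="R's belief of N is increased",
    locus_of_effect="N",
)


def describe(definition: RelationDefinition) -> str:
    """Render a definition in the same order as the slide."""
    return "\n".join([
        f"Relation name: {definition.name}",
        f"Constraints on N: {definition.constraints_on_n}",
        f"Constraints on S: {definition.constraints_on_s}",
        f"Constraints on N+S: {definition.constraints_on_combination}",
        f"The effect: {definition.effect}",
        f"Locus of the effect: {definition.locus_of_effect}",
    ])


print(describe(EVIDENCE))
```

The other relations in this lecture (JUSTIFY, ANTITHESIS, CONCESSION) would fit the same record, with "none" where a constraint is absent.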

Example
1. The program as published for calendar year 1980 really works.
2. In only a few minutes, I entered all the figures from my 1980 tax return
3. and got a result which agreed with my hand calculations to the penny.
2-3 EVIDENCE for 1

Justify
• Relation name: JUSTIFY
• Constraints on N: none
• Constraints on S: none
• Constraints on the N+S combination: R’s comprehending S increases R’s readiness to accept W’s right to present N
• The effect: R’s readiness to accept W’s right to present N is increased
• Locus of the effect: N

Antithesis
• Relation name: ANTITHESIS
• Constraints on N: W has positive regard for the situation presented in N
• Constraints on S: none
• Constraints on the N+S combination: the situations presented in N and S are in contrast. Because of the incompatibility that arises from the contrast, one cannot have positive regard for both situations; comprehending S and the incompatibility between the two situations increases R’s positive regard for the situation presented in N
• The effect: R’s positive regard for N is increased
• Locus of the effect: N

Concession
• Relation name: CONCESSION
• Constraints on N: W has positive regard for the situation presented in N
• Constraints on S: W is not claiming that the situation presented in S doesn’t hold
• Constraints on the N+S combination: W acknowledges a potential or apparent incompatibility between the situations presented in N and S; recognizing the incompatibility increases R’s positive regard for the situation presented in N
• The effect: R’s positive regard for the situation presented in N is increased
• Locus of the effect: N and S

Example
1. Concern that this material is harmful to health or the environment may be misplaced.
2. Although it is toxic to certain animals,
3. evidence is lacking that it has any serious long-term effect on human beings.
2 CONCESSION to 3
2-3 ELABORATION to 1
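The analysis of this example is a small tree: unit 2 is the satellite of unit 3 under CONCESSION, and the resulting span 2-3 is the satellite of unit 1 under ELABORATION. A minimal sketch of that tree follows; the classes `Unit` and `Relation` and the helpers `unit_numbers` and `render` are illustrative names chosen for this sketch, not part of any RST tool.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Unit:
    """A leaf of the tree: one numbered text unit."""
    number: int
    text: str


@dataclass(frozen=True)
class Relation:
    """An internal node: a nucleus-satellite relation over two subtrees."""
    name: str
    nucleus: "Node"
    satellite: "Node"


Node = Union[Unit, Relation]


def unit_numbers(node: Node) -> List[int]:
    """Collect the unit numbers covered by a subtree, in text order."""
    if isinstance(node, Unit):
        return [node.number]
    return sorted(unit_numbers(node.nucleus) + unit_numbers(node.satellite))


def render(node: Node, role: str = "root", depth: int = 0) -> str:
    """Print the tree with N/S roles, indented by depth."""
    pad = "  " * depth
    if isinstance(node, Unit):
        return f"{pad}{role}: [{node.number}] {node.text}"
    return "\n".join([
        f"{pad}{role}: {node.name}",
        render(node.nucleus, "N", depth + 1),
        render(node.satellite, "S", depth + 1),
    ])


u1 = Unit(1, "Concern that this material is harmful to health or the "
             "environment may be misplaced.")
u2 = Unit(2, "Although it is toxic to certain animals,")
u3 = Unit(3, "evidence is lacking that it has any serious long-term effect "
             "on human beings.")

concession = Relation("CONCESSION", nucleus=u3, satellite=u2)   # 2 CONCESSION to 3
tree = Relation("ELABORATION", nucleus=u1, satellite=concession)  # 2-3 ELABORATION to 1

print(render(tree))
```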

Span order

Distinctions among relations
• Subject matter (semantic)
  – Two parts of the text are understood as causally related in the subject matter
  – E.g. VOLITIONAL CAUSE
• Presentational (pragmatic)
  – Facilitate the presentation process
  – E.g. JUSTIFY

What is nuclearity?
• Relations are mostly asymmetric
  – E.g. if A is evidence for B, then B is not evidence for A
• Diagnostics for nuclearity
  – One member is independent of the other but not vice versa
  – One member is more suitable for substitution than the other. An EVIDENCE satellite can be replaced by entirely different evidence
  – One member is more essential to the writer’s purpose than the other
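One way to make the "more essential" diagnostic concrete is to drop satellites and see what remains. The sketch below follows nucleus links down a tree until it reaches a single unit. It is a toy reading of the diagnostic, not an established procedure from the slides; `Unit`, `Relation` and `most_nuclear_unit` are invented names, and the span 2-3 is kept as one leaf because its internal analysis is not given in the lecture.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Unit:
    """A leaf: a labelled stretch of text."""
    label: str
    text: str


@dataclass(frozen=True)
class Relation:
    """A nucleus-satellite relation over two subtrees."""
    name: str
    nucleus: "Node"
    satellite: "Node"


Node = Union[Unit, Relation]


def most_nuclear_unit(node: Node) -> Unit:
    """Discard every satellite on the way down and return the unit left."""
    while isinstance(node, Relation):
        node = node.nucleus
    return node


claim = Unit("1", "The program as published for calendar year 1980 really works.")
support = Unit("2-3", "In only a few minutes, I entered all the figures from my "
                      "1980 tax return and got a result which agreed with my "
                      "hand calculations to the penny.")

evidence = Relation("EVIDENCE", nucleus=claim, satellite=support)

print(most_nuclear_unit(evidence).text)
```

Here the surviving unit is the claim itself, which still makes sense on its own, whereas the satellite alone would leave the reader asking what it is evidence for.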

RST annotated corpus
• Released via LDC (Linguistic Data Consortium)
  – www.ldc.upenn.edu
• Information, samples of the corpus, and the RST annotation tool are available at
  – www.isi.edu/~marcu/discourse

RST-based discourse parsing
• “An unsupervised approach to recognizing discourse relations” (2002) by D. Marcu and A. Echihabi
• “The rhetorical parsing of unrestricted texts: A surface-based approach” (2000) by D. Marcu