
Special Electives of Comp. Linguistics: Processing Anaphoric Expressions
Eleni Miltsakaki
AUTH, Fall 2005, Lecture 1

Let’s introduce ourselves
• Course: Special Electives of CL: Processing Anaphoric Expressions (Ling 2-342)
• Meeting times: Friday 11:00-14:00
• Office hours: Friday 9:30-11:00
• Prof: Eleni Miltsakaki
  – BA, Aristotle University -- English & American Lang. & Lit.
  – MA, University of Essex, UK -- Applied Linguistics
  – PhD, University of Pennsylvania, USA -- Theoretical and Computational Linguistics
• Email: elenimi@enl.auth.gr
  http://www.cis.upenn.edu/~elenimi
• Students: ?

Brief outline
• What is computational linguistics (CL)?
  – Why is it hard for computers to understand human languages?
  – What are some practical applications of CL?
• What will we do in this course?
  – What is anaphora and anaphora resolution?
  – Why is it hard?
• Tentative syllabus and course projects

What is Computational Linguistics?
• A discipline between Linguistics and Computer Science
  → Concerned with the computational aspects of human language processing
  → Has theoretical and applied components

Theoretical CL
• Formal theories about the linguistic knowledge that a human needs for generating and understanding language
• Simulation of aspects of the human language faculty and their implementation as computer programs
• Overlaps and collaborates with Theoretical Linguistics, Computer Science, Psycholinguistics

Applied CL
• Focuses on the practical outcome of modeling human language use
  – aka language engineering or human language technology
• Existing CL systems are far from achieving human ability, but there are numerous possible and useful applications
  – Question answering, summarization, translation, computer agents, educational applications, etc.

Why is language so difficult for a computer? AMBIGUITY!
• Natural languages are massively ambiguous at all levels of processing (but humans don’t even notice…)
• To resolve ambiguity, humans employ not only a detailed knowledge of the language (sounds, phonological rules, grammar, lexicon, etc.) but also:
  – Detailed knowledge of the world (e.g. knowing that apples can have bruises but not smiles, or that snow falls but London does not).
  – The ability to follow a 'story', by connecting up sentences to form a continuous whole, inferring missing parts.
  – The ability to infer what a speaker meant, even if he/she did not actually say it.
• It is these factors that make natural languages so difficult to process by computer, but also so fascinating to study.

Syntactic ambiguity
• I saw her duck
• The man closed the door with a bang
• The man closed the door with the black and white stripes
• I saw the man with the telescope
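One way to make the attachment ambiguity concrete is to count the analyses a grammar assigns to a sentence. The sketch below is not part of the lecture: it assumes a hypothetical toy grammar in Chomsky normal form (the rules and lexicon are invented only to cover "I saw the man with the telescope") and uses a CKY-style chart to count parses.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (assumption, for illustration).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP modifies the verb phrase (seeing with a telescope)
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP modifies the noun phrase (the man who has a telescope)
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, root="S"):
    """Return how many distinct parse trees the toy grammar gives `words`."""
    n = len(words)
    # chart[(start, end, category)] = number of analyses of words[start:end]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1, category)] += 1
    # Build longer spans from pairs of adjacent shorter spans.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, root)]


print(count_parses("I saw the man with the telescope".split()))  # prints 2
print(count_parses("I saw the man".split()))                     # prints 1
```

The two analyses of the longer sentence differ only in whether the prepositional phrase attaches to the verb phrase or to the noun phrase. The grammar alone cannot choose between them, which is the point of the slide.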

Semantic ambiguity
• The man went over to the bank
• Mary loved Bill. Mary loved potato chips.
• Water runs down the hill. The road runs down the hill.

Phonological ambiguity
• Within words
  – Input, intake, income
  – Imput, intake, iNcome (N=ng)
• Across word boundaries
  – When playing football, watch the referee
  – When talking about other people, watch who’s listening
  – When catching a hard ball, wear gloves
• Homophones
  – I’m a writer and I write books
  – I’m a rider and I write books


Discourse
• Anaphora
  – London had snow yesterday
    • It also had fog
    • It fell to a depth of one meter
    • It will continue cold today
• Speaker intentions
  – Can you swim?
  – Can you tell me the time?
  – Can you pass the salt?
• Inference
  – You shouldn’t lend John any books. He never returns them.

Language technology
• ALICE the chatbot
  – http://www.alicebot.org/
• Jabberwacky
  – http://www.jabberwacky.com/
• USC demo for learning Arabic
  – http://www.isi.edu/%7Ejmoore/Mankin.TLWeb.mov

Anaphoric uses of pronouns
• Bound variables: Non-referring
• Referential pronouns: Reference to a contextually salient individual
  – Deixis
  – Co-reference

Bound variables
• Non-referring pronouns
  – This type of pronoun does not refer to an individual
  – Non-referring pronouns are interpreted according to rules in the grammar
(1) Every man put a screen in front of him.

Referential pronouns (1)
• Deictic (uttered immediately after a certain man left the room)
(2) I’m glad he’s gone!

Referential pronouns (2)
• Coreference
(3) I don’t think anybody here is interested in Smith’s work. He should not be invited.
(4) Most accidents that Mary reported were caused by her cat.

• In this course we will focus on understanding how we interpret referential pronouns

Basic theoretical models
• Structural focusing (Grosz, Joshi & Weinstein, 1983/1995)
  – Centering: relating discourse structure, discourse coherence and choice of referring expression.
(11) John helped George wash the car.
(12) He washed the windows and George waxed the car.
(13) He soaped a pane/#He buffed the hood
• Semantic/pragmatic focusing (Stevenson et al, 1994/2000)
  – Verbs and connectives have focusing properties
(14) John criticized Bill because he failed to correct his faults
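To see how Centering separates the two continuations in (13), the sketch below applies a simplified version of the standard definitions to examples (11)-(13). It is an illustration under stated assumptions, not the lecture's own implementation: each utterance is hand-coded as a list of entities ranked by grammatical role (subject first), pronouns are already replaced by the referent under consideration, and only three transition types are distinguished. The function names are hypothetical.

```python
def backward_center(prev_cf, cur_cf):
    """Cb: the highest-ranked entity of the previous utterance's forward
    centers (Cf) that is also realized in the current utterance."""
    for entity in prev_cf:
        if entity in cur_cf:
            return entity
    return None


def transition(prev_cb, cur_cb, cur_cf):
    """Classify the transition into the current utterance (simplified)."""
    if cur_cb is None:
        return "NO-CB"
    preferred = cur_cf[0]  # Cp: the highest-ranked forward center
    # An undefined previous Cb is treated as compatible (a common convention).
    same_cb = prev_cb is None or cur_cb == prev_cb
    if same_cb and cur_cb == preferred:
        return "CONTINUE"
    if same_cb:
        return "RETAIN"
    return "SHIFT"


def analyze(discourse):
    """Return a (Cb, transition) pair for every utterance in the discourse."""
    results = []
    prev_cf, prev_cb = None, None
    for cf in discourse:
        if prev_cf is None:
            cb, label = None, None  # the first utterance has no Cb
        else:
            cb = backward_center(prev_cf, cf)
            label = transition(prev_cb, cb, cf)
        results.append((cb, label))
        prev_cf, prev_cb = cf, cb
    return results


# Hand-coded Cf lists (assumption): subject first, then the other entities.
u11 = ["John", "George", "car"]
u12 = ["John", "windows", "George", "car"]  # "He" read as John
u13_pane = ["John", "pane"]                 # "He soaped a pane", He = John
u13_hood = ["George", "hood"]               # "He buffed the hood", He = George

print(analyze([u11, u12, u13_pane])[-1])  # ('John', 'CONTINUE')
print(analyze([u11, u12, u13_hood])[-1])  # ('George', 'SHIFT')
```

Under these assumptions, reading "He" as John keeps the same center (CONTINUE), while reading it as George forces a SHIFT. The preference for the more coherent transition is one way to account for the oddness marked by # in (13).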

Challenges
1. Max is waiting for Fred.
2. He invited him for dinner. (Brennan et al, 1987)
3. Dodge was robbed by an ex-convict. The ex-convict tied him up because he wasn't cooperating. Then he took all the money and ran. (Suri et al, 1999)

…continued
7. John criticized Bill, so he tried to correct the fault.
8. Bill was criticized by John so he tried to correct the fault.
9. John criticized Bill. Next, he insulted Susan. (Stevenson et al, 2000)
10. Max despises Ross
  a. He always gives Ross a hard time. (easy)
  b. He always gives Max a hard time. (hard)
(D’Zmura and Tanenhaus, 1998)
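A quick way to see why these examples are challenging is to try the simplest possible resolver: keep every preceding mention that agrees with the pronoun in gender and number. The sketch below is a hypothetical baseline, not a system discussed in the lecture; the `Mention` class, the pronoun table, and the gender and number annotations are all assumptions entered by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    gender: str  # "m", "f" or "n" (hand-annotated assumption)
    number: str  # "sg" or "pl"


# Small illustrative pronoun table: pronoun -> (gender, number).
# A gender of None means the pronoun does not constrain gender.
PRONOUNS = {
    "he": ("m", "sg"),
    "him": ("m", "sg"),
    "she": ("f", "sg"),
    "her": ("f", "sg"),
    "it": ("n", "sg"),
    "they": (None, "pl"),
    "them": (None, "pl"),
}


def compatible_antecedents(pronoun, mentions):
    """Return the mentions that agree with `pronoun` in gender and number."""
    gender, number = PRONOUNS[pronoun.lower()]
    return [
        m.text
        for m in mentions
        if m.number == number and (gender is None or m.gender == gender)
    ]


# "Max is waiting for Fred. He invited him for dinner."
max_fred = [Mention("Max", "m", "sg"), Mention("Fred", "m", "sg")]
print(compatible_antecedents("He", max_fred))   # ['Max', 'Fred']

# "John criticized Bill, so he tried to correct the fault."
john_bill = [Mention("John", "m", "sg"), Mention("Bill", "m", "sg")]
print(compatible_antecedents("he", john_bill))  # ['John', 'Bill']
```

In both cases agreement leaves two candidates, so the choice has to come from somewhere else: grammatical role, discourse structure, or the meaning of the verb and connective, which are the factors the theories above try to model.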

• How can we find out what people do when assigning the correct interpretation to pronouns?
• Are there cross-linguistic differences?

Some methodological approaches
• Corpus-based investigation
• Experimental
• Statistical

Tentative syllabus
• Theories of pronoun interpretation (6)
  – Linguistic/Cross-linguistic
  – Computational
  – Psycholinguistic
• Background readings for student projects (2)
• Lab work (2)
• Current systems for anaphora resolution (2)
• Review (1)

Course projects
• You can pick either a corpus-based or experimental method to investigate some aspects of pronoun resolution in English or Greek or both

Evaluation
• 3 tests/homeworks (30%)
• Mid-term exam (30%)
• Course project (40%)