Lecture 1: Introduction
Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16
CS 6501 – Natural Language Processing

Announcements
• Waiting list: Start attending the first few meetings of the class as if you are registered. Some students will drop the class, so some space will free up.
• We will use Piazza as an online discussion platform. Please enroll.

Staff
• Instructor: Kai-Wei Chang
  • Email: nlp16@kwchang.net
  • Office: R412 Rice Hall
  • Office hour: 2:00 – 3:00, Tue (after class)
  • Additional office hour: 3:00 – 4:00, Thu
• TA: Wasi Ahmad
  • Email: wua4nw@virginia.edu
  • Office: R432 Rice Hall
  • Office hour: 4:00 – 5:00, Mon

This lecture
• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

What is NLP?
• Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.

Go beyond keyword matching
• Identify the structure and meaning of words, sentences, texts, and conversations
• Deep understanding of broad language
• NLP is all around us
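A minimal sketch of why keyword matching alone falls short. The keyword list and the two sentences are hypothetical and chosen only for illustration; they are not from the lecture. A matcher that only looks for a positive word cannot tell a plain statement from its negation.

```python
# Hypothetical keyword list and sentences, chosen only for illustration.
POSITIVE_KEYWORDS = {"good", "great", "excellent"}


def keyword_is_positive(sentence: str) -> bool:
    """Return True if any positive keyword appears in the sentence."""
    tokens = sentence.lower().replace(".", " ").replace(",", " ").split()
    return any(tok in POSITIVE_KEYWORDS for tok in tokens)


examples = [
    "The movie was good.",
    "The movie was not good at all.",
]
for s in examples:
    # Both sentences contain "good", so both are flagged as positive,
    # even though the second one is negated.
    print(s, "->", keyword_is_positive(s))
```

Handling the second sentence correctly requires looking at structure (here, the scope of "not"), which is the point of going beyond keywords.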

Machine translation
Facebook translation, image credit: Meedan.org

Statistical machine translation
Image credit: Julia Hockenmaier, Intro to NLP

Dialog Systems

Sentiment/Opinion Analysis

Text Classification
www.wired.com
• Other applications?

Question answering
'Watson' computer wins at 'Jeopardy'
Credit: ifunny.com

Question answering
• Go beyond search

Natural language instruction
https://youtu.be/KkOCeAtKHIc?t=1m28s

Digital personal assistant
More on natural language instruction
Credit: techspot.com
• Semantic parsing – understand tasks
• Entity linking – “my wife” = “Kellie” in the phone book

Information Extraction
• Unstructured text to database entries
Yoav Artzi: Natural language processing

Language Comprehension
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book.
• Q: Who wrote Winnie the Pooh?
• Q: Where did Chris live?

What will you learn from this course
• The NLP pipeline
• Key components for understanding text
• NLP systems/applications
• Current techniques & limitations
• Build realistic NLP tools

What’s not covered by this course
• Speech recognition – no signal processing
• Natural language generation
• Details of ML algorithms / theory
• Text mining / information retrieval

This lecture
• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

Overview
• New course, first time being offered
  • Comments are welcome
• Aimed at first- or second-year Ph.D. students
• Lecture + seminar
• No course prerequisites, but I assume
  • programming experience (for the final project)
  • basics of probability, calculus, and linear algebra (HW 0)

Grading
• No exam & HW -- hooray
• Lectures & forum
  • Participate in discussion (additional credits)
• Review quizzes (25%): 3 quizzes
• Critical review report (10%)
• Paper presentation (15%)
• Final project (50%)

Quizzes
• Format
  • Multiple choice questions
  • Fill-in-the-blank
  • Short answer questions
• Each quiz: ~20 min in class
• Schedule: see course website
• Closed book, closed notes, closed laptop

Critical review report
• 1 page maximum
• Pick one paper from the suggested list
• Summarize the paper (use your own words)
• Provide detailed comments
  • What can be improved
  • Potential future directions
  • Other related work
• Some students will be selected to present their critical reviews

Paper presentation
• Each group has 2~3 students
• Pick one paper from the suggested readings, or your favorite paper
  • Cannot be the same as the critical review report
  • Can be related to your final project
  • Register your choice early
• 15 min presentation + 2 min Q&A
• Will be graded by the instructor, the TA, and other students

Final Project
• Work in groups (2~3 students)
• Project proposal
  • Written report, 2 page maximum
• Project report (35%)
  • < 8 pages, ACL format
  • Due 2 days before the final presentation
• Project presentation (15%)
  • 5-min in-class presentation (tentative)

Late Policy
• Credit of 48 hours for all the assignments
  • Including proposal and final project
  • No accumulation
• No more grace period
• No make-up exam
  • unless there is an emergency

Cheating/Plagiarism
• No. Ask if you have concerns
• UVA Honor Code: http://www.virginia.edu/honor/

Lectures and office hours
• Participation is highly appreciated!
  • Ask questions if you are still confused
  • Feedback is welcome
  • Lead the discussion in this class
• Enroll in Piazza: https://piazza.com/virginia/fall2016/cs6501004

Topics of this class
• Fundamental NLP problems
• Machine learning & statistical approaches for NLP
• NLP applications
• Recent trends in NLP

What to Read?
• Natural Language Processing
  • ACL, NAACL, EMNLP, CoNLL, Coling, TACL
  • aclweb.org/anthology
• Machine learning
  • ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ
• Artificial Intelligence
  • AAAI, IJCAI, UAI, JAIR

Questions?

This lecture
• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

Challenges – ambiguity
• Word sense ambiguity

Challenges – ambiguity
• Word sense / meaning ambiguity
Credit: http://stuffsirisaid.com

Challenges – ambiguity
• PP attachment ambiguity
Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711
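PP attachment ambiguity can be made concrete by counting parse trees. The sketch below is a small CKY-style counter over a toy grammar; the grammar, the lexicon, and the example sentence are assumptions for illustration and do not come from the slide. The prepositional phrase can attach either to the verb phrase or to the noun phrase, so the sentence gets two trees.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption for illustration).
# Each rule is (parent, left_child, right_child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]
# Hypothetical lexicon mapping each word to its possible categories.
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(tokens, root="S"):
    """Count the distinct parse trees of `tokens` rooted at `root`."""
    n = len(tokens)
    # chart[(i, j)][label] = number of trees with that label over tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, tok in enumerate(tokens):
        for label in LEXICON[tok]:
            chart[(i, i + 1)][label] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][root]


print(count_parses("I saw the man with a telescope".split()))
```

Under this toy grammar the count is 2: one tree where the seeing is done with the telescope, and one where the man has the telescope. The count grows quickly as more prepositional phrases are added.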

Challenges – ambiguity
• Ambiguous headlines:
  • Include your children when baking cookies
  • Local High School Dropouts Cut in Half
  • Hospitals are Sued by 7 Foot Doctors
  • Iraqi Head Seeks Arms
  • Safety Experts Say School Bus Passengers Should Be Belted
  • Teacher Strikes Idle Kids

Challenges – ambiguity
• Pronoun reference ambiguity
Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes

Challenges – language is not static
• Language grows and changes
• e.g., cyber lingo
  • LOL: Laugh out loud
  • G2G: Got to go
  • BFN, B4N: Bye for now
  • Idk: I don’t know
  • FWIW: For what it’s worth
  • LUWAMH: Love you with all my heart
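A minimal sketch of one common way to cope with such abbreviations: a lookup table applied before further processing. The table uses the abbreviations and expansions listed on the slide; pairing B4N with "Bye for now" is an assumption, since the slide lists seven abbreviations but only six expansions. The function name `expand_lingo` is hypothetical.

```python
import re

# Abbreviations from the slide, keyed in lowercase.
# Mapping "b4n" to "Bye for now" is an assumption.
CYBER_LINGO = {
    "lol": "Laugh out loud",
    "g2g": "Got to go",
    "bfn": "Bye for now",
    "b4n": "Bye for now",
    "idk": "I don't know",
    "fwiw": "For what it's worth",
    "luwamh": "Love you with all my heart",
}


def expand_lingo(text: str) -> str:
    """Replace known abbreviations with their expansions, case-insensitively."""
    def replace(match):
        word = match.group(0)
        return CYBER_LINGO.get(word.lower(), word)

    # Match runs of letters and digits so that "g2g" stays one token.
    return re.sub(r"[A-Za-z0-9]+", replace, text)


print(expand_lingo("idk, g2g. BFN!"))
```

A fixed table like this goes stale as new abbreviations appear, which is exactly the challenge the slide points at.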

Challenges – language is compositional
Carefully Slide

Challenges – language is compositional
• 小心: Carefully, Careful, Take Care, Caution
• 地滑: Slide, Landslip, Wet Floor, Smooth
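The sketch below enumerates every word-by-word rendering that the glosses on this slide allow. The gloss lists are taken from the slide; the segmentation into 小心 + 地滑 and the function name are assumptions for illustration. With four options per segment there are 16 candidates, and nothing in a word-by-word approach says which one is right.

```python
from itertools import product

# Candidate glosses per segment, as listed on the slide.
GLOSSES = {
    "小心": ["Carefully", "Careful", "Take Care", "Caution"],
    "地滑": ["Slide", "Landslip", "Wet Floor", "Smooth"],
}


def word_by_word_candidates(segments):
    """Return every combination of one gloss per segment, joined by spaces."""
    options = [GLOSSES[s] for s in segments]
    return [" ".join(choice) for choice in product(*options)]


candidates = word_by_word_candidates(["小心", "地滑"])
print(len(candidates))
for c in candidates:
    print(c)
```

Both "Carefully Slide" and "Caution Wet Floor" appear in the output; choosing between them requires knowing how the parts combine in context.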

Challenges – scale
• Examples (sizes in words):
  • Bible (King James version): ~700K
  • Penn Treebank: ~1M, from the Wall Street Journal
  • Newswire collection: 500M+
  • Wikipedia: 2.9 billion words (English)
  • Web: several billions of words

This lecture
• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

Part of speech tagging

Syntactic (Constituency) parsing

Syntactic structure => meaning
Image credit: Julia Hockenmaier, Intro to NLP

Dependency Parsing

Semantic analysis
• Word sense disambiguation
• Semantic role labeling
Credit: Ivan Titov

Q: [Chris] = [Mr. Robin]?
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book.
Slide modified from Dan Roth

Co-reference Resolution
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book.
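The output of co-reference resolution is often represented as clusters of mentions that refer to the same entity. The sketch below writes down one hand-labeled reading of the passage; the cluster assignments are an assumption for illustration (they follow from reading "his father" and "Mr. Robin" as the same person), not the output of any system, and the names `CLUSTERS` and `same_entity` are hypothetical.

```python
# Hand-labeled mention clusters for the passage.
# This is an illustrative assumption, not system output.
CLUSTERS = {
    "christopher_robin": ["Christopher Robin", "He", "Chris", "him"],
    "father": ["his father", "Mr. Robin"],
}


def same_entity(mention_a, mention_b, clusters=CLUSTERS):
    """Return True if both mentions are listed in the same cluster."""
    return any(
        mention_a in mentions and mention_b in mentions
        for mentions in clusters.values()
    )


print(same_entity("Chris", "Mr. Robin"))
print(same_entity("his father", "Mr. Robin"))
```

Under this reading, [Chris] and [Mr. Robin] fall in different clusters, so a system that merges them because they share the name "Robin" would answer "who wrote Winnie the Pooh?" incorrectly.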

Questions?