Introduction to Natural Language Processing
ChengXiang Zhai
Department of Computer Science, University of Illinois, Urbana-Champaign

Lecture Plan
• What is NLP?
• A brief history of NLP
• The current state of the art
• NLP and text information systems

What is NLP?
Thai: … (a Thai-script string) …
How can a computer make sense out of this string?
• What are the basic units of meaning (words), and what is the meaning of each word? (Morphology)
• How are words related with each other? (Syntax)
• What is the “combined meaning” of words? (Semantics)
• What is the “meta-meaning”? (speech act) (Pragmatics)
• Handling a large chunk of text (Discourse)
• Making sense of everything (Inference)

An Example of NLP
Sentence: A dog is chasing a boy on the playground
• Lexical analysis (part-of-speech tagging): A/Det dog/Noun is/Aux chasing/Verb a/Det boy/Noun on/Prep the/Det playground/Noun
• Syntactic analysis (parsing): noun phrases, a complex verb, and a prepositional phrase combine into verb phrases and, finally, a sentence
• Semantic analysis: Dog(d1). Boy(b1). Playground(p1). Chasing(d1, b1, p1).
• Inference: adding the rule Scared(x) if Chasing(_, x, _) lets us conclude Scared(b1)
• Pragmatic analysis (speech act): a person saying this may be reminding another person to get the dog back…
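The semantic analysis and inference steps above can be made concrete with a small sketch. The following is a minimal illustration, not from the slides: it assumes a toy set-of-tuples representation for the extracted facts and a hand-written function (infer_scared) for the single rule Scared(x) if Chasing(_, x, _).

```python
# A minimal sketch, not from the slides: facts produced by semantic analysis of
# "A dog is chasing a boy on the playground", plus one hand-coded inference rule.
facts = {
    ("Dog", "d1"),
    ("Boy", "b1"),
    ("Playground", "p1"),
    ("Chasing", "d1", "b1", "p1"),   # Chasing(chaser, chased, location)
}

def infer_scared(facts):
    """Apply Scared(x) if Chasing(_, x, _): whoever is being chased is scared."""
    inferred = set()
    for fact in facts:
        if fact[0] == "Chasing":
            _, _chaser, chased, _location = fact
            inferred.add(("Scared", chased))
    return inferred

print(infer_scared(facts))   # {('Scared', 'b1')}  ->  the boy is scared
```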

If we can do this for all the sentences, then …
BAD NEWS: Unfortunately, we can’t. General NLP = “AI-Complete”

NLP is Difficult!!
• Natural language is designed to make human communication efficient. As a result,
  – we omit a lot of “common sense” knowledge, which we assume the hearer/reader possesses
  – we keep a lot of ambiguities, which we assume the hearer/reader knows how to resolve
• This makes EVERY step in NLP hard
  – Ambiguity is a “killer”!
  – Common sense reasoning is a prerequisite

Examples of Challenges
• Word-level ambiguity, e.g.:
  – “design” can be a noun or a verb (Ambiguous POS)
  – “root” has multiple meanings (Ambiguous sense)
• Syntactic ambiguity, e.g.:
  – “natural language processing” (Modification)
  – “A man saw a boy with a telescope.” (PP Attachment)
• Anaphora resolution: “John persuaded Bill to buy a TV for himself.” (himself = John or Bill?)
• Presupposition: “He has quit smoking.” implies that he smoked before.

Despite all the challenges, research in NLP has also made a lot of progress…

High-level History of NLP
• Early enthusiasm (1950’s): Machine Translation
  – Too ambitious
  – Bar-Hillel report (1960) concluded that fully-automatic high-quality translation could not be accomplished without knowledge (Dictionary + Encyclopedia)
• Less ambitious applications (late 1960’s & early 1970’s): limited success, failed to scale up
  – Speech recognition
  – Dialogue (ELIZA): shallow understanding
  – Inference and domain knowledge (SHRDLU = “block world”): deep understanding in a limited domain
  – Story understanding (late 1970’s & early 1980’s): knowledge representation
• Real-world evaluation (late 1970’s – now)
  – Large-scale evaluation of speech recognition, text retrieval, information extraction (1980 – now): robust component techniques
  – Statistical approaches enjoy more success (first in speech recognition & retrieval, later in other areas): statistical language models
  – Learning-based NLP: heavy use of machine learning techniques
• Current trend:
  – The boundary between statistical and symbolic approaches is disappearing; we need to use all the available knowledge
  – Application-driven NLP research (bioinformatics, Web, question answering, …)

The State of the Art
For a sentence like “A dog is chasing a boy on the playground”:
• POS tagging: 97%
• Parsing: partial, >90% (?)
• Semantics: only some aspects
  – Entity/relation extraction
  – Word sense disambiguation
  – Anaphora resolution
• Speech act analysis: ???
• Inference: ???

Technique Showcase: POS Tagging
Training data (annotated text): This/Det sentence/N serves/V1 as/P an/Det example/N of/P annotated/V2 text/N …
POS tagger: given a new sentence “This is a new sentence” (w1 = “this”, w2 = “is”, …), consider all possible tag sequences (t1 = Det, t2 = Det, …; t1 = Det, t2 = Aux, …; t1 = V2, t2 = V2, …; and so on) and pick the one with the highest probability, here This/Det is/Aux a/Det new/Adj sentence/N.
• Method 1: Independent assignment – tag each word with its most common tag
• Method 2: Partial dependency – also condition each tag on its neighboring tags
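As an illustration, here is a minimal sketch, not from the slides, of Method 1: it learns the most common tag for each word from a tiny made-up training set shaped like the annotated sentence above, and falls back to a default tag for unseen words. Method 2 would additionally condition each tag on the previous one.

```python
# A minimal sketch, not from the slides, of Method 1 (independent assignment):
# tag each word with the tag it most often received in the training data.
from collections import Counter, defaultdict

# Toy training data in (word, tag) form, echoing the slide's annotated sentence.
training = [
    ("this", "Det"), ("sentence", "N"), ("serves", "V1"), ("as", "P"),
    ("an", "Det"), ("example", "N"), ("of", "P"), ("annotated", "V2"), ("text", "N"),
]

tag_counts = defaultdict(Counter)
for word, tag in training:
    tag_counts[word][tag] += 1

def most_common_tag(word, default="N"):
    """Return the tag most frequently seen with this word in training."""
    return tag_counts[word].most_common(1)[0][0] if word in tag_counts else default

sentence = "this is a new sentence".split()
print([(w, most_common_tag(w)) for w in sentence])
# Words unseen in training ("is", "a", "new") get the default tag, which is one
# reason Method 2 (conditioning on neighboring tags) does better.
```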

Technique Showcase: Parsing
Grammar (with rule probabilities):
  S → NP VP          1.0
  NP → Det BNP       0.3
  NP → BNP           0.4
  NP → NP PP         0.3
  BNP → N            …
  VP → V             …
  VP → Aux V NP      …
  VP → VP PP         …
  PP → P NP          1.0
Lexicon (with probabilities):
  V → chasing        0.01
  Aux → is           …
  N → dog            0.003
  N → boy, N → playground, Det → the, Det → a, P → on   …
Generate or choose the parse tree for “A dog is chasing a boy on the playground” with the highest probability; the probability of one such tree = 0.000015.
Can also be treated as a classification/decision problem…
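To make the scoring step concrete, here is a minimal sketch, not from the slides, of how the probability of one parse tree is computed under a PCFG: multiply the probabilities of all rules used in the tree. The probabilities the slide elides (“…”) are replaced by invented placeholder values, so the printed number is illustrative only.

```python
# A minimal sketch, not from the slides: the probability of a parse tree under a
# PCFG is the product of the probabilities of all rules used in the tree.
# Probabilities shown on the slide are kept; values marked "placeholder" stand in
# for the ones the slide elides and are invented for illustration only.
import math

rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "BNP")): 0.3,
    ("NP", ("BNP",)): 0.4,
    ("NP", ("NP", "PP")): 0.3,
    ("BNP", ("N",)): 1.0,              # placeholder
    ("VP", ("Aux", "V", "NP")): 0.4,   # placeholder
    ("VP", ("VP", "PP")): 0.2,         # placeholder
    ("PP", ("P", "NP")): 1.0,
    ("V", ("chasing",)): 0.01,
    ("Aux", ("is",)): 0.5,             # placeholder
    ("N", ("dog",)): 0.003,
    ("N", ("boy",)): 0.003,            # placeholder
    ("N", ("playground",)): 0.001,     # placeholder
    ("Det", ("a",)): 0.4,              # placeholder
    ("Det", ("the",)): 0.5,            # placeholder
    ("P", ("on",)): 1.0,               # placeholder
}

# A parse tree as nested tuples (label, child, ...); leaves are plain words.
tree = ("S",
        ("NP", ("Det", "a"), ("BNP", ("N", "dog"))),
        ("VP",
         ("VP", ("Aux", "is"), ("V", "chasing"),
                ("NP", ("Det", "a"), ("BNP", ("N", "boy")))),
         ("PP", ("P", "on"),
                ("NP", ("Det", "the"), ("BNP", ("N", "playground"))))))

def tree_prob(node):
    """Multiply this node's rule probability by the probabilities of its subtrees."""
    if isinstance(node, str):                    # a leaf word
        return 1.0
    label, *children = node
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return rule_prob[(label, rhs)] * math.prod(tree_prob(c) for c in children)

print(tree_prob(tree))   # a parser compares such scores across candidate trees
```

A real parser searches over all trees licensed by the grammar (e.g., with a chart algorithm such as CKY) rather than scoring a single hand-built tree.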

Semantic Analysis Techniques
• Only successful for VERY limited domains or for SOME aspects of semantics
• E.g.,
  – Entity extraction (e.g., recognizing a person’s name): use rules and/or machine learning
  – Word sense disambiguation: addressed as a classification problem with supervised learning
  – Sentiment tagging
  – Anaphora resolution
  …
In general, exploiting machine learning and statistical language models…
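To make the classification framing concrete, here is a minimal sketch, not from the slides, of word sense disambiguation for the ambiguous word “root” mentioned earlier, using a tiny Naive Bayes classifier over context words. The training sentences and sense labels are invented toy data.

```python
# A minimal sketch, not from the slides: word sense disambiguation as supervised
# classification with a tiny Naive Bayes model over context words (toy data).
from collections import Counter, defaultdict
import math

train = [
    ("the root of the tree absorbs water from the soil", "plant"),
    ("dig around the root before pulling the plant", "plant"),
    ("the square root of nine is three", "math"),
    ("take the cube root of the number", "math"),
]

sense_counts = Counter()
word_counts = defaultdict(Counter)
for text, sense in train:
    sense_counts[sense] += 1
    for w in text.split():
        word_counts[sense][w] += 1

def classify(text):
    """Pick the sense maximizing log P(sense) + sum log P(word|sense), add-one smoothed."""
    vocab = len({w for counter in word_counts.values() for w in counter})
    best, best_score = None, float("-inf")
    for sense in sense_counts:
        total = sum(word_counts[sense].values())
        score = math.log(sense_counts[sense] / sum(sense_counts.values()))
        for w in text.split():
            score += math.log((word_counts[sense][w] + 1) / (total + vocab))
        if score > best_score:
            best, best_score = sense, score
    return best

print(classify("the root of this plant grows in dry soil"))  # expected: "plant"
```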

What We Can’t Do
• 100% POS tagging
  – “He turned off the highway.” vs. “He turned off the fan.”
• General complete parsing
  – “A man saw a boy with a telescope.”
• Precise deep semantic analysis
  – Will we ever be able to precisely define the meaning of “own” in “John owns a restaurant.”?
Robust & general NLP tends to be “shallow”, while “deep” understanding doesn’t scale up.

Major NLP Applications
• Speech recognition: e.g., automatic telephone call routing
• Text management and analysis (our focus)
  – Text retrieval/filtering
  – Text classification
  – Text summarization
  – Text mining
  – Query answering
• Language tutoring
  – Spelling/grammar correction
• Machine translation
  – Cross-language retrieval
  – Restricted natural language
• Natural language user interface

NLP & Text Information Systems
Better NLP => Better text information systems
Bad NLP => Bad text information systems?
• Robust, shallow NLP tends to be more useful than fragile, deep NLP.
• Errors in NLP can hurt a text information system.

How Much NLP is Really Needed?
Tasks ordered by increasing dependency on NLP (the “easier” tasks have more “workarounds”):
Classification/Retrieval → Summarization/Extraction/Topic Mining → Translation/Dialogue → Question Answering

Workaround Example I: ELIZA
ELIZA plays the role of a therapist. How does it work? Heuristic rules based on keyword matching! E.g.,
  I remember X => Do you often think of X?
  always => Can you think of a specific example?
It keeps the dialogue going, but NEVER makes any claim!

Example dialogue:
  Person: Men are all alike.
  ELIZA: In what way?
  Person: They're always bugging us about something or other.
  ELIZA: Can you think of a specific example?
  Person: Well, my boyfriend made me come here.
  ELIZA: Your boyfriend made you come here?
  Person: He says I'm depressed much of the time.
  ELIZA: I am sorry to hear that you are depressed.
  Person: It's true. I am unhappy.
  ELIZA: Do you think coming here will help you not to be unhappy?
  Person: I need some help, that much seems certain.
  ELIZA: What would it mean to you if you got some help?
  Person: Perhaps I could learn to get along with my mother.
  ELIZA: Tell me more about your family.

A breakdown example:
  Person: Failure is the mother of success. (the keyword “mother” triggers a family response that misses the point)

No NLP, but useful. Perhaps we should call this NLP?
Statistical NLP often has a similar flavor, with “SOFT” rules LEARNED from data.
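A minimal sketch, not from the slides, of this kind of keyword matching: an ordered list of pattern/response rules, where the first match wins and a generic prompt keeps the dialogue going. The specific regular expressions and the fallback reply are illustrative assumptions, not Weizenbaum's original rules.

```python
# A minimal sketch, not from the slides, of ELIZA-style keyword matching:
# ordered (pattern -> response) rules; the first match wins, a default keeps
# the dialogue going. The first two rules echo those mentioned on the slide.
import re

RULES = [
    (re.compile(r"\bI remember (.+)", re.IGNORECASE),
     lambda m: f"Do you often think of {m.group(1)}?"),
    (re.compile(r"\balways\b", re.IGNORECASE),
     lambda m: "Can you think of a specific example?"),
    (re.compile(r"\bmother\b|\bfather\b|\bfamily\b", re.IGNORECASE),
     lambda m: "Tell me more about your family."),
]

def respond(utterance):
    """Return the response of the first matching rule, or a generic prompt."""
    for pattern, make_response in RULES:
        match = pattern.search(utterance)
        if match:
            return make_response(match)
    return "Please go on."

print(respond("They're always bugging us about something or other."))
print(respond("Failure is the mother of success."))  # breakdown: keyword, not meaning
```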

Workaround Example II: Statistical Translation
Learn how to translate Chinese to English from many example translations.
Intuitions:
• If we have seen all possible translations, then we simply look them up
• If we have seen a similar translation, then we can adapt it
• If we haven’t seen any example that’s similar, we try to generalize from what we’ve seen
All these intuitions are captured through a probabilistic model: in the noisy-channel view, an English speaker produces English words E with probability P(E), a noisy channel (the “translator”) turns E into the observed Chinese words C with probability P(C|E), and translating C back into English amounts to finding the E with the highest P(E|C).
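A minimal sketch, not from the slides, of noisy-channel decoding: by Bayes' rule, P(E|C) is proportional to P(E) · P(C|E), so we pick the candidate English sentence with the largest product. The candidate sentences and probabilities below are invented toy numbers.

```python
# A minimal sketch, not from the slides: noisy-channel decoding picks the English
# sentence E maximizing P(E|C), proportional to P(E) * P(C|E); P(C) is constant
# and ignored. All candidates and probabilities are invented toy numbers.

candidates = {
    # E: (P(E) from an English language model, P(C|E) from a translation model)
    "a dog is chasing a boy": (0.020, 0.30),
    "a dog chases a boy":     (0.015, 0.35),
    "dog boy chase":          (0.001, 0.40),
}

def decode(candidates):
    """Return the candidate E with the largest P(E) * P(C|E)."""
    return max(candidates, key=lambda e: candidates[e][0] * candidates[e][1])

print(decode(candidates))  # "a dog is chasing a boy" (0.020 * 0.30 = 0.0060 is largest)
```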

So, what NLP techniques are most useful for text information systems?
Statistical NLP in general, and statistical language models in particular.
The need for high robustness and efficiency implies the dominant use of simple models (i.e., unigram language models).
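To show what such a simple model looks like, here is a minimal sketch, not from the slides: a unigram language model estimated from one document and used to score queries by log-likelihood. The query-likelihood use and the smoothing floor are illustrative assumptions.

```python
# A minimal sketch, not from the slides: a unigram language model estimated from
# a document, used to score queries by log-likelihood (a common retrieval use).
from collections import Counter
import math

def unigram_lm(text):
    """Estimate P(w) by maximum likelihood from the word counts of the text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def query_log_likelihood(query, lm, floor=1e-6):
    """Sum log P(w) over query words; unseen words get a small floor probability."""
    return sum(math.log(lm.get(w, floor)) for w in query.lower().split())

doc = "a dog is chasing a boy on the playground the dog is fast"
lm = unigram_lm(doc)
print(query_log_likelihood("dog chasing boy", lm))       # higher: on-topic query
print(query_log_likelihood("stock market report", lm))   # much lower: off-topic query
```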

What You Should Know
• NLP is the foundation of text information systems
  – Better NLP enables better text management
  – Better NLP is necessary for sophisticated tasks
• But
  – Bad NLP doesn’t mean bad text information systems
  – There are often “workarounds” for a task
  – Inaccurate NLP can hurt the performance of a task
• The most effective NLP techniques are often statistical, with the help of linguistic knowledge
• The challenge is to bridge the gap between imperfect NLP and useful application functions