Semi-Supervised Learning for Semantic Parsing using Support Vector Machines

Slides: 20

Semi-Supervised Learning for Semantic Parsing using Support Vector Machines
Rohit J. Kate, Raymond J. Mooney
Machine Learning Group, Department of Computer Sciences, University of Texas at Austin


Outline
• Brief Background
  – Semantic Parsing
  – KRISP: A Supervised Learning System
  – Transductive SVMs
• Semi-Supervised Semantic Parsing
• Experiments
• Conclusions


Semantic Parsing
• Transforming natural language (NL) sentences into computer-executable complete meaning representations (MRs) for some application.
• Geoquery: a database query application
  Query: Which rivers run through the states bordering Texas?
  MR (via semantic parsing): answer(traverse(next_to(stateid('texas'))))
  Answer: Arkansas, Canadian, Cimarron, Gila, Mississippi, Rio Grande, …


Meaning Representation Language
MR: answer(traverse(next_to(stateid('texas'))))
[Figure: parse tree of the MR]
Productions:
ANSWER → answer(RIVER)
RIVER → TRAVERSE(STATE)
TRAVERSE → traverse
STATE → NEXT_TO(STATE)
NEXT_TO → next_to
STATE → STATEID
STATEID → 'texas'
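The nested functional form of the MR can be parsed mechanically. As a minimal sketch (ours, not part of KRISP), the following parses a Geoquery-style MR string into the nested tree the productions describe:

```python
# Sketch only: parse a functional MR string such as
# answer(traverse(next_to(stateid('texas')))) into (functor, [args]) tuples.

def parse_mr(s):
    """Parse a functional MR string into nested (functor, [args]) tuples."""
    s = s.strip()
    open_paren = s.find("(")
    if open_paren == -1 or not s.endswith(")"):
        return s  # a leaf constant such as 'texas'
    functor = s[:open_paren]
    inner = s[open_paren + 1:-1]
    # split top-level comma-separated arguments, tracking paren depth
    args, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(parse_mr(inner[start:i]))
            start = i + 1
    args.append(parse_mr(inner[start:]))
    return (functor, args)

tree = parse_mr("answer(traverse(next_to(stateid('texas'))))")
```

The resulting tuple mirrors the derivation tree: each functor corresponds to one production application.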


KRISP: Supervised Semantic Parser Learner
• KRISP: Kernel-based Robust Interpretation for Semantic Parsing [Kate & Mooney 2006]
• Takes NL sentences paired with their MRs as training data
• Treats the formal MR language grammar's productions as semantic concepts
• Trains an SVM classifier for each production with a string subsequence kernel [Lodhi et al. 2002]
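The subsequence kernel of Lodhi et al. (2002) scores two strings by their common (possibly non-contiguous) subsequences, with gaps penalized by a decay factor lam. A naive dynamic-programming sketch, for illustration only (KRISP's own implementation differs):

```python
# Naive O(n * |s| * |t|^2) gap-weighted subsequence kernel [Lodhi et al. 2002].

def ssk(s, t, n, lam=0.5):
    """K_n(s, t): weighted count of common length-n subsequences,
    with gaps penalized by the decay factor lam."""
    # Kp[i][a][b] = K'_i(s[:a], t[:b]), the auxiliary DP values
    Kp = [[[0.0] * (len(t) + 1) for _ in range(len(s) + 1)]
          for _ in range(n)]
    for a in range(len(s) + 1):
        for b in range(len(t) + 1):
            Kp[0][a][b] = 1.0
    for i in range(1, n):
        for a in range(1, len(s) + 1):
            for b in range(1, len(t) + 1):
                Kp[i][a][b] = lam * Kp[i][a - 1][b] + sum(
                    Kp[i - 1][a - 1][j] * lam ** (b - j + 1)
                    for j in range(b) if t[j] == s[a - 1])
    return sum(lam ** 2 * Kp[n - 1][a - 1][j]
               for a in range(1, len(s) + 1)
               for j in range(len(t)) if t[j] == s[a - 1])
```

For example, `ssk("cat", "car", 2, 0.5)` gives 0.0625 (= lam**4): the one common length-2 subsequence "ca" is contiguous in both strings, so each occurrence contributes lam**2.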


Semantic Parsing by KRISP
• The SVM classifier for each production gives the probability that a substring represents the semantic concept of the production.
[Figure: the classifier for NEXT_TO → next_to applied to substrings of "Which rivers run through the states bordering Texas?"; the relevant substring scores 0.95, unrelated substrings score 0.02 and 0.01]


Semantic Parsing by KRISP
• The SVM classifier for each production gives the probability that a substring represents the semantic concept of the production.
[Figure: the classifier for TRAVERSE → traverse applied to substrings of the same sentence, scoring 0.91 and 0.21]


Semantic Parsing by KRISP
• The most probable derivation is found efficiently by an extended version of Earley's parsing algorithm [Kate & Mooney 2006]
[Figure: derivation tree over "Which rivers run through the states bordering Texas?" with node probabilities:
ANSWER → answer(RIVER) 0.89
RIVER → TRAVERSE(STATE) 0.92
TRAVERSE → traverse 0.91
STATE → NEXT_TO(STATE) 0.81
NEXT_TO → next_to 0.95
STATE → STATEID 0.98
STATEID → 'texas' 0.99]
• The probability of the derivation is the product of the probabilities at the nodes.
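As a small worked example, the derivation probability for the node probabilities above is just their product:

```python
from math import prod

# Per-production SVM probabilities at the nodes of the derivation tree
# (values taken from the slide above).
node_probs = {
    "ANSWER -> answer(RIVER)":  0.89,
    "RIVER -> TRAVERSE(STATE)": 0.92,
    "TRAVERSE -> traverse":     0.91,
    "STATE -> NEXT_TO(STATE)":  0.81,
    "NEXT_TO -> next_to":       0.95,
    "STATE -> STATEID":         0.98,
    "STATEID -> 'texas'":       0.99,
}
derivation_prob = prod(node_probs.values())  # roughly 0.556
```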


SVM String Classifiers
• Positive and negative examples are collected from the training data and are iteratively refined [Kate & Mooney 2006]
[Figure: separating hyperplane for the production NEXT_TO → next_to.
Positive examples: "states that are next to", "the states next to", "states that border", "the states bordering", "states that share border".
Negative examples: "state with the capital of", "area larger than", "through which"]
• A probability estimate of an example belonging to a class can be obtained using its distance from the hyperplane [Platt, 1999]
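Platt's method maps the SVM decision value f(x) (signed distance from the hyperplane) to a probability through a fitted sigmoid. A sketch, where the sigmoid parameters A and B are illustrative stand-ins (they are normally fit on held-out data, and these values are not from the paper):

```python
import math

def platt_probability(f, A=-2.0, B=0.0):
    """Platt (1999): P(y = +1 | x) = 1 / (1 + exp(A * f + B)),
    where f is the SVM decision value. A and B here are illustrative."""
    return 1.0 / (1.0 + math.exp(A * f + B))
```

Points on the hyperplane (f = 0) get probability 0.5; points far on the positive side approach 1, and far on the negative side approach 0.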


Transductive SVMs
• Using unlabeled test examples during training can help find a better hyperplane [Joachims 1999]
[Figure: labeled positive/negative points and unlabeled points; taking the unlabeled points into account shifts the separating hyperplane]


Transductive SVMs contd.
• Find a labeling that separates all the examples with maximum margin
• Finding the exact solution is intractable, but approximation algorithms exist [Joachims 1999], [Chen et al. 2003], [Collobert et al. 2006]
• Transductive learning maximizes the accuracy on the unlabeled examples
• Can be used in a semi-supervised framework if the unlabeled examples come from the same distribution as the test examples [Bennett and Demiriz, 1999]
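The intuition behind maximizing the margin over labeled and unlabeled examples is that the boundary should pass through a low-density region of the data. A toy 1-D illustration of this idea (ours, not Joachims' label-switching algorithm): among boundaries consistent with the labeled points, pick the one in the widest gap of all points combined.

```python
# Toy 1-D sketch of the transductive idea: the unlabeled points pull the
# decision boundary into the widest low-density gap between the classes.

def transductive_boundary(pos, neg, unlabeled):
    """pos/neg: labeled 1-D points (assumes max(neg) < min(pos));
    unlabeled: unlabeled 1-D points. Returns the midpoint of the widest
    gap between consecutive points that still separates pos from neg."""
    lo, hi = max(neg), min(pos)              # feasible region for the boundary
    pts = sorted(p for p in pos + neg + unlabeled if lo <= p <= hi)
    pts = [lo] + pts + [hi]
    best_gap, boundary = -1.0, (lo + hi) / 2
    for a, b in zip(pts, pts[1:]):
        if b - a > best_gap:
            best_gap, boundary = b - a, (a + b) / 2
    return boundary

# Without unlabeled data the boundary sits at the midpoint 5.0; the
# unlabeled cluster near 2 pushes it into the empty region at 5.25.
b = transductive_boundary([9.0, 10.0], [0.0, 1.0], [1.5, 2.0, 2.5, 8.0, 8.5])
```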


Semi-Supervised Semantic Parsing
• Utilize NL sentences not annotated with their MRs, which are usually cheaply available
• KRISP can be turned into a semi-supervised learner if the SVM classifiers are given appropriate unlabeled examples
• Which substrings should be the unlabeled examples for which productions' SVMs?


SEMISUP-KRISP: Semi-Supervised Semantic Parser Learner
• First learns a semantic parser from the supervised data using KRISP


SEMISUP-KRISP: Semi-Supervised Semantic Parser Learner contd.
[Figure: system diagram. A supervised corpus of NL sentences paired with MRs, e.g.:
  Which rivers run through the states bordering Texas? answer(traverse(next_to(stateid('texas'))))
  What is the lowest point of the state with the largest area? answer(lowest(place(loc(largest_one(area(state(all)))))))
  What is the largest city in states that border California? answer(largest(city(loc(next_to(stateid('california'))))))
is used by KRISP to collect labeled examples and train the SVM classifiers. An unsupervised corpus of unannotated sentences, e.g.:
  Which states have a city named Springfield?
  What is the capital of the most populous state?
  How many rivers flow through Mississippi?
  How many states does the Mississippi run through?
  How high is the highest point in the smallest state?
  Which rivers flow through the states that border California?
is then processed by semantic parsing.]


SEMISUP-KRISP: Semi-Supervised Semantic Parser Learner
• First learns a semantic parser from the supervised data using KRISP
• Applies the learned parser to the unsupervised NL sentences
• Whenever an SVM classifier is called to estimate the probability of a substring, that substring becomes an unlabeled example for that classifier
• These substrings are representative of the examples that the classifiers will encounter during testing
[Figure: the classifiers for TRAVERSE → traverse and NEXT_TO → next_to queried on substrings of "Which rivers run through the states bordering Texas?"]
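This harvesting step can be sketched as a wrapper around each classifier call, recording every substring the parser asks about. The function names and the toy scorer below are ours, not KRISP's:

```python
from collections import defaultdict

# production -> set of substrings the parser queried (unlabeled examples)
unlabeled_examples = defaultdict(set)

def classify(production, substring, score):
    """Record the queried substring as an unlabeled example for this
    production's classifier, then return the classifier's probability."""
    unlabeled_examples[production].add(substring)
    return score(production, substring)

def score(production, substring):
    """Toy stand-in for a trained SVM classifier bank."""
    return 0.95 if (production == "NEXT_TO -> next_to"
                    and "bordering" in substring) else 0.1

sentence = "Which rivers run through the states bordering Texas?"
words = sentence.rstrip("?").split()
for prod in ("TRAVERSE -> traverse", "NEXT_TO -> next_to"):
    for i in range(len(words)):                 # query every contiguous
        for j in range(i + 1, len(words) + 1):  # substring of the sentence
            classify(prod, " ".join(words[i:j]), score)
```

After parsing the unsupervised corpus this way, each production's classifier has a pool of unlabeled examples drawn from exactly the distribution it will see at test time.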


SEMISUP-KRISP: Semi-Supervised Semantic Parser Learner contd.
[Figure: the same system diagram, extended with the semi-supervised step: semantic parsing of the unsupervised corpus collects unlabeled examples, which are combined with the labeled examples from the supervised corpus to train transductive SVM classifiers, yielding the learned semantic parser.]


Experiments
• Compared the performance of SEMISUP-KRISP and KRISP on the Geoquery domain
• The corpus contains 250 NL sentences annotated with their correct MRs
• Collected 1037 unannotated sentences from our web-based demo
• Evaluated by 10-fold cross validation, keeping the unsupervised data the same in each fold
• Increased the amount of supervised training data and measured the best F-measure

Results
[Figure: results plot]

Results contd.
• 25% saving in supervised training data
[Figure: results plot, with GEOBASE shown for comparison]
GEOBASE: hand-built semantic parser [Borland International, 1988]


Conclusions
• Presented a semi-supervised approach to semantic parsing
• Utilizes unannotated sentences by extracting unlabeled examples for the SVM classifiers
• Classifiers are retrained using transductive SVMs
• Improves the performance, particularly when the supervised data is limited