Extracting Predicates from Semistructured and Unstructured Texts
- Slides: 14

Extracting Predicates from Semistructured and Unstructured Texts
Clint Tustison
BYU DEG
Funded in part by the NSF

Introduction
- Vast amount of electronic data
- Semi-structured
  - GEDCOM files (format for encoding genealogical information)
  - Clinical trials
- Unstructured
  - Newspaper headlines
  - Thematic discourse (Wall Street Journal articles)

Questions
- What current methods are employed for extracting electronic data?
- What is a workable solution for the representation of the extracted information?
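
Why worry about representation?
- Ambiguities abound
  - BYU panel discusses war with Iraq
  - Sisters reunited after 18 years in checkout counter
  - Everybody loves somebody
- Differentiate meanings of an utterance
[Figure: two readings of "Everybody loves somebody". In one, A, B, C, … are each paired with a different person (Mary, Fred, Mark, …); in the other, A, B, C, D, E are all paired with Fred.]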
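
The two readings of "Everybody loves somebody" differ only in quantifier order: for every x there is some y that x loves, versus there is one y that every x loves. A minimal sketch of that difference over a toy model follows; the people, the loved individuals, and the two `loves` relations are illustrative assumptions, not data from the talk.

```python
# Hypothetical toy model: who loves whom (names echo the slide's diagram).
people = ["A", "B", "C"]
loved = ["Mary", "Fred", "Mark"]

def everyone_loves_someone(loves):
    """Reading 1: for every x there exists some y with loves(x, y)."""
    return all(any((x, y) in loves for y in loved) for x in people)

def someone_loved_by_everyone(loves):
    """Reading 2: there exists one y such that every x has loves(x, y)."""
    return any(all((x, y) in loves for x in people) for y in loved)

# Each person loves a different individual.
distinct = {("A", "Mary"), ("B", "Fred"), ("C", "Mark")}
# Everybody loves the same individual.
shared = {("A", "Fred"), ("B", "Fred"), ("C", "Fred")}

print(everyone_loves_someone(distinct), someone_loved_by_everyone(distinct))  # True False
print(everyone_loves_someone(shared), someone_loved_by_everyone(shared))      # True True
```

The first relation satisfies only the weaker reading, which is why a representation has to record which reading was intended.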

Approach
- Tools
  - Link Grammar Parser
    - Provides a syntactic dependency parse
    - Semantics is interpretive (gets read from the syntax)
  - Predicate logic
    - Formal properties, allow for wide range of applications, usable crosslinguistically
    - Vocabulary, syntax, semantics
      - First-order: quantification over individuals (FOPC)
      - Higher-order: quantification over relations, etc.

Link Grammar Parser
- Sleator, Lafferty, Temperley
- Benefits
  - Written in C, very fast
  - Robust - ability to process (un)grammaticality / spelling errors
  - Free - http://www.link.cs.cmu.edu/link
  - Easily integrated

Link Grammar Parser
linkparser> the dog ate the food.

    +---------------Xp---------------+
    +-----Wd----+     +----Os----+   |
    |      +-Ds-+--Ss-+    +-D*u-+   |
    |      |    |     |    |     |   |
LEFT-WALL the dog.n ate.v the food.n .

ate(dog, food).
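
One way to read a predicate off such a parse is to pair the subject link (Ss) with the object link (Os) that share a verb. The sketch below assumes the links have already been collected into `(label, left word, right word)` tuples; that encoding and the helper names are hypothetical, not the parser's own API.

```python
# Hypothetical encoding of the parse above: (link label, left word, right word).
links = [
    ("Xp", "LEFT-WALL", "."),
    ("Wd", "LEFT-WALL", "dog.n"),
    ("Ds", "the", "dog.n"),
    ("Ss", "dog.n", "ate.v"),
    ("Os", "ate.v", "food.n"),
    ("D*u", "the", "food.n"),
]

def strip_tag(word):
    """Drop the part-of-speech suffix: 'dog.n' -> 'dog'."""
    return word.split(".")[0]

def extract_predicates(links):
    """Pair each S* (subject) link with an O* (object) link on the same verb."""
    # Assumption: any label starting with S marks subject-verb, with O verb-object.
    subjects = {right: left for label, left, right in links if label.startswith("S")}
    objects = {left: right for label, left, right in links if label.startswith("O")}
    return [
        f"{strip_tag(verb)}({strip_tag(subj)}, {strip_tag(objects[verb])})."
        for verb, subj in subjects.items()
        if verb in objects
    ]

print(extract_predicates(links))  # ['ate(dog, food).']
```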

Clinical Trials Extraction
Novel Adjuvants for Peptide-Based Melanoma Vaccines
INCLUSION CRITERIA:
- Ages Eligible for Study: 18 Years and above, Genders Eligible for Study: Both
- Diagnosis of stage III or IV cutaneous, mucosal, or ocular melanoma
- Granulocyte count at least 1,500/mm³
- Platelet count at least 100,000/mm³
EXCLUSION CRITERIA:
- Steroid therapy or other immunosuppressive medication requirement
- Allergic reaction to Montanide ISA 51 (incomplete Freund's adjuvant)
- Positive for hepatitis B surface antigen, hepatitis C antibody, or HIV antibody

Predicates: Inclusion Criteria
- Ages Eligible for Study: 18 Years and above, Genders Eligible for Study: Both
  - age(Person, X) & X >= 18.
  - gender(Person, X) & (female == X || male == X).
- Diagnosis of stage III or IV cutaneous, mucosal, or ocular melanoma
  - diagnosis(Person, X) & melanoma(X) & type(X, Y) & (cutaneous(Y) || mucosal(Y) || ocular(Y)) & stage(X, Z) & (Z == 3 || Z == 4).
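
Once the criteria are in predicate form they can be checked mechanically against a patient description. The following is a minimal sketch of that check; the record layout and field names are assumptions made for illustration, not part of the original system.

```python
# Hypothetical patient record; field names are illustrative only.
patient = {
    "age": 42,
    "gender": "female",
    "diagnosis": {"disease": "melanoma", "type": "cutaneous", "stage": 3},
}

def meets_age(p):
    # age(Person, X) & X >= 18
    return p["age"] >= 18

def meets_gender(p):
    # gender(Person, X) & (female == X || male == X)
    return p["gender"] in ("female", "male")

def meets_diagnosis(p):
    # diagnosis(Person, X) & melanoma(X) & type(X, Y) & (cutaneous | mucosal | ocular)
    # & stage(X, Z) & (Z == 3 || Z == 4)
    d = p["diagnosis"]
    return (
        d["disease"] == "melanoma"
        and d["type"] in ("cutaneous", "mucosal", "ocular")
        and d["stage"] in (3, 4)
    )

def included(p):
    """All inclusion predicates must hold."""
    return meets_age(p) and meets_gender(p) and meets_diagnosis(p)

print(included(patient))  # True
```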

Predicates: Exclusion Criteria
- Steroid therapy or other immunosuppressive medication requirement
  - ¬(therapy(Person, X) & steroid(X)).
- Allergic reaction to Montanide ISA 51 (incomplete Freund's adjuvant)
  - ¬(allergy(Person, X) & montanide(X)).
- Positive for hepatitis B surface antigen, hepatitis C antibody, or HIV antibody
  - ¬(condition(Person, X) & (hepatitis_B(X) || hepatitis_C(X) || hiv(X))).
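
Each exclusion criterion is a negated conjunction, so a candidate is excluded as soon as any one of the inner conjunctions holds. A minimal sketch under assumed data follows; the record layout and the value names are hypothetical, and the disjunction over hepatitis B, hepatitis C, and HIV is grouped explicitly so that it applies to a single condition X.

```python
# Hypothetical patient record; field names and values are illustrative only.
patient = {
    "therapies": ["steroid"],
    "allergies": [],
    "conditions": ["asthma"],
}

def excluded(p):
    """True if any exclusion predicate matches the patient."""
    on_steroids = "steroid" in p["therapies"]          # therapy(Person, X) & steroid(X)
    montanide_allergy = "montanide" in p["allergies"]  # allergy(Person, X) & montanide(X)
    # condition(Person, X) & (hepatitis_B(X) || hepatitis_C(X) || hiv(X))
    positive = any(c in ("hepatitis_B", "hepatitis_C", "hiv") for c in p["conditions"])
    return on_steroids or montanide_allergy or positive

print(excluded(patient))  # True
print(excluded({"therapies": [], "allergies": [], "conditions": ["asthma"]}))  # False
```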

News Headlines Extraction
- Bangladesh frees UK journalists
  - frees(bangladesh, uk_journalists).
- Lieberman mulls 2004 bid
  - mulls(lieberman, 2004_bid).
- Avalanche kills snowboarder in Nevada
  - kills(avalanche, snowboarder, nevada).
- Pope tackles US sex abuse
  - tackles(pope, us_sex_abuse).
- Hubble watches galactic dance
  - watches(hubble, dance) & galactic(dance).
- Mbeki bemoans racial divisions
  - bemoans(mbeki, divisions) & racial(divisions).
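
The headline predicates follow a regular shape: the verb becomes the predicate name, subject and object become lowercase constants, and an adjective on the object may become its own unary predicate. The sketch below only covers that final formatting step and assumes the verb, subject, object, and modifiers have already been identified by a parse; `atom` and `headline_predicate` are hypothetical helper names.

```python
import re

def atom(text):
    """Normalize a phrase to a lowercase, underscore-joined constant."""
    return re.sub(r"\W+", "_", text.strip().lower()).strip("_")

def headline_predicate(verb, subject, obj, modifiers=()):
    """Build 'verb(subject, object)' plus one unary predicate per object modifier."""
    parts = [f"{atom(verb)}({atom(subject)}, {atom(obj)})"]
    parts += [f"{atom(m)}({atom(obj)})" for m in modifiers]
    return " & ".join(parts) + "."

print(headline_predicate("frees", "Bangladesh", "UK journalists"))
# frees(bangladesh, uk_journalists).
print(headline_predicate("watches", "Hubble", "dance", modifiers=["galactic"]))
# watches(hubble, dance) & galactic(dance).
```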

GEDCOM Extraction
- individual(i1, name('Dovie MELLISSIA /STEVENSON/'), sex(f),
    parentin(f1), childin(f2),
    birthdate('18 Sep 1908'), baptismdate('10 Apr 1919'),
    endowdate('9 Mar 1976'), deathdate(''),
    birthplace('OKTAHA, MUSKOGEE, OK, USA'), deathplace(''), burialplace('')).
- individual(i2, name('WILLIAM JAMES /STEVENSON/'), sex(m),
    parentin(f4), childin(f5),
    birthdate('5 Sep 1880'), baptismdate('13 Sep 1903'),
    endowdate('9 May 1969'), deathdate('22 Nov 1964'),
    birthplace('PENDLETON, WARREN, PA'), deathplace('TULARE, CA'),
    burialplace('VISALIA, TULARE, CA')).
- individual(i3, name('/MAHLER/'), sex(m),
    parentin(f6), childin(f5),
    birthdate('5 Sep 1880'), baptismdate('13 Sep 1903'),
    endowdate('9 May 1969'), deathdate('22 Nov 1964'),
    birthplace('PENDLETON, WARREN, PA'), deathplace('TULARE, CA'),
    burialplace('VISALIA, TULARE, CA')).
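
GEDCOM records are line-oriented ("level tag value"), which makes the mapping to facts fairly direct. The following is a minimal sketch of that mapping for a single individual; the input fragment is a hypothetical reconstruction, the function names are illustrative, and the emitted fact is a shortened form that covers only name, sex, and birth details rather than all twelve arguments shown on the slide.

```python
# Hypothetical GEDCOM fragment in the usual "level tag value" layout.
GEDCOM = """\
0 @I1@ INDI
1 NAME Dovie MELLISSIA /STEVENSON/
1 SEX F
1 BIRT
2 DATE 18 Sep 1908
2 PLAC OKTAHA, MUSKOGEE, OK, USA
"""

BIRTH_FIELDS = {"DATE": "birthdate", "PLAC": "birthplace"}

def parse_individual(text):
    """Collect id, name, sex, and birth date/place from one INDI record."""
    record, context = {}, None
    for line in text.splitlines():
        level, tag, *rest = line.split(" ", 2)
        value = rest[0] if rest else ""
        if level == "0":
            record["id"] = tag.strip("@").lower()  # '@I1@' -> 'i1'
        elif level == "1":
            context = tag  # remember the enclosing event, e.g. BIRT
            if tag in ("NAME", "SEX"):
                record[tag.lower()] = value
        elif level == "2" and context == "BIRT" and tag in BIRTH_FIELDS:
            record[BIRTH_FIELDS[tag]] = value
    return record

def to_fact(r):
    """Render the record as a (shortened) individual/5 fact."""
    return (f"individual({r['id']}, name('{r['name']}'), sex({r['sex'].lower()}), "
            f"birthdate('{r['birthdate']}'), birthplace('{r['birthplace']}')).")

print(to_fact(parse_individual(GEDCOM)))
```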

Inferencing

/**************************************
Which husband/wife combination was born
on the same day in the same place?
**************************************/
husband_wife(HusbandName, HBirthdate, WifeName, WBirthdate, X) :-
    individual(Husband, name(HusbandName), _, _, _, birthdate(HBirthdate), _, _, _, birthplace(X), _, _),
    family(_, husband(Husband), _, _),
    parse_date(HBirthdate, HDay, HMonth, HYear),
    individual(Wife, name(WifeName), _, _, _, birthdate(WBirthdate), _, _, _, birthplace(X), _, _),
    family(_, _, wife(Wife), _),
    parse_date(WBirthdate, WDay, WMonth, WYear),
    HYear == WYear, HMonth == WMonth, HDay == WDay.

HusbandName = Garland /Bailey/
HBirthdate = 16 Apr 1912
WifeName = Carolyn /Warren/
WBirthdate = 16 Apr 1912
Place = Gracemont, Caddo, Oklahoma

HusbandName = Charles Arthur /Goodpasture/
HBirthdate = 25 Dec 1894
WifeName = Betty Lucille /Rittga/
WBirthdate = 25 Dec 1894
Place = Gracemont, Caddo, Oklahoma
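
The same query can be sketched outside Prolog as a join over individuals and families. The version below is an illustration under stated assumptions: the record ids, the dictionary layout, and the second (non-matching) couple are hypothetical, and unlike the rule above it takes husband and wife from the same family record.

```python
MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

def parse_date(text):
    """Split a date such as '16 Apr 1912' into (day, month, year) integers."""
    day, month, year = text.split()
    return int(day), MONTHS[month], int(year)

# Hypothetical records; ids are illustrative, the first couple echoes the slide's result.
individuals = {
    "i10": {"name": "Garland /Bailey/", "birthdate": "16 Apr 1912",
            "birthplace": "Gracemont, Caddo, Oklahoma"},
    "i11": {"name": "Carolyn /Warren/", "birthdate": "16 Apr 1912",
            "birthplace": "Gracemont, Caddo, Oklahoma"},
    "i12": {"name": "John /Doe/", "birthdate": "2 Jan 1900", "birthplace": "Provo, Utah"},
    "i13": {"name": "Jane /Roe/", "birthdate": "3 Jan 1900", "birthplace": "Provo, Utah"},
}
families = [{"husband": "i10", "wife": "i11"}, {"husband": "i12", "wife": "i13"}]

def same_birth_couples(individuals, families):
    """Return couples whose birth date and birthplace both match."""
    results = []
    for fam in families:
        h, w = individuals[fam["husband"]], individuals[fam["wife"]]
        if (h["birthplace"] == w["birthplace"]
                and parse_date(h["birthdate"]) == parse_date(w["birthdate"])):
            results.append((h["name"], w["name"], h["birthdate"], h["birthplace"]))
    return results

print(same_birth_couples(individuals, families))
```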

Contribution/Future Work
- Contributions
  - Robustly extract predicates from natural language
    - Multiple domains
    - Various natural language syntactic constructions
  - Use applications to access predicates
    - Inferencing and querying
- Future Work
  - Extract predicates from other domains
  - Integrate with external knowledge sources
    - WordNet
    - UMLS
  - Upgrade to higher-order predicate calculus to allow predication over relations and events, not just individuals