Working with Frames: Annotating a German Corpus with Frames and Using Them for NLP
- Slides: 73
Working with Frames: Annotating a German Corpus with Frames and Using Them for NLP
Anette Frank
Computational Linguistics Department, Saarland University, Saarbrücken
Language Technology Lab, DFKI GmbH, Saarbrücken
SLTC 2006, Swedish Language Technology Conference, Göteborg, 27-28 Oct 2006
Working with Frames
SALSA: Saarbrücken Lexical Semantics Acquisition Project
– Funded by the German Research Foundation, DFG (2004-2008)
– Project team, objectives and background information
Annotating a German Corpus with FrameNet Frames …
– Cross-language application: using English FrameNet for German
– Corpus-based approach
  • Special phenomena: non-compositionality and vagueness
  • Coverage problems
– Consistency control
– From Corpus to Lexicon
… and using them in NLP applications
– Automatic Frame and Role Assignment (Erk and Pado, 2006)
– Frame Semantics for Textual Entailment (Burchardt and Frank, 2006)
Conclusions and Outlook
The SALSA Project
Team: Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Manfred Pinkal and Sebastian Pado
Motivation: alleviating the bottleneck in lexical-semantic resource creation for languages other than English
Objectives
I. Creation of a large semantically annotated corpus of German
– Annotating frame-semantic classes and roles from Berkeley FrameNet on top of a syntactically analysed newspaper corpus (TIGER corpus)
II. Creation of a semantic lexicon on the basis of corpus annotations
– Word sense: frame-semantic classifications of predicates
– Argument structure: semantic roles and syntactic realisation patterns
III. Developing methods for automation and application of frame-semantic information in NLP applications
Frame Semantics (Fillmore 1976, Fillmore et al. 2003)
– A frame represents a conceptual structure, or a prototypical situation, with a (frame-specific) set of roles that identify the participants or props involved in the situation
– Frames are organised in a hierarchy, with various frame-to-frame relations
  • Inheritance, subframe (defining scenarios)
– FrameNet database: 600 frames, 8,700 lexical units, 133,846 annotated sentences
Example frame: Commerce_goods-transfer, with roles Seller, Buyer, Goods, Money
– BMW bought Rover from British Aerospace.
– Rover was bought by BMW, which financed [...] the new Range Rover.
– BMW, which acquired Rover in 1994, is now dismantling the company.
– BMW's purchase of Rover for $1.2 billion was a good move.
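The idea of a frame with a fixed role set can be sketched as a small data structure. This is only an illustration: the class names `Frame` and `FrameInstance` are invented here, and the role fillers are one plausible reading of the BMW/Rover examples, not annotations taken from FrameNet.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Frame:
    """A frame: a named situation type with its frame-specific roles."""
    name: str
    roles: frozenset


@dataclass
class FrameInstance:
    """One annotated occurrence: the evoking word plus its role fillers."""
    frame: Frame
    target: str
    fillers: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject roles that the frame does not define.
        unknown = set(self.fillers) - self.frame.roles
        if unknown:
            raise ValueError(f"roles not in {self.frame.name}: {sorted(unknown)}")


COMMERCE = Frame("Commerce_goods-transfer",
                 frozenset({"Seller", "Buyer", "Goods", "Money"}))

# "BMW bought Rover from British Aerospace."
verbal = FrameInstance(COMMERCE, "bought",
                       {"Buyer": "BMW", "Goods": "Rover",
                        "Seller": "British Aerospace"})

# "BMW's purchase of Rover for $1.2 billion was a good move."
nominal = FrameInstance(COMMERCE, "purchase",
                        {"Buyer": "BMW", "Goods": "Rover",
                         "Money": "$1.2 billion"})
```

Both the verb and the noun instance point to the same frame object, which is what lets later processing treat them as the same kind of event.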
Frame Definitions and Annotations
Frame. Net Hierarchy and Frame Relations
Role Inheritance and Perspectivization
Why Frame Semantics? Cross-linguistic aspects
– FrameNet's conceptual classes bear high potential for cross-lingual applicability
– Frames are linguistically motivated
  • Syntactic realisation of core semantic roles
  • Ontological constraints and "perspectivization"
  These properties may differ across languages
Research issues
– Cross-linguistic applicability of FrameNet's semantic inventory
– Accounting for cross-linguistic divergences in a Multilingual FrameNet
Cross-lingual FrameNet Group: building FrameNets for English, German, Spanish, Japanese, French, …
Why Frame Semantics? Using Frame Semantics in NLP applications
– Focusing on lexical semantic classes and role-based argument structure
– Disregarding aspects of "deep" semantics: negation, modality, quantification, ...
– Normalisation: syntactic alternations
  [Fred]_Agent hit_Cause_Impact [the ball]_Impactee.
  [The ball]_Impactee was hit_Cause_Impact.
  [John]_Donor gave_Giving [Mary]_Recipient [a book]_Theme.
  [John]_Donor gave_Giving [a book]_Theme [to Mary]_Recipient.
– Normalisation: lexical alternations (within and across part-of-speech)
  [Marylin]_Speaker spoke_Statement about [her past]_Topic.
  [Marylin]_Speaker's statement_Statement about [her past]_Topic.
  [Marylin]_Speaker talked_Statement about [her past]_Topic.
Provides semantic classes (senses) within a semantic network, combined with argument structure information, at a high level of abstraction
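The normalisation effect can be illustrated with a toy comparison: once clauses are reduced to a frame plus role fillers, syntactic and lexical variants become identical. The mini-lexicon `LEXICON` and the function `normalise` are hypothetical, and role fillers are assumed to be already reduced to their heads (so "to Mary" is stored as "Mary").

```python
# Hypothetical mini-lexicon: lemma -> frame it evokes.
LEXICON = {
    "give": "Giving",
    "speak": "Statement",
    "talk": "Statement",
    "statement": "Statement",
}


def normalise(lemma, roles):
    """Reduce an analysed clause to (frame, set of (role, filler) pairs)."""
    return LEXICON[lemma], frozenset(roles.items())


# Syntactic alternation: double-object vs. prepositional construction.
double_object = normalise("give", {"Donor": "John", "Recipient": "Mary",
                                   "Theme": "a book"})
prepositional = normalise("give", {"Donor": "John", "Theme": "a book",
                                   "Recipient": "Mary"})

# Lexical alternation: verb, noun and near-synonym evoke the same frame.
roles = {"Speaker": "Marylin", "Topic": "her past"}
variants = {normalise(lemma, roles) for lemma in ("speak", "statement", "talk")}
```

The set `variants` collapses to a single element because all three lemmas map to the Statement frame with the same fillers.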
Annotating a German Corpus with Frames
Manual semantic annotation of a syntactically analysed corpus
– TIGER Treebank (Universities of Saarbrücken, Stuttgart, Potsdam)
– 1.5 million words / 80K sentences of newspaper text (Frankfurter Rundschau)
– Combined constituent and dependency structure (edges labelled with grammatical functions), with crossing edges for flexible word order
– Relatively flat trees
Example tree (gloss): "SPD asks coalition to talk about reform"
Annotation Scheme
Annotating frames on top of syntactic structure
  • Frame REQUEST is evoked by the discontinuous target word "fordert auf" (ask, request)
  • Frame elements (roles) are connected to constituents
  • Flat semantic trees (depth 1)
  • Independent frames
Example (gloss): "SPD asks coalition to talk about reform"
Encoded in TIGER/SALSA XML, an extension of TIGER XML: modular description of syntax and (frame) semantics
Outline (current section): Cross-language application: using English FrameNet for German
Using English FrameNet Frames for German
SALSA frames
– Stay as close as possible to the Berkeley FrameNet database
– Cross-lingual divergences: adaptation of FN frames
Cross-lingual divergences
– Missing FEs
– Differences in lexical realisation patterns
Missing FEs
Taking: An Agent removes a Theme from a Source such that it is in the Agent's possession. (Source: either location or former possessor)
(2) Er nahm [dem Mann]? das Bier aus der Hand.
    He took the man the beer from the hand
    "He took the beer from the man's hand"
=> Adding lacking FEs to frames (here: Possessor)
Differences in lexical realisation patterns
(Rare) cases in which German verbs run counter to frame distinctions made on English data
– German "fahren" encompasses English "to drive" and "to ride"
  s20937: In 14 Armeefahrzeugen fuhren sie von dem abgezäunten Gelände, das der Besatzungsmacht 28 Jahre lang als Hauptquartier gedient hatte
  "With 14 army vehicles they drove/departed from the enclosed area which had served the occupying forces as headquarters for 28 years."
  s27678: Und die Inhaber von Jahresnetzkarten fahren künftig sogar billiger.
  "Holders of annual season tickets will ride even cheaper in the future."
=> FN has introduced the frame Use_vehicle, which subsumes Operate_vehicle and Ride_vehicle
Outline (current section): Corpus-based approach: coverage problems
Coverage Problems
SALSA: corpus-based approach
– For each predicate: annotation of all instances in the TIGER corpus
FrameNet: lexicographic approach
– Defining frames as a sense inventory for describing word meaning
– Proceeding by frames, not by predicate
Handling gaps in FrameNet
– FrameNet does not (yet) cover the complete "conceptual space"; predicates encountered in the corpus may have missing senses
  • behandeln 1 (treat an illness) => Frame CURE
  • behandeln 2 (treat with kindness) => no frame available
– FrameNet does not consider multiword expressions or figurative senses
=> Construction of German proto-frames ("Unknowns")
"Unknown" Frames
Construction of proto-frames ("Unknowns")
– Inspection of the first 20 corpus instances
– Identify and group readings not covered by existing FrameNet frames
Proto-framing
– Textual definition for frames and roles
  • Often contrastive to existing frames
  • Applying FrameNet "framing principles"
– Differences wrt. FrameNet frames
  • Lemma-specific: no evidence from groups of predicates
  • Covers only senses found in TIGER
– Proto-frames of similar predicates are often related ("is identical to")
Example: "rechnen" (I)
Categorisation (FrameNet) ("rank/range among, count as")
– A Cognizer construes an Item as belonging to a certain Category.
– Hat [Cognizer man] [Item sie] [Category zur alten Elite] gerechnet?
  "Did one range her among the old elite?"
Unknown 1 (SALSA) ("range among, count as")
– An Item is an example or a member of a particular Category. In contrast to Categorisation, there is no Cognizer involved. In contrast to Membership, the Category does not have to be a social organisation.
– [Item Die Philippinen und Chile] rechnen [Cat zu den armen Ländern der Region].
  "The Philippines and Chile range among the poor countries of the region"
Example: "rechnen" (II)
Expectation (FrameNet) ("expect")
– Words in this frame have to do with a Cognizer believing that some Phenomenon will take place in the future.
– [Cog Das Geldinstitut] rechnet [Phen mit einem Angebotsüberhang].
  "The institute reckons with a backlog of offers"
Unknown 2 (SALSA) ("count on")
– An Event or State will happen in the foreseeable future. In contrast to Expectation, the actual factivity of the Event is stressed.
– Is Identical To: Unknown 1 of Frames: bevorstehen.v
– [Event Womit] hätte [Exp man] rechnen müssen?
  "What would one have had to count on?"
Unknown 3 (SALSA) ("pay off")
– A state of affairs or entity (Theme) creates or increases profit for a beneficiary.
– [Thm Das Steigen der Grundstueckspreise] rechnet sich auf jeden Fall.
  "The increase of land prices pays off"
Some Figures
Sample: annotation of 476 German predicates
– 18,500 annotated instances
– 252 annotated with FN frames
– 373 annotated with proto-frames
– Avg. 2.8 frames/pred (2 FN frames + 0.8 proto-frames)
– Avg. 43 sentences per FN frame
– Avg. 17 sentences per proto-frame
Most polysemous predicate: "kommen"
– 39 frames (FN and proto), includes MWEs
Predicate with most missing senses: "bringen"
– 15 proto-frames
Outline (current section): Special phenomena: non-compositionality and vagueness
Non-compositional Phenomena
Support verb constructions
– Der Professor hält eine Vorlesung
  The professor is holding a lecture
  "The professor is giving a lecture"
Metaphors
– Der Chef kocht
  The boss is cooking
  "The boss is in a rage"
Idioms
– Die unannotierten Sätze gehen zur Neige
  The unannotated sentences are going towards decline
  "The unannotated sentences are running out"
Some Figures
                    Sample of 246 lemmas     Sub-corpus "nehmen"
                    Number      %            Number      %
Standard readings   4638        85.7%        42          17.4%
Metaphor            369         6.8%         38          15.8%
Support             326         6.0%         132         54.8%
Idiom               79          1.5%         29          12.0%
Non-literal use     774         14.3%        199         82.6%
Total               5412        100.0%       241         100.0%
Support Verb Constructions
Annotation
– The semantic head is a noun or adjective supported by the governing verb ("take a bath", "perform an operation")
– The verb is tagged with a pseudo-frame "Support" with frame element "Supported"
Example (gloss): The current prime (minister) can take in claim to have...
  "The current prime minister can claim to have..."
Idioms
Classification criteria
– Non-compositional: meaning composition not transparent
– The meaning is introduced by the whole construction (modulo variability)
Annotation
– Tagging the multi-word expression as a complex FEE (frame-evoking element)
Example: Nachteile in Kauf nehmen
  disadvantages in purchase take
  "accept disadvantages"
Metaphors
Classification criteria
– Non-literal ("figurative") meaning
– Semi-compositional: recoverable (literal meaning + mapping from literal to non-literal meaning)
  • Subjective
– Example: "Many think that Perot would walk into a wall on Capitol."
Annotation
– Annotation of source (literal) and target (metaphorical) meaning with "flags": source / target
– Determining the target frame is often difficult
  • "For some, this goes too far"
– In these cases, only the source frame is annotated, with a metaphor flag "source" for recovery
Annotation of Metaphors (transparent)
– Source frame evoked by the verb
– Target frame projected from MWE (verb + syntactic argument)
Example (gloss, annotated with Source and Target frames): The sound of their Bigband is a jewel which one can safely take under a strong magnifying glass
  "The sound of their Bigband is a jewel which stands up to any scrutiny."
Metaphor: Transfer Scheme
Ein Juwel, das man unter die starke Lupe nehmen kann
A jewel which one can take under a strong magnifying glass
Source: FEE nehmen; Frame PLACING
– AGENT: [1] man
– THEME: [2] ein Juwel
– GOAL: [3] ([4] starke) Lupe
Target: FEE nehmen • [3]/[4]; Frame SCRUTINY
– COGNIZER: [1] man
– PHENOMENON: [2] ein Juwel
– DEGREE: [4] starke
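The transfer from source to target frame can be sketched as a role mapping. The correspondences below are read off the shared indices [1], [2] and [4] in the scheme; the dictionary layout, the name `transfer`, and the reading of "[3]/[4]" as "the GOAL head without its modifier" are assumptions made for this illustration, not the SALSA encoding.

```python
# Role correspondences read off the shared indices in the transfer scheme.
ROLE_MAP = {"AGENT": "COGNIZER", "THEME": "PHENOMENON"}

SOURCE = {
    "frame": "PLACING",
    "fee": ["nehmen"],
    "roles": {
        "AGENT": "man",
        "THEME": "ein Juwel",
        "GOAL": {"head": "Lupe", "modifier": "starke"},
    },
}


def transfer(source, target_frame="SCRUTINY"):
    """Project a literal (source) annotation onto the metaphorical target."""
    goal = source["roles"]["GOAL"]
    target = {
        "frame": target_frame,
        # The target is evoked by the verb together with the GOAL head.
        "fee": source["fee"] + [goal["head"]],
        "roles": {},
    }
    for source_role, target_role in ROLE_MAP.items():
        target["roles"][target_role] = source["roles"][source_role]
    # The modifier of the GOAL noun surfaces as DEGREE in the target frame.
    if goal.get("modifier"):
        target["roles"]["DEGREE"] = goal["modifier"]
    return target


TARGET = transfer(SOURCE)
```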
Vagueness and Ambiguity
Often, it is not possible to make a safe choice among a set of possible semantic interpretations
– In frame assignment
– In the assignment of semantic roles
Different sources of ambiguity and vagueness
– Available context does not allow resolution of an ambiguity
– More than one interpretation may apply at the same time
– The distinction between two readings may be systematically unclear
Cases are hard to distinguish (Kilgarriff and Rosenzweig, 2000)
Vagueness and Ambiguity
Examples
– Gleichwohl versuchen offenbar Assekuranzen, das Gesetz zu umgehen, indem sie von Nichtdeutschen [mehr Geld] verlangen
  "… by claiming more money from non-Germans"
  • REQUEST and/or COMMERCE frame?
– Die nachhaltigste Korrektur der Programmatik fordert [ein Antrag]
  "A motion requests the most sustainable correction of the political objectives"
  • Motion may be SPEAKER or MEDIUM
Underspecification
– Annotators may assign a set of frames or frame elements marked as "underspecified" (blue)
– Special markup in TIGER/SALSA XML
Example (gloss): "Foreign investors in India again welcome"
Outline (current section): Consistency control
The "four eye" Principle
Subcorpus preparation
– Extracting all TIGER sentences for a given lemma
– Identifying suitable FrameNet frames
– Construction of "Unknowns"
Annotation: Annotator 1, Annotator 2
Merging for double adjudication
Adjudication: Adjudicator 1 (conflict resolution), Adjudicator 2 (conflict resolution)
– Detection and resolution of "major" annotation errors and conflicts
Merging for meta adjudication
– Detection and discussion of "difficult" annotation problems
"DONE"
Entire process supported by the SALTO annotation tool
Inter-annotator and -adjudicator Agreement
                    Agreement (Frames)   Agreement (FEs)
Inter-Annotator     84.9%                85.7%
Inter-Adjudicator   97.0%                96.2%
Adjudication can resolve annotation differences fairly reliably
– Reduction of disagreements from 15% to 3-4%
– Present strategy: "four-eye adjudication"
Remaining disagreements are mostly "real problems"
– Constructional problems
  • Complex markables (ellipses), ambiguities (e.g. ambiguous pronouns)
– Conceptual differences
  • Difficult role distinctions, level of abstraction, uncertain inferences about roles
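The slides do not say how agreement was computed. As a rough sketch, assuming plain percentage agreement over parallel label lists, the measure and the disagreement rates implied by the reported figures look like this. The function name and the toy labels are invented for illustration and are not SALSA data.

```python
def percent_agreement(labels_a, labels_b):
    """Share of items (in %) on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * same / len(labels_a)


# Toy labels, invented for illustration only.
annotator_1 = ["REQUEST", "COMMERCE", "REQUEST", "STATEMENT"]
annotator_2 = ["REQUEST", "REQUEST", "REQUEST", "STATEMENT"]
toy = percent_agreement(annotator_1, annotator_2)  # 3 of 4 items agree

# Disagreement rates implied by the reported (frames, FEs) agreement figures.
reported = {"annotators": (84.9, 85.7), "adjudicators": (97.0, 96.2)}
disagreement = {
    stage: tuple(round(100 - value, 1) for value in values)
    for stage, values in reported.items()
}
```

The derived rates (about 14-15% before adjudication, 3-4% after) match the reduction stated on the slide.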
A Description Logics based Lexicon Model
Work by Dennis Spohr, IMS Stuttgart and SALSA (Spohr et al., 2006)
Purposes
– Querying XML annotations involving intersecting hierarchies
– Consistency checking
– Lexicon building: abstraction of lexicon data from annotation instances
DL-based modelling of FrameNet data
– OWL DL
  • Monotonicity, decidability (Baader et al. 2003)
  • Reasoning and consistency checking services
– Formalisation of the definitional part of FrameNet and of corpus annotations
– Focus on:
  • Flexible ways for abstraction and normalisation of data
  • Consistency checking
  • Storage and querying architecture (SESAME and SeRQL)
A Description Logics based Lexicon Model
T-Box
– Linguistic model
  • FrameNet: frames, frame relations, roles
  • Sense assignment: lemma, frame
  • Role assignment: syntactic units, roles
– Annotation model
  • Annotation types: frames (single, elliptic, metaphoric, USP); roles (single, USP); target (single, multi-word)
  • Sentences
  • Syntactic units
A-Box
– Corpus: annotation instances
  • Sentences
  • Syntactic units
  • Frame and role annotations
Services: normalisation, querying, consistency checking
T-Box vs. A-Box
Variant 1
– T-Box: general classes for frames (and relations)
– A-Box: specific frames (CURE) and corpus annotations
– Use: query properties of individual frames (which roles, etc.)
Variant 2
– T-Box: general and specific frame classes
– A-Box: corpus annotations
– Use: consistency checking
Querying
Retrieving information from the Corpus/Lexicon
– Queries specify paths through the model graph
– Allow querying of intersecting hierarchies
Example: extract all lemmas that evoke the PLACING frame
– SeRQL query
– Retrieved information (with grouping for frequency information)
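The actual SeRQL query is not reproduced in the slide text. The path idea behind it can be approximated in plain Python over a toy triple store. The property names `hasLemma` and `evokesFrame`, the annotation identifiers and the lemmas are all invented for this sketch and do not come from the SALSA model.

```python
from collections import Counter

# Toy A-Box as (subject, property, object) triples; all names are invented.
TRIPLES = [
    ("ann1", "hasLemma", "stellen"), ("ann1", "evokesFrame", "PLACING"),
    ("ann2", "hasLemma", "legen"),   ("ann2", "evokesFrame", "PLACING"),
    ("ann3", "hasLemma", "stellen"), ("ann3", "evokesFrame", "PLACING"),
    ("ann4", "hasLemma", "fordern"), ("ann4", "evokesFrame", "REQUEST"),
]


def objects(subject, prop):
    """All objects reachable from `subject` via property `prop`."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def lemmas_evoking(frame):
    """Follow the path lemma <- annotation -> frame and count each lemma."""
    counts = Counter()
    for subject, prop, obj in TRIPLES:
        if prop == "evokesFrame" and obj == frame:
            counts.update(objects(subject, "hasLemma"))
    return counts


placing_lemmas = lemmas_evoking("PLACING")
```

Grouping by lemma with a counter corresponds to the "grouping for frequency information" mentioned above.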
Normalisation of linguistic information at different levels
– TIGER syntactic categories and edge labels
– Normalised syntactic categories and grammatical functions
  • NounP, PrepP, Sent, …; Subj, Obj, Pobj, …
Example: syntactic realisation of semantic roles
– Specific categories: 2,176 realisation patterns
– Normalised categories: 1,026 realisation patterns
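Why normalisation shrinks the pattern inventory can be shown with a toy example. The label tables below are assumptions: they pair a few TIGER labels with the normalised names listed above, but the real mapping used in the lexicon model is not given in the slides, and the role names are placeholders.

```python
# Assumed mapping from TIGER labels to normalised labels (illustrative only).
CATEGORY = {"NP": "NounP", "PN": "NounP", "PP": "PrepP", "S": "Sent"}
FUNCTION = {"SB": "Subj", "OA": "Obj", "OP": "Pobj"}


def normalise(pattern):
    """Map a (role, TIGER category, TIGER function) pattern to normalised labels."""
    role, category, function = pattern
    return role, CATEGORY[category], FUNCTION[function]


# Four corpus-specific realisation patterns (toy data).
specific = {
    ("Speaker", "NP", "SB"),
    ("Speaker", "PN", "SB"),
    ("Message", "S", "OA"),
    ("Addressee", "PP", "OP"),
}

# NP and PN subjects collapse into one normalised pattern.
normalised = {normalise(p) for p in specific}
```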
First Data Release
SALSA Corpus
– Scheduled for 2006
– > 500 German verbal predicates (of all frequency bands)
– Total size of about 20,000 annotated instances
and Lexicon with querying interfaces
Outline (current section): Frame Semantics for Textual Entailment (Burchardt and Frank, 2006)
Frame Semantics for Textual Entailment
Recognizing Textual Entailment (RTE): testing a system's capacity to recognize "textual entailment"
Text: Sunday's earthquake was felt in the southern Indian city of Madras on the mainland, as well as other parts of south India.
Hypothesis: The city of Madras is located in Southern India.
Task: Entailed? Yes
– "Realistic", open-domain data set drawn from system outputs in NLP applications: IR, IE, QA, SUM
– Controlled set-up: balanced training and test sets, 800/800 text-hypothesis pairs
Textual Entailment
"We say that T entails H if the meaning of H can be inferred from the meaning of T, as would typically be interpreted by people. This somewhat informal definition is based on (and assumes) common human understanding of language as well as common background knowledge."
(Dagan, Glickman, Magnini, RTE 2005 Workshop Proceedings)
The data
Fine-grained linguistic analysis
  T: Oscar-winning actor Nicolas Cage's new son and Superman have sth. in common...
  H: Nicolas Cage's new son was awarded an Oscar.
  Entailed? No (IE)
Lexical semantics and paraphrases (nominalisation, synonymy)
  T: [o]n December 10th 1936 King Edward VIII gave up his right to the British throne.
  H: King Edward VIII abdicated on the 10th of December, 1936.
  Entailed? Yes (QA)
Modality
  T: U.S. Secretary of State Condoleezza Rice said Thursday that North Korea should return to nuclear disarmament talks and...
  H: North Korea says it will rejoin nuclear talks.
  Entailed? No (SUM)
Inference and World Knowledge
Approximating Textual Entailment
Fine-grained LFG-based syntactic analysis
– English LFG grammar (Riezler et al. 2002): broad-coverage, with high-quality probabilistic disambiguation
Frame Semantics
– Coarse-grained lexical-semantic classification of predicates with role-based argument structure encoding
– Extended semantic representations: WordNet senses, SUMO concepts
Computing structural and semantic overlap
– Hypothesis: high/low ratio of H/T overlap => entailment: yes/no
– A learning problem: measures of overlap, weighted entailment decision
H/T matching for TE: overlap ratio = (match graph size) / (hypothesis graph size)
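The overlap ratio (match graph size relative to hypothesis graph size) can be sketched on toy graphs. This is a simplification under stated assumptions: graphs are reduced to sets of (head, relation, dependent) triples, the triples for the Kaurismäki pair are hand-written rather than parser output, and `related` stands in for any weak (e.g. WordNet-based) similarity.

```python
def overlap_ratio(text, hypothesis, related=frozenset()):
    """Fraction of hypothesis triples that find a match in the text graph."""
    def same(a, b):
        # Strict identity, or a weak match licensed by the `related` pairs.
        return a == b or (a, b) in related or (b, a) in related

    matched = [
        h for h in hypothesis
        if any(h[1] == t[1] and same(h[0], t[0]) and same(h[2], t[2])
               for t in text)
    ]
    return len(matched) / len(hypothesis)


# Hand-written toy graphs for: "In 1983, Aki Kaurismäki directed his own
# first full-time feature." / "Aki Kaurismäki directed a film."
TEXT = {
    ("direct", "subj", "Aki Kaurismäki"),
    ("direct", "obj", "feature"),
    ("direct", "adjunct", "1983"),
}
HYPOTHESIS = {
    ("direct", "subj", "Aki Kaurismäki"),
    ("direct", "obj", "film"),
}

strict = overlap_ratio(TEXT, HYPOTHESIS)
weak = overlap_ratio(TEXT, HYPOTHESIS, related={("feature", "film")})
```

With strict matching only the subject triple is matched; allowing the weak feature/film link covers the whole hypothesis graph.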
The SALSA RTE System
Linguistic analysis components and integration
– XLE parsing: LFG f-structure
– Fred/Detour + Rosy: frames & roles
– WordNet-based WSD: WordNet & SUMO
– Result: f-structure w/ (extended) frame-semantic projection
Recognizing Textual Entailment: graph matching & statistical approximation
– Text and hypothesis: f-structure w/ frames & concepts
– Text-hypothesis match graph
  • matching nodes and edges
  • different match types (similarity types)
  • extensions for deeper modelling (modality, lexical entailment)
– Feature extraction, using the XLE term rewriting system (Crouch 2005)
– Model training & classification
Frame and Role Assignment
Shalmaneser (Erk & Pado, 2006)
– Shallow semantic parser for FrameNet frame and role assignment
– Fred: statistical frame assignment
  • WSD system for predicates, in terms of frames
– Rosy: semantic role assignment
  • Argument recognition and argument labelling
  • Using state-of-the-art features from robust syntactic parsing
Detour (to FrameNet via WordNet) (Burchardt et al., 2005)
– Aim: overcome lexical gaps in FrameNet
– A rule-based frame assignment system that takes a "detour to FrameNet via WordNet"
  • Determine similarity of "unknown LUs" to existing frames (their LUs) based on WordNet similarity measures
Frame and Role Assignment Fred & Rosy (Shalmaneser) Fred, Detour & Rosy
Extended semantics projection Porting frame and role assignments to LFG f-structure – Defining a frame semantics projection using head lemmata as interface layer (accounts for parser discrepancies) – Using XLE rewrite system (Crouch 2005) Head-indexed frame & role assignments
Extended semantics projection
Rule-based extensions of LFG-frame structures
– Frames corresponding to LFG NE classes (location, date, companies, …)
– Extra-thematic roles, based on LFG adjunct classes (time, reason, location, etc.)
  • +adjunct(Z, Y), ntype_sem(Y, time) ==> s::(Z, SemZ), s::(Y, SemY), time(SemZ, SemY).
Extended semantics projection: WordNet and SUMO classes
– WSD: Banerjee & Pedersen, 2003
– WordNet to SUMO/MILO mapping: Niles and Pease (2001)
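The effect of the adjunct rule can be imitated in a few lines of Python. This is not XLE rewrite syntax and not the project's code: facts are plain tuples, `sem(...)` strings stand in for the semantic projections SemZ and SemY, and the example facts are invented.

```python
def add_time_roles(facts):
    """For every time adjunct Y of Z, add a time role between their semantics."""
    facts = set(facts)
    derived = set()
    for fact in facts:
        if fact[0] == "adjunct":
            _, z, y = fact
            # Only adjuncts whose semantic type is `time` trigger the rule.
            if ("ntype_sem", y, "time") in facts:
                derived.add(("time", f"sem({z})", f"sem({y})"))
    # The input facts are kept; the derived role facts are added.
    return facts | derived


# Invented example facts.
FACTS = {
    ("adjunct", "direct", "1983"),
    ("ntype_sem", "1983", "time"),
    ("adjunct", "direct", "first"),
}

extended = add_time_roles(FACTS)
```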
A walk-through example from RTE 2006, Pair 716
Text: In 1983, Aki Kaurismäki directed his own first full-time feature.
Hypothesis: Aki Kaurismäki directed a film.
LFG F-Structures
Automatic Frame Annotation for Text
– Fred & Rosy: frames & roles (statistical); Collins parse
– Detour system: frames (via WordNet)
Automatic Frame Annotation for Hypothesis
716_h: Aki Kaurismäki directed a film.
LFG and Frames for Hypothesis Rule-based (LFG-NER) Aki Kaurismäki directed a film.
Hypothesis-Text-Match Graphs
Computing structural and semantic overlap
– Computing a "match graph" from text and hypothesis graphs
– Different aspects of similarity:
  • Syntactic: f-structure (PRED, grammatical functions, functional attributes)
  • Semantic: extended frame structures (frames, roles, WordNet, SUMO)
– Different degrees of similarity:
  • Strict similarity: identical syntactic and semantic nodes and edges
  • Weak similarity: WN-/FN-relatedness for non-identical PREDs and frames
Match graph consists of partial syntactic & semantic graphs
Approximating textual entailment
– High/low overlap ratio of hypothesis and match graph => entailment: yes/no
H/T matching for TE: overlap ratio = (match graph size) / (hypothesis graph size)
t: In 1983, Aki Kaurismäki directed his own first fulltime feature.
h: Aki Kaurismäki directed a film.
Match types shown: grammatically related, WordNet related
Extensions: Modality
Detecting indicators of inconsistent modality types
– T: A pet must have rabies protection confirmed by a blood test.
  H: A case of rabies was confirmed.
Marking modal contexts in text and hypothesis
– 5 modality types: conditional, future, diamond, box, negation
Handling inconsistent modality types in the matching process
– Introducing negatively marked match nodes
– Blocking embedded structures for similarity-based matches
– Thus, reducing the size of the match graph
Extensions: Lexical Entailments
Bridging partially non-matching text and hypothesis pairs
– T: Olson, 62, previously worked as a partner at Ernst & Young LLP, as a Minnesota bank president and as a congressional aide, before joining the Fed board in 2001, to serve a term ending in 2010.
  H: Olsen is a member of the Fed board.
Lexically induced inferences, defined as rewrite rules on h/t/m graphs
– t: (X1) joins X2, h: (Y1) member-of Y2, m: (Z2, Y2, X2) => match_type(heuristic_entailment_match).
Similar: non-lexical heuristic inferences
– Appositions: prime minister X => X is prime minister
– Possessive constructions: X's Y => the Y of X
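A lexically induced inference of the joins / member-of kind can be sketched as a small matching function. The triple format, the function name and the `same_entity` callback are assumptions for illustration; the original rules are written for the XLE rewrite system, whose syntax is not reproduced here.

```python
def heuristic_entailment_matches(text_facts, hyp_facts, same_entity):
    """Link 'X1 joins X2' in the text to 'Y1 member-of Y2' in the hypothesis."""
    matches = []
    for t_rel, _x1, x2 in text_facts:
        for h_rel, _y1, y2 in hyp_facts:
            # As in the rule above, only the second arguments must correspond.
            if t_rel == "join" and h_rel == "member-of" and same_entity(x2, y2):
                matches.append({
                    "text": x2,
                    "hypothesis": y2,
                    "match_type": "heuristic_entailment_match",
                })
    return matches


# Toy facts for the Olson example.
TEXT_FACTS = [("join", "Olson", "Fed board"), ("work", "Olson", "partner")]
HYP_FACTS = [("member-of", "Olsen", "Fed board")]

found = heuristic_entailment_matches(
    TEXT_FACTS, HYP_FACTS, same_entity=lambda a, b: a == b)
```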
Similarity/Entailment measures
Features are computed on the text graph (_t), hypothesis graph (_h) and match graph (_m), plus proportional features (h/t and m/h ratios)
– Lexical: lex_id (match graph); ratio_lexid (proportional)
– Syntactic: node_m (pred, coref, pro), edge_syn_m (all, gf, subc) (match graph); ratio_nodes, ratio_edges (proportional)
– Semantic, strict: (lfg_)frames_t, (lfg_)roles_t (text graph); (lfg_)frames_h, (lfg_)roles_h (hypothesis graph); (lfg_)frames_m, (lfg_)roles_m (match graph); ratio_(lfg_)frames, ratio_(lfg_)roles (proportional)
– Semantic, weak: node_frameFN/derived_m, mode_framerel/detour/wnrel_m, node_heuristic_entailment_m, node_modal_ctxt_mismatch_m (match graph)
– Connectedness: clusters_no, clusters_avg_size, fragmentary (match graph); clusters_avgsize_rel_h, clusters_abssize_rel_h (proportional)
– Other: rte_task
Machine learning
WEKA: selected learners and models
– Model 1: Simple Conjunctive Rule classifier
  preds_m_relto_h 0.485294 & frames_m_relto_h 0.954546 rte_entails = 0
  • Medium/high threshold on pred/frame matches as criterion for rejection
  • High degree of frame similarity w/ medium predicate similarity models entailment
– Model 2: Meta-classifier LogitBoost
  1. No. of predicate matches relative to hypothesis
  2. No. of frame (Fred, Detour) matches relative to hypothesis
  3. No. of role (Rosy) matches relative to hypothesis
  4. Match graph size rel. to hypothesis, incl. syn, sem, ontological info
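Model 1 can be read as a two-condition rejection rule. The comparison operators are not present in the slide text, so the sketch below assumes "at most" for both conditions, which fits the description of the thresholds as a criterion for rejection. The function name is invented, and this is not WEKA output.

```python
PRED_THRESHOLD = 0.485294    # threshold on preds_m_relto_h (from the slide)
FRAME_THRESHOLD = 0.954546   # threshold on frames_m_relto_h (from the slide)


def model1_entails(preds_m_relto_h, frames_m_relto_h):
    """Return 0 (no entailment) if both match ratios stay at or below threshold.

    ASSUMPTION: both comparisons are '<='; the operators are missing in the
    source text, so the direction is inferred from the rule's description.
    """
    rejected = (preds_m_relto_h <= PRED_THRESHOLD
                and frames_m_relto_h <= FRAME_THRESHOLD)
    return 0 if rejected else 1


low_overlap = model1_entails(0.30, 0.50)      # both ratios below threshold
high_overlap = model1_entails(0.80, 1.00)     # both ratios above threshold
frames_only = model1_entails(0.30, 1.00)      # frames fully matched
```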
Results in RTE-II
SALSA RTE system results (accuracy in %)
RTE-II     all tasks   IE      IR      QA      SUM
Model 1    59.0        49.5    …       54.5    72.5
Model 2    57.8        48.5    …       57.0    67.0
– Both models score SUM > IR > QA > IE
– Refined model better on QA, simple model better on SUM
Overall RTE-II results
– Average accuracy: 60% (median: 59%)
Accuracy range (in %)   No. of groups
53-56                   7
58-61                   11
62-64                   3
74-75                   2
True negatives
Modal context marking seems to be effective
– 27% of all true negatives involved modality mismatches, while only 11.9% of all sentences involve marked modal contexts
  T: The goal of preserving indigenous culture can hardly be achieved by a handful of researchers and curators at museums of ethnology and folk culture.
  H: Indigenous folk art is preserved. (233)
  T: Even today, within the deepest recesses of our mind, lies a primordial fear that will not allow us to enter the sea without thinking about the possibility of being attacked by a shark.
  H: A shark attacked a human being. (322)
Future plans
– Extend to lexically induced modality/facticity indicators
– Testing for non-monotonicity contexts
False positives
Typical cases
Semantic dissimilarity
– Non-matching predicates within larger match graphs, which are in fact semantically dissimilar
Structural distance
– Matching nodes within a match graph correspond to far distant nodes in the text graph, compared to neighbouring nodes in the match graph
False positives
Unconnected nodes matched with distant nodes in the text graph
T: Some 420 people have been hanged in Singapore since 1991, mostly for drug trafficking, an Amnesty International 2004 report said. That gives the country of 4.4 million people the highest execution rate in the world relative to population.
H: 4.4 million people were executed in Singapore. (198)
– False positive
False positives
Graph matching process
– Allows criss-cross matching of nodes in the match graph
– Builds growing clusters by finding matching edges
=> Introduce weighted edges that reflect the relative distance of pairs of match nodes in text and hypothesis (path distance)
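One way to picture distance-weighted edges is to compare how far apart two matched nodes are in the text graph and in the hypothesis graph. The slides do not give a weighting formula, so the one used here, 1 / (1 + difference in path length), is purely an assumption, as are the toy graphs loosely modelled on the Singapore example.

```python
from collections import deque


def path_distance(edges, start, goal):
    """Shortest undirected path length between two nodes (breadth-first search)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected


def edge_weight(text_edges, hyp_edges, pair_a, pair_b):
    """Weight for two match nodes, each given as (text node, hypothesis node)."""
    d_text = path_distance(text_edges, pair_a[0], pair_b[0])
    d_hyp = path_distance(hyp_edges, pair_a[1], pair_b[1])
    if d_text is None or d_hyp is None:
        return 0.0
    # Assumed formula: equal distances give 1.0, larger gaps give less.
    return 1.0 / (1 + abs(d_text - d_hyp))


# Toy graphs: "people" is far from "Singapore" in the text, close in the hypothesis.
TEXT_EDGES = [("hanged", "Singapore"), ("hanged", "report"), ("report", "gives"),
              ("gives", "country"), ("country", "people")]
HYP_EDGES = [("executed", "Singapore"), ("executed", "people")]

distant = edge_weight(TEXT_EDGES, HYP_EDGES,
                      ("Singapore", "Singapore"), ("people", "people"))
close = edge_weight(TEXT_EDGES, HYP_EDGES,
                    ("hanged", "executed"), ("Singapore", "Singapore"))
```

Under this weighting, the criss-cross match between "Singapore" and "people" contributes much less than a match between structurally adjacent nodes.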
Experiences we gained...
Annotation
– Semantic annotation is difficult, time-consuming and expensive
– Frame Semantics works well cross-linguistically
– Complementarity of lexicographic vs. corpus-driven annotation
Automation and Application
– Training automatic frame assignment systems
  • Shalmaneser (Erk and Pado, 2006)
– Experiments in cross-language projection
  • Pado and Lapata (2005, 2006)
– Using frame semantics in NLP tasks:
  • Textual Entailment (Burchardt and Frank 2006)
  • Multi-lingual ontology-based question-answering (Frank et al. 2006)