Feature Extraction with Description Logics Functional Subsumption
Rodrigo de Salvo Braz, Dan Roth
University of Illinois at Urbana-Champaign
A conflict
● Most machine learning algorithms use feature vectors as inputs.
● Most data is best represented as structured data.
● Feature extraction is the conversion from one to the other (and may be most of the work).
Structured data – I
[Figure: graph of word nodes such as word(an) tag(DT), word(Iraqi) tag(JJ) and word(intelligence) tag(NN), connected by begin, end, before and after edges.]
Structured data – II
Feature Extraction
[Figure: a structured example and human-written feature types are fed to FE, which outputs a feature vector, e.g. 0 1 1 0.]
Feature Extraction
● Typically done in ad hoc fashion:
  ● Prevents general analysis;
  ● Prevents Feature Extraction/Learning unified analysis (e.g. kernels).
● Using a language is tricky:
  ● Type of inference.
  ● May be intractable if not careful.
A language for declaring which features to generate Feature type specifications by directed trees Example segment
Generating feature vectors
[Figure sequence: each feature type is matched in turn against the example, with numbered labels (1, 2, 3) on the nodes involved. One feature type has no match ("Nothing like this in the example!") and yields 0. Resulting feature vector: 1 0 1 1.]
Feature Description Logics
(AND (SOME spouse ANY) (SOME child (AND male tall)))
(SOME spouse (SOME friend female))
Subsumption
● A description C subsumes (⊒) a description D if every individual in D must be in C, no matter the interpretation.
● Subsumption is tractable.
C = (AND (SOME spouse ANY) (SOME child male))
D = (AND (SOME spouse (SOME student ANY)) (SOME child (AND tall male)) (SOME child female))
C ⊒ D
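A structural subsumption check for the fragment used on these slides (AND, SOME, ANY and atomic attributes) can be sketched as follows. The nested-tuple encoding and the names `conjuncts` and `subsumes` are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical encoding: "ANY", an atom such as "male",
# ("SOME", role, description), or ("AND", d1, d2, ...).

def conjuncts(d):
    """Flatten a description into its list of top-level conjuncts."""
    if d == "ANY":
        return []
    if isinstance(d, tuple) and d[0] == "AND":
        parts = []
        for sub in d[1:]:
            parts.extend(conjuncts(sub))
        return parts
    return [d]

def subsumes(c, d):
    """Return True if description c subsumes description d."""
    d_parts = conjuncts(d)
    for cc in conjuncts(c):
        if isinstance(cc, tuple):  # cc is ("SOME", role, filler)
            _, role, filler = cc
            # d needs some conjunct on the same role whose filler is subsumed.
            if not any(isinstance(dd, tuple) and dd[1] == role
                       and subsumes(filler, dd[2]) for dd in d_parts):
                return False
        elif cc not in d_parts:    # atomic attribute must appear in d
            return False
    return True

C = ("AND", ("SOME", "spouse", "ANY"), ("SOME", "child", "male"))
D = ("AND",
     ("SOME", "spouse", ("SOME", "student", "ANY")),
     ("SOME", "child", ("AND", "tall", "male")),
     ("SOME", "child", "female"))

c_subsumes_d = subsumes(C, D)
d_subsumes_c = subsumes(D, C)
```

With the C and D from the slide, `c_subsumes_d` holds and `d_subsumes_c` does not, since C says nothing about the spouse having a student.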
Feature extraction as subsumption
Feature type: (SOME child female)
Example, tested node by node against the description of each node:
● (AND (SOME friend (AND name(carol) (SOME child (AND name(kelly) female)))) (SOME child name(john)))
● name(john)
● (AND name(kelly) female)
● (AND name(carol) (SOME child (AND name(kelly) female))): active feature!
A problem in practice
● Feature type: buy, subject dentist, object car. Example: purchase, subject dentist name(patricia), object car model(accord).
● Feature type: kill, subject, object name(JFK). Example: kill, subject name(castro), object name(kennedy).
● Feature type: name(schwarzenegger), job. Example: name(schwarzneger), job actor, job governor.
Subsumption would be natural in each case but does not occur.
Make comparison more flexible
● At the core of the subsumption algorithm is the comparison of attributes:
. . . if (attr1 == attr2) . . .
● We simply make that a function call:
. . . if (f(attr1, attr2) == 1) . . .
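A minimal sketch of this change, assuming the same nested-tuple encoding as a plain structural subsumption check: the only difference is that atomic attributes are compared through a caller-supplied `f` returning 1 or 0. The names `exact_f`, `synonym_f` and `SYNONYMS` are hypothetical, and applying `f` to attributes only (not to role names) is an assumption of this sketch.

```python
def conjuncts(d):
    """Flatten a description into its list of top-level conjuncts."""
    if d == "ANY":
        return []
    if isinstance(d, tuple) and d[0] == "AND":
        return [p for sub in d[1:] for p in conjuncts(sub)]
    return [d]

def subsumes(c, d, f):
    """Functional subsumption: attributes are compared with f instead of ==."""
    d_parts = conjuncts(d)
    for cc in conjuncts(c):
        if isinstance(cc, tuple):  # ("SOME", role, filler)
            _, role, filler = cc
            if not any(isinstance(dd, tuple) and dd[1] == role
                       and subsumes(filler, dd[2], f) for dd in d_parts):
                return False
        # The former `attr1 == attr2` test, now a function call.
        elif not any(isinstance(dd, str) and f(cc, dd) == 1 for dd in d_parts):
            return False
    return True

def exact_f(attr1, attr2):
    """Recovers ordinary subsumption."""
    return 1 if attr1 == attr2 else 0

SYNONYMS = [{"buy", "purchase"}]  # illustrative knowledge source

def synonym_f(attr1, attr2):
    """Treats attributes in the same synonym group as equal."""
    if attr1 == attr2:
        return 1
    return 1 if any(attr1 in g and attr2 in g for g in SYNONYMS) else 0

feature_type = ("AND", "buy", ("SOME", "subject", "dentist"), ("SOME", "object", "car"))
example = ("AND", "purchase",
           ("SOME", "subject", ("AND", "dentist", "name(patricia)")),
           ("SOME", "object", ("AND", "car", "model(accord)")))

plain_result = subsumes(feature_type, example, exact_f)
flexible_result = subsumes(feature_type, example, synonym_f)
```

On the buy/purchase example, `plain_result` is false while `flexible_result` is true, because `synonym_f` lets buy match purchase.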
Is this just a hack? What about the nice DL semantics?
● In fact, equivalent to “shallow OR” (tractable).
● Replace any attr by (OR a1 a2 . . . an) where f(attr, ai) = 1.
(AND kill (SOME object JFK))
becomes
(AND (OR kill murder assassinate) (SOME object (OR JFK kennedy “John F. Kennedy” . . .)))
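The rewriting into shallow OR can be sketched as follows, assuming a finite vocabulary of candidate attributes over which `f` can be enumerated. `VOCAB`, `GROUPS`, `expand` and `subsumes_or` are hypothetical names for this sketch; an OR here only ever contains atomic attributes, which is what keeps it shallow.

```python
GROUPS = [{"kill", "murder", "assassinate"},
          {"JFK", "kennedy", "John F. Kennedy"}]
VOCAB = ["kill", "murder", "assassinate", "JFK", "kennedy", "John F. Kennedy"]

def f(attr1, attr2):
    """Illustrative comparison: 1 if equal or in the same group, else 0."""
    if attr1 == attr2:
        return 1
    return 1 if any(attr1 in g and attr2 in g for g in GROUPS) else 0

def expand(c):
    """Replace every attribute by (OR a1 ... an) with f(attr, ai) = 1."""
    if c == "ANY":
        return c
    if isinstance(c, tuple) and c[0] == "AND":
        return ("AND", *(expand(p) for p in c[1:]))
    if isinstance(c, tuple) and c[0] == "SOME":
        return ("SOME", c[1], expand(c[2]))
    return ("OR", *[a for a in VOCAB if f(c, a) == 1])

def conjuncts(d):
    """Flatten top-level ANDs; OR and SOME terms are kept whole."""
    if d == "ANY":
        return []
    if isinstance(d, tuple) and d[0] == "AND":
        return [p for sub in d[1:] for p in conjuncts(sub)]
    return [d]

def subsumes_or(c, d):
    """Subsumption where c may contain ORs over atomic attributes."""
    d_parts = conjuncts(d)
    for cc in conjuncts(c):
        if isinstance(cc, tuple) and cc[0] == "OR":
            if not any(a in d_parts for a in cc[1:]):
                return False
        elif isinstance(cc, tuple):  # ("SOME", role, filler)
            _, role, filler = cc
            if not any(isinstance(dd, tuple) and dd[0] == "SOME" and dd[1] == role
                       and subsumes_or(filler, dd[2]) for dd in d_parts):
                return False
        elif cc not in d_parts:
            return False
    return True

feature_type = ("AND", "kill", ("SOME", "object", "JFK"))
expanded = expand(feature_type)
example = ("AND", "murder", ("SOME", "object", "kennedy"))
matches = subsumes_or(expanded, example)
```

Here `expanded` has the same shape as the second expression on the slide, and it subsumes the example even though the original feature type, compared by plain equality, would not.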
Why not just use shallow OR then?
● Function is an implicit representation.
● We may incorporate procedural knowledge:
  ● Typos;
  ● Similar sounding words;
  ● Context-sensitive knowledge.
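As one illustration of procedural knowledge, a typo-tolerant comparison function might look like the sketch below. Using the standard library's `difflib.SequenceMatcher` and a similarity threshold of 0.85 are assumptions of this sketch, not choices taken from the slides; enumerating all such near-matches as an explicit OR would not be practical, which is the point of keeping the function implicit.

```python
from difflib import SequenceMatcher

def typo_tolerant_f(attr1, attr2, threshold=0.85):
    """Return 1 if the two attributes are equal or nearly equal as strings.

    The threshold is an arbitrary illustrative value.
    """
    if attr1 == attr2:
        return 1
    ratio = SequenceMatcher(None, attr1.lower(), attr2.lower()).ratio()
    return 1 if ratio >= threshold else 0

same_person = typo_tolerant_f("name(schwarzenegger)", "name(schwarzneger)")
different_people = typo_tolerant_f("name(castro)", "name(kennedy)")
```

A function of this kind handles the misspelled schwarzneger case, but not JFK versus kennedy, which calls for world knowledge rather than string similarity.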
Take home message
● Feature Description Logics provides an expressive way to deal with structured examples.
● Syntax choices render it tractable.
● Allows for FE-learning integrated approaches like kernels (Cumby & Roth 2003).
● Can be made even more expressive with little extra cost by functional subsumption.
The End