Learning Hierarchical Task Networks by Analyzing Expert Traces
Pat Langley, Tolga Konik, Negin Nejati
Institute for the Study of Learning and Expertise, Palo Alto, California
DARPA Integrated Learning, POIROT Project, IL Kickoff Meeting, June 20-21, 2006

Formulation of the Learning Task
Given:
· A set of domain operators with known effects
· A worked-out problem solution that consists of
  · The goal to be achieved in the problem
  · A sequence of operator instances that achieves the goal
  · A related sequence of intermediate problem states
Find: A hierarchical task network that
· Reproduces the solution to the training problem
· Generalizes well to related problems in the domain
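
To make the inputs of the learning task concrete, here is a minimal Python sketch of a worked-out solution and a check that it really reaches its goal. It is only an illustration: the STRIPS-style encoding (precondition, add, and delete sets), the class and function names, and the operator effects chosen for the blocks-world example are assumptions, not part of the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorInstance:
    """A ground operator with known effects (STRIPS-style encoding is an assumption)."""
    name: tuple
    preconditions: frozenset
    add: frozenset
    delete: frozenset


@dataclass(frozen=True)
class SolutionTrace:
    """The 'Given' part of the task: goal, operator instances, intermediate states."""
    goal: tuple
    initial_state: frozenset
    steps: tuple    # operator instances in execution order
    states: tuple   # state observed after each step


def reproduces_goal(trace):
    """Replay the steps and confirm they yield the recorded states and the goal."""
    if len(trace.steps) != len(trace.states):
        return False
    state = trace.initial_state
    for op, observed in zip(trace.steps, trace.states):
        if not op.preconditions <= state:
            return False                      # step not applicable here
        state = (state - op.delete) | op.add  # apply the known effects
        if state != observed:
            return False
    return trace.goal in state


def lits(*items):
    return frozenset(items)


# Hypothetical blocks-world operators for a tower with C on B on A.
unstack_c_b = OperatorInstance(
    ("unstack", "C", "B"),
    lits(("on", "C", "B"), ("clear", "C"), ("hand-empty",)),
    lits(("holding", "C"), ("clear", "B")),
    lits(("on", "C", "B"), ("clear", "C"), ("hand-empty",)))
putdown_c = OperatorInstance(
    ("putdown", "C"),
    lits(("holding", "C")),
    lits(("ontable", "C"), ("clear", "C"), ("hand-empty",)),
    lits(("holding", "C")))
unstack_b_a = OperatorInstance(
    ("unstack", "B", "A"),
    lits(("on", "B", "A"), ("clear", "B"), ("hand-empty",)),
    lits(("holding", "B"), ("clear", "A")),
    lits(("on", "B", "A"), ("clear", "B"), ("hand-empty",)))

s0 = lits(("on", "C", "B"), ("on", "B", "A"), ("ontable", "A"),
          ("clear", "C"), ("hand-empty",))
s1 = lits(("on", "B", "A"), ("ontable", "A"), ("holding", "C"), ("clear", "B"))
s2 = lits(("on", "B", "A"), ("ontable", "A"), ("clear", "B"),
          ("ontable", "C"), ("clear", "C"), ("hand-empty",))
s3 = lits(("ontable", "A"), ("ontable", "C"), ("clear", "C"),
          ("holding", "B"), ("clear", "A"))

trace = SolutionTrace(goal=("clear", "A"), initial_state=s0,
                      steps=(unstack_c_b, putdown_c, unstack_b_a),
                      states=(s1, s2, s3))
```

Calling `reproduces_goal(trace)` replays the three steps against the recorded states. A learned task network would be expected to regenerate the same step sequence from `s0` and the goal.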

The ICARUS Architecture
[Diagram: ICARUS architecture. Memories and buffers: Perceptual Buffer, Conceptual Memory, Skill Memory, Belief Memory, Goal/Intention Memory, Motor Buffer. Processes: Perception, Conceptual Inference, Skill Retrieval and Selection, Skill Execution, Skill Learning. External: Environment.]

Representing Long-Term Structures
ICARUS encodes two forms of general long-term knowledge:
· Conceptual clauses: A set of relational inference rules with perceived objects or defined concepts in their antecedents;
· Skill clauses: A set of executable skills that specify:
  · a head that indicates a goal the skill achieves;
  · a single (typically defined) precondition;
  · a set of ordered subgoals or actions for achieving the goal.
These define a specialized class of hierarchical task networks in a syntax very similar to Nau et al.'s SHOP2 formalism.
Beliefs, goals, and intentions are instances of these structures.

Representing Concepts (Axioms)

Nonprimitive concepts:

    ((in-rightmost-lane ?self ?clane)
     :percepts  ((self ?self) (segment ?seg) (line ?clane segment ?seg))
     :relations ((driving-well-in-segment ?self ?seg ?clane)
                 (last-lane ?clane)
                 (not (lane-to-right ?clane ?anylane))))

    ((driving-well-in-segment ?self ?seg ?lane)
     :percepts  ((self ?self) (segment ?seg) (line ?lane segment ?seg))
     :relations ((in-segment ?self ?seg)
                 (in-lane ?self ?lane)
                 (aligned-with-lane-in-segment ?self ?seg ?lane)
                 (centered-in-lane ?self ?seg ?lane)
                 (steering-wheel-straight ?self)))

Primitive concept:

    ((in-lane ?self ?lane)
     :percepts ((self ?self segment ?seg) (line ?lane segment ?seg dist ?dist))
     :tests    ((> ?dist -10) (<= ?dist 0)))

Representing Skills (Methods)

Nonprimitive skill clauses:

    ((in-rightmost-lane ?self ?line)
     :percepts ((self ?self) (line ?line))
     :start    ((last-lane ?line))
     :subgoals ((driving-well-in-segment ?self ?seg ?line)))

    ((driving-well-in-segment ?self ?seg ?line)
     :percepts ((segment ?seg) (line ?line) (self ?self))
     :start    ((steering-wheel-straight ?self))
     :subgoals ((in-segment ?self ?seg)
                (centered-in-lane ?self ?seg ?line)
                (aligned-with-lane-in-segment ?self ?seg ?line)
                (steering-wheel-straight ?self)))

Primitive skill clause:

    ((in-segment ?self ?endsg)
     :percepts ((self ?self speed ?speed)
                (intersection ?int cross ?cross)
                (segment ?endsg street ?cross angle ?angle))
     :start    ((in-intersection-for-right-turn ?self ?int))
     :actions  ((steer 1)))
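
The three skill clauses on this slide can be held in a small data structure to show how a head, a start condition, and either subgoals or actions fit together. The Python below is a rough sketch: the `SkillClause` class, the one-clause-per-head index, and the `first_path` helper are hypothetical names introduced for illustration, and the helper ignores beliefs and start conditions that real skill selection would check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillClause:
    """One skill clause: a goal head, a start condition, and subgoals or actions."""
    head: tuple
    percepts: tuple = ()
    start: tuple = ()
    subgoals: tuple = ()
    actions: tuple = ()

    @property
    def is_primitive(self):
        # Treated as primitive when it names executable actions (an assumption).
        return bool(self.actions)


SKILLS = [
    SkillClause(
        head=("in-rightmost-lane", "?self", "?line"),
        percepts=(("self", "?self"), ("line", "?line")),
        start=(("last-lane", "?line"),),
        subgoals=(("driving-well-in-segment", "?self", "?seg", "?line"),)),
    SkillClause(
        head=("driving-well-in-segment", "?self", "?seg", "?line"),
        percepts=(("segment", "?seg"), ("line", "?line"), ("self", "?self")),
        start=(("steering-wheel-straight", "?self"),),
        subgoals=(("in-segment", "?self", "?seg"),
                  ("centered-in-lane", "?self", "?seg", "?line"),
                  ("aligned-with-lane-in-segment", "?self", "?seg", "?line"),
                  ("steering-wheel-straight", "?self"))),
    SkillClause(
        head=("in-segment", "?self", "?endsg"),
        percepts=(("self", "?self", "speed", "?speed"),
                  ("intersection", "?int", "cross", "?cross"),
                  ("segment", "?endsg", "street", "?cross", "angle", "?angle")),
        start=(("in-intersection-for-right-turn", "?self", "?int"),),
        actions=(("steer", 1),)),
]

# Simplification: one clause per head predicate. Several clauses could share a head.
INDEX = {clause.head[0]: clause for clause in SKILLS}


def first_path(goal_name, index):
    """Follow the first subgoal that has its own clause until a primitive skill."""
    path = [goal_name]
    clause = index[goal_name]
    while not clause.is_primitive:
        known = [g[0] for g in clause.subgoals if g[0] in index]
        if not known or known[0] in path:
            break  # no further clause, or a cycle: stop rather than loop
        path.append(known[0])
        clause = index[known[0]]
    return path
```

With these definitions, `first_path("in-rightmost-lane", INDEX)` descends through `driving-well-in-segment` to the primitive `in-segment` clause, which gives a feel for how a goal at the top is connected to an executable action at the bottom.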

Hierarchical Structure of Memory
ICARUS organizes both concepts and skills in a hierarchical manner.
· Concepts: Each concept is defined in terms of other concepts and/or percepts.
· Skills: Each skill is defined in terms of other skills, concepts, and percepts.

Hierarchical Structure of Memory
ICARUS interleaves its long-term memories for concepts and skills.
For example, the skill highlighted here refers directly to the highlighted concepts.
[Figure: concept and skill hierarchies, with one skill and the concepts it refers to highlighted.]

Basic ICARUS Processes
ICARUS matches patterns to recognize concepts and select skills.
· Concepts are matched bottom up, starting from percepts.
· Skill paths are matched top down, starting from intentions.
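
Bottom-up concept matching can be approximated by forward chaining to a fixed point. The sketch below is a simplified illustration and not the ICARUS matcher itself: the function names are hypothetical, the lower-level relations are supplied directly as starting beliefs rather than computed from percepts, and the negated condition of `in-rightmost-lane` is deliberately left out because the sketch does not handle negation.

```python
def match(pattern, fact, bindings):
    """Match one pattern against one ground fact, extending the bindings."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if result.setdefault(p, f) != f:
                return None   # variable already bound to something else
        elif p != f:
            return None
    return result


def satisfy(conditions, facts, bindings):
    """Yield every set of bindings that satisfies all conditions."""
    if not conditions:
        yield bindings
        return
    first, rest = conditions[0], conditions[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from satisfy(rest, facts, extended)


def infer(starting_beliefs, clauses):
    """Add concept instances bottom up until nothing new can be derived."""
    beliefs = set(starting_beliefs)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            # list(...) finishes matching before the belief set is modified
            for b in list(satisfy(body, beliefs, {})):
                instance = tuple(b.get(term, term) for term in head)
                if instance not in beliefs:
                    beliefs.add(instance)
                    changed = True
    return beliefs


# Positive relations only; the higher-level concept is listed first on purpose,
# so it can only fire on a later pass, after the lower one has been derived.
CLAUSES = [
    (("in-rightmost-lane", "?self", "?clane"),
     [("driving-well-in-segment", "?self", "?seg", "?clane"),
      ("last-lane", "?clane")]),
    (("driving-well-in-segment", "?self", "?seg", "?lane"),
     [("in-segment", "?self", "?seg"),
      ("in-lane", "?self", "?lane"),
      ("aligned-with-lane-in-segment", "?self", "?seg", "?lane"),
      ("centered-in-lane", "?self", "?seg", "?lane"),
      ("steering-wheel-straight", "?self")]),
]

STARTING_BELIEFS = {
    ("in-segment", "me", "s1"),
    ("in-lane", "me", "l1"),
    ("aligned-with-lane-in-segment", "me", "s1", "l1"),
    ("centered-in-lane", "me", "s1", "l1"),
    ("steering-wheel-straight", "me"),
    ("last-lane", "l1"),
}

beliefs = infer(STARTING_BELIEFS, CLAUSES)
```

The repeated passes are what make the process bottom up: `driving-well-in-segment` is derived first, and only then does `in-rightmost-lane` become derivable from it.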

Impasse-Driven Analytical Learning
[Diagram: labels are Problem (Initial State, Goal), Skill Hierarchy, Reactive Execution, If Impasse, Expert's Primitive Skill Sequence, Effects of Primitive Skills, Analytical Learning, Learned Skills.]

Learning HTNs by Trace Analysis
[Figure with layers labeled: concepts, primitive skills.]

Learning HTNs by Trace Analysis: Skill Chaining
[Figure illustrating skill chaining, with layers labeled: concepts, primitive skills.]

Learning HTNs by Trace Analysis: Concept Chaining
[Figure illustrating concept chaining, with layers labeled: concepts, primitive skills.]

Constructing an Explanation
[Figure: explanation of a blocks-world trace with blocks A, B, and C.
 Concepts: (clear A), (unstackable B A), (on B A), (clear B), (hand-empty), (unstackable C B), (putdownable C).
 Primitive skills: (unstack C B), (putdown C), (unstack B A).]

From an Explanation to an HTN
[Figure: the same blocks-world explanation with blocks A, B, and C.
 Concepts: (clear A), (unstackable B A), (on B A), (clear B), (hand-empty), (unstackable C B), (putdownable C).
 Primitive skills: (unstack C B), (putdown C), (unstack B A).]

From an Explanation to an HTN
[Figure: the explanation with block names replaced by variables.
 Concepts: (clear ?x), (unstackable ?y ?x), (on ?y ?x), (clear ?y), (hand-empty), (unstackable ?z ?y), (putdownable ?z).
 Primitive skills: (unstack ?z ?y), (putdown ?z), (unstack ?y ?x).]
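
Comparing the two versions of this figure, the block names A, B, and C appear to be replaced consistently by the variables ?x, ?y, and ?z. The Python sketch below shows one simple way to do such a replacement. The `variablize` function and its naming scheme are hypothetical, and the order of the literals is an assumption chosen so that the output matches the variable names on the slide.

```python
from itertools import chain, count


def variablize(literals):
    """Replace each distinct constant with a variable, in order of first use.

    Each literal is a tuple whose first element is the predicate name and
    whose remaining elements are object constants (an assumed encoding).
    """
    fresh = chain(["?x", "?y", "?z"], (f"?v{i}" for i in count(1)))
    mapping = {}
    generalized = []
    for predicate, *args in literals:
        new_args = []
        for arg in args:
            if arg not in mapping:
                mapping[arg] = next(fresh)   # first time this constant is seen
            new_args.append(mapping[arg])
        generalized.append((predicate, *new_args))
    return generalized, mapping


# Ground literals from the blocks-world explanation, listed from the goal down.
EXPLANATION = [
    ("clear", "A"),
    ("unstack", "B", "A"),
    ("unstackable", "B", "A"),
    ("on", "B", "A"),
    ("clear", "B"),
    ("hand-empty",),
    ("unstack", "C", "B"),
    ("unstackable", "C", "B"),
    ("putdown", "C"),
    ("putdownable", "C"),
]

generalized, mapping = variablize(EXPLANATION)
```

Because the mapping is shared across all literals, a block that occurs in several places receives the same variable everywhere, which is what keeps the generalized structure connected.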

Key Ideas of the Approach
Constrained form of hierarchical task networks
· Each skill clause/method has a goal as its head
· Each method has one (possibly defined) precondition
· The resulting semi-lattice makes learning tractable
Learning involves analyzing the expert trace
· Explanation draws on a form of goal regression
· Each step in the explanation becomes an HTN method
· Similar to explanation-based learning for planning but retains the explanation structure
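
Since the explanation is said to draw on a form of goal regression, a short sketch of textbook STRIPS-style regression may help. This is the standard formulation, not necessarily the variant used in this work. The `Op` tuple, the function names, and the blocks-world operator effects are assumptions made for illustration.

```python
from collections import namedtuple

# A ground operator with precondition, add, and delete sets (assumed encoding).
Op = namedtuple("Op", "name pre add delete")


def regress(goals, op):
    """Return what must hold before `op` so that `goals` hold after it."""
    if goals & op.delete:
        raise ValueError(f"{op.name} deletes a goal literal")
    if not goals & op.add:
        return goals            # step contributes nothing: leave goals unchanged
    return (goals - op.add) | op.pre


def regress_through_trace(goal, steps):
    """Regress a single goal literal backward through an operator sequence."""
    goals = frozenset([goal])
    history = [goals]
    for op in reversed(steps):
        goals = regress(goals, op)
        history.append(goals)
    return history


def lits(*items):
    return frozenset(items)


unstack_c_b = Op(("unstack", "C", "B"),
                 lits(("on", "C", "B"), ("clear", "C"), ("hand-empty",)),
                 lits(("holding", "C"), ("clear", "B")),
                 lits(("on", "C", "B"), ("clear", "C"), ("hand-empty",)))
putdown_c = Op(("putdown", "C"),
               lits(("holding", "C")),
               lits(("ontable", "C"), ("clear", "C"), ("hand-empty",)),
               lits(("holding", "C")))
unstack_b_a = Op(("unstack", "B", "A"),
                 lits(("on", "B", "A"), ("clear", "B"), ("hand-empty",)),
                 lits(("holding", "B"), ("clear", "A")),
                 lits(("on", "B", "A"), ("clear", "B"), ("hand-empty",)))

INITIAL_STATE = lits(("on", "C", "B"), ("on", "B", "A"), ("ontable", "A"),
                     ("clear", "C"), ("hand-empty",))

history = regress_through_trace(("clear", "A"),
                                [unstack_c_b, putdown_c, unstack_b_a])
```

Each entry in `history` records the conditions that must hold at one point in the trace for the goal to be reached. The last entry is what the initial state has to satisfy, and the intermediate entries are the kind of subgoal structure an explanation would retain.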

Related Research
Nonincremental, knowledge-lean approaches
· Behavioral cloning (Sammut, 1996; Urbancic & Bratko, 1994)
· Relational induction from traces (e.g., Reddy & Tadepalli, 1997)
Incremental, knowledge-intensive approaches
· Explanation-based learning (e.g., Shavlik, 1989; Mooney, 1990)
· Derivational analogy (e.g., Veloso & Carbonell, 1993)
· Programming by demonstration (e.g., Lau et al., 2003)

Plans for Future Research
· Extend framework to use and learn partial-order skills
· Augment approach to use known subtasks during learning
· Extend method to learn skills with negated goals and subgoals
· Modify approach to handle partially observable traces
· Extend system to learn skills with uncertain outcomes

End of Presentation