Cumulative Learning of Relational and Hierarchical Skills from Problem Solving
Pat Langley
Institute for the Study of Learning and Expertise
Palo Alto, CA
http://www.isle.org
This research was funded by Grant HR0011-04-1-0008 from the DARPA Information Processing Technology Office, which may not agree with the points made in this talk.

Research Objectives
We are designing and implementing new learning methods that:
· operate over relational, hierarchical knowledge structures
· support reasoning, reactive control, and problem solving
· are embedded within a broader architectural framework
· utilize existing knowledge to increase learning rates
· acquire this knowledge in an incremental, cumulative manner
· are applicable to a variety of challenging domains
We hope to develop learning mechanisms that support horizontal and vertical transfer both within and across domains.

The ICARUS Architecture*
[Figure: block diagram with components Environment, Perception, Perceptual Buffer, Categorization and Inference, Long-Term Conceptual Memory, Short-Term Conceptual Memory, Skill Retrieval, Long-Term Skill Memory, Means-Ends Analysis, Goal/Skill Stack, Skill Execution, and Motor Buffer]
* without learning

Organization of Long-Term Memory
ICARUS organizes both concepts and skills in a hierarchical manner.
[Figure: diagram with panels labeled "concepts" and "skills"]
Each concept is defined in terms of other concepts and/or percepts.
Each skill is defined in terms of other skills, concepts, and percepts.

Concepts from In-City Driving Domain
(in-segment (?self ?sg)
 :percepts  ((self ?self segment ?sg)
             (segment ?sg)))

(aligned-with-lane (?self ?lane)
 :percepts  ((self ?self)
             (lane-line ?lane angle ?angle))
 :positives ((in-lane ?self ?lane))
 :tests     ((> ?angle -0.05)
             (< ?angle 0.05)))

(on-street (?self ?packet)
 :percepts  ((self ?self)
             (packet ?packet street ?street)
             (segment ?sg street ?street))
 :positives ((not-delivered ?packet)
             (current-segment ?self ?sg)))

(increasing-direction (?self)
 :percepts  ((self ?self))
 :positives ((increasing ?b1 ?b2))
 :negatives ((decreasing ?b3 ?b4)))

Organization of Long-Term Memory
ICARUS interleaves its long-term memories for concepts and skills.
[Figure: diagram with panels labeled "concepts" and "skills"]
For example, the skill highlighted here refers directly to the highlighted concepts.

Skills from In-City Driving Domain
(turn-around-on-street (?self ?packet)
 :percepts ((self ?self segment ?segment direction ?dir)
            (building ?landmark))
 :start    ((on-street-wrong-direction ?packet))
 :effects  ((on-street-right-direction ?packet))
 :ordered  ((get-in-U-turn-lane ?self)
            (prepare-for-U-turn ?self)
            (steer-for-U-turn ?self ?landmark)))

(get-aligned-in-segment (?self ?sg)
 :percepts ((lane-line ?lane angle ?angle))
 :requires ((in-lane ?self ?lane))
 :effects  ((aligned-with-lane ?self ?lane))
 :actions  ((steer (times ?angle 2))))

(steer-for-right-turn (?self ?int ?endsg)
 :percepts ((self ?self speed ?speed)
            (intersection ?int cross ?cross)
            (segment ?endsg street ?cross angle ?angle))
 :start    ((ready-for-right-turn ?self ?int))
 :effects  ((in-segment ?self ?endsg))
 :actions  ((times steer 2)))

Basic ICARUS Processes
ICARUS matches patterns to recognize concepts and select skills.
[Figure: diagram with panels labeled "concepts" and "skills"]
Concepts are matched bottom up, starting from percepts.
Skill paths are matched top down, starting from intentions.
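
The two matching directions can be sketched in a few lines of Python. This is a simplified, propositional illustration under stated assumptions, not the ICARUS matcher: variables are ignored, the dictionary layouts and function names are hypothetical, and the `facing-away` percept and the effects given to the U-turn subskills are invented for the example.

```python
def infer_beliefs(percepts, concepts):
    """Bottom up: add each concept whose conditions all hold, until a fixpoint."""
    beliefs = set(percepts)
    changed = True
    while changed:
        changed = False
        for name, conditions in concepts.items():
            if name not in beliefs and all(c in beliefs for c in conditions):
                beliefs.add(name)
                changed = True
    return beliefs


def select_skill_path(intention, skills, beliefs):
    """Top down: descend from the intention to an applicable primitive skill."""
    path = []
    current = intention
    while current is not None:
        skill = skills[current]
        if not all(c in beliefs for c in skill["start"]):
            return None  # start conditions unmet, so this path is not applicable
        path.append(current)
        if not skill["subskills"]:
            return path  # reached a primitive skill
        # Descend into the first ordered subskill whose effects do not yet hold.
        current = next(
            (s for s in skill["subskills"]
             if not all(e in beliefs for e in skills[s]["effects"])),
            None,
        )
    return None  # every subskill's effects already hold


# Hypothetical knowledge, loosely named after the driving-domain slides.
concepts = {"on-street-wrong-direction": ["on-street", "facing-away"]}
skills = {
    "turn-around-on-street": {
        "start": ["on-street-wrong-direction"],
        "effects": ["on-street-right-direction"],
        "subskills": ["get-in-U-turn-lane", "prepare-for-U-turn", "steer-for-U-turn"],
    },
    "get-in-U-turn-lane": {"start": [], "effects": ["in-U-turn-lane"], "subskills": []},
    "prepare-for-U-turn": {"start": ["in-U-turn-lane"], "effects": ["ready-for-U-turn"], "subskills": []},
    "steer-for-U-turn": {"start": ["ready-for-U-turn"], "effects": ["on-street-right-direction"], "subskills": []},
}

beliefs = infer_beliefs({"on-street", "facing-away"}, concepts)
print(select_skill_path("turn-around-on-street", skills, beliefs))
```

Rerunning `select_skill_path` after each change in beliefs makes the selected path move through the ordered subskills one at a time, which is the reactive behavior the slide describes.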

A Trace of Means-Ends Problem Solving
An impasse causes ICARUS to invoke a means-ends problem solver.
[Figure: problem-solving trace with nodes numbered 1 to 11]
The resulting traces provide the material for learning new relational skills and concepts in terms of simpler components.
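
For readers unfamiliar with means-ends analysis, the sketch below shows the general idea in Python: pick an operator that achieves the goal, recursively achieve its unmet preconditions, and record each solved subproblem in a trace. It is a generic STRIPS-style illustration, not the ICARUS problem solver, which interleaves problem solving with execution. The operator table, the precondition and effect names, and the function name `means_ends` are assumptions.

```python
def means_ends(state, goal, operators, trace, depth=0, max_depth=10):
    """Achieve one goal literal; return the resulting state, or None on failure."""
    if goal in state:
        return state
    if depth >= max_depth:
        return None
    for name, op in operators.items():
        if goal not in op["adds"]:
            continue  # this operator does not reduce the difference
        mark = len(trace)
        current = state
        for pre in op["preconditions"]:
            # Subgoal on each precondition in turn.
            current = means_ends(current, pre, operators, trace, depth + 1, max_depth)
            if current is None:
                break
        # Recheck preconditions in case a later subgoal undid an earlier one.
        if current is not None and all(p in current for p in op["preconditions"]):
            trace.append((goal, name))
            return (current - op["deletes"]) | op["adds"]
        del trace[mark:]  # discard trace entries from the failed attempt
    return None


# Hypothetical operators named after the U-turn skills on the earlier slide.
operators = {
    "steer-for-U-turn": {
        "preconditions": ["ready-for-U-turn"],
        "adds": {"on-street-right-direction"},
        "deletes": {"on-street-wrong-direction"},
    },
    "prepare-for-U-turn": {
        "preconditions": ["in-U-turn-lane"],
        "adds": {"ready-for-U-turn"},
        "deletes": set(),
    },
    "get-in-U-turn-lane": {
        "preconditions": [],
        "adds": {"in-U-turn-lane"},
        "deletes": set(),
    },
}

trace = []
start = frozenset({"on-street-wrong-direction"})
final = means_ends(start, "on-street-right-direction", operators, trace)
for goal, operator in trace:
    print(goal, "<-", operator)
```

The trace lists subproblems in the order they were solved, deepest first, which is the kind of record the learning slides build on.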

Learning Skills from Means-Ends Traces
[Figure: trace with nodes numbered 1 to 11 and new node A, labeled "concept chaining"]
ICARUS learns skills for ordering subgoals from concept chaining.

Learning Skills from Means-Ends Traces
[Figure: trace with nodes numbered 1 to 11 and new nodes A and B, labeled "skill chaining"]
ICARUS learns skills for ordering subskills from skill chaining.

Learning Skills from Means-Ends Traces
[Figure: trace with nodes numbered 1 to 11 and new nodes A, B, and C, labeled "concept chaining"]
Each level of skill learning builds upon results from prior levels.

Learning Skills from Means-Ends Traces
[Figure: trace with nodes numbered 1 to 11 and new nodes A, B, C, and D, labeled "skill chaining"]
This leads ICARUS to extend its skill hierarchy in a cumulative way.

Learning Skills from Means-Ends Traces
[Figure: trace with nodes numbered 1 to 11 and new nodes A, B, C, D, and E, labeled "concept chaining"]
This in turn supports transfer both within and across problems.
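
The cumulative character of this learning can be illustrated with a small Python sketch that walks a solved trace from the leaves upward and creates one new skill per solved subproblem. It is a loose approximation under stated assumptions, not the ICARUS learning mechanism: the `TraceNode` structure, its field names, and the example trace are hypothetical, and details such as how start conditions are computed for the new skills are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class TraceNode:
    goal: str                     # the goal this subproblem achieved
    chaining: str                 # "skill", "concept", or "primitive"
    children: list = field(default_factory=list)  # subproblems, in the order solved
    final_skill: str = ""         # skill executed last (skill chaining only)


def learn_skills(node, learned):
    """Post-order walk: lower subproblems yield skills before their parents do."""
    for child in node.children:
        learn_skills(child, learned)
    if node.chaining != "primitive":
        ordered = [child.goal for child in node.children]
        if node.final_skill:
            ordered.append(node.final_skill)
        # The new skill is indexed by the goal it achieves and lists its
        # components in the order in which the problem solver achieved them.
        learned.append({"goal": node.goal, "source": node.chaining, "ordered": ordered})
    return learned


# Hypothetical trace for the U-turn example.
example_trace = TraceNode(
    goal="on-street-right-direction",
    chaining="skill",
    children=[
        TraceNode(
            goal="ready-for-U-turn",
            chaining="skill",
            children=[TraceNode("in-U-turn-lane", "primitive")],
            final_skill="prepare-for-U-turn",
        )
    ],
    final_skill="steer-for-U-turn",
)

for skill in learn_skills(example_trace, []):
    print(skill)
```

Because the skill for the top-level goal refers to ready-for-U-turn by name, it reuses the skill learned one level down, mirroring how each level builds on the results of prior levels.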

Transfer Results in FreeCell
FreeCell is a complex solitaire game in which all cards are visible.
We let ICARUS practice on versions with a small set of cards, then examined its transfer to problems with more cards.

Transfer Results in FreeCell
Experiments revealed substantial transfer to the harder problems.
This held both for the percentage of problems solved and for the effort required on successful attempts.

Directions for Future Research
Our initial results suggest ICARUS can transfer knowledge learned on simple problems to complex ones from the same domain.
In future work, we intend to examine the additional issues of:
· vertical transfer to domains that utilize others as components;
· horizontal transfer to domains that share knowledge elements;
· horizontal transfer to tasks that require representation mapping.
The final problem is a key challenge in developing robust methods for reusing learned knowledge.
We hope to evaluate our ideas on both action-oriented domains like strategy games and inferential tasks like physics problems.

The General Game-Playing Testbed
Genesereth and Love (2005) have developed a framework that:
· supports a wide variety of N-person games;
· describes each game setting in a standard logical formalism;
· specifies the rules of each game in a related formalism;
· manages matches between players and records activities;
· provides sample games for debugging candidate systems.
They have designed this framework to encourage research on general approaches to intelligent behavior.
However, it also provides an excellent testbed for evaluating the ability of learning systems to transfer within and across domains.
See http://games.stanford.edu for more details and examples.