Transfer Learning Evaluation Workshop, May 2, 2006
Formalizing Transfer Learning Evaluation with Relational Nets
Nathaniel Love, Michael Genesereth (PI), Charles Petrie
Stanford University
{natlove, genesereth, petrie}@stanford.edu
NRL Grant #N00173-05-1-G033, FY06-FY08

Evaluating General AI Techniques
• Agents are given a declarative description of their environment and their goals
• Agents plan, act, and accomplish these goals
What should this declarative description look like? How do we communicate it to the agents? How do they use it?

Evaluating Transfer Learning
• The goal of the Transfer Learning Program is to … enable computers to apply knowledge learned for a particular, original set of tasks to achieve superior performance on new, previously unseen tasks.
What is the relationship between these two sets of tasks? How might a transfer learning system understand how to apply source knowledge to the target?

Possible Solution
• Keep source and target problems within the same domain
  – Urban Combat, Stratagus, Robocup, …
• Tasks in target problems are new, but general characteristics of the target problem can be assumed.
• Define transfer levels as they make sense for this domain.

Concern
• How do we compare results across domains?
  – If a TL system exhibits level 6 transfer in Robocup, can we expect that it will exhibit level 6 transfer in Urban Combat?

Formal Model for TL Evaluation
• Source-target structural relationships form the basis of any transfer level definition.
• Provide a mathematical framework formalizing these source-target relationships.
• Need a framework that applies
  – Across domains
  – Across transfer level definitions

Modeling Problem Environments
• We are concerned with finite, discrete, synchronous, multi-agent environments
• Representations
  – State machines (too large)
  – Propositional Nets* (better)
  – Relational Nets*
• Language
  – GDL, a language for representing Relational Nets: http://logic.stanford.edu/reports/LG-2006-01.pdf
* forthcoming in http://logic.stanford.edu/reports/

Relational Nets (Informally)
• Describe problem environments in terms of
  – Objects in the world
  – Relationships among those objects
  – How those relationships change in response to agent action
• Defined in terms of the environment
  – Agent goals
  – Terminating conditions
  – Constraints on agent action

Relational Net Transitions
[Figure: diagram of a relational net with parts labeled Objects and Relations, Update, and Views; facts shown include p(a), p(b), p(c), q(a, c), q(c, d), g(c), g(d).]

State Graph
• Nodes: states of the Relational Net
• Edges: joint actions made by all agents
[Figure: state graph; facts shown in the nodes include p(a), p(b), p(c), p(d), q(a, c), q(c, c), q(c, d).]

Relational Nets (Formally)
• A relational net is a structure ⟨R, C, T, L⟩ where:
  – R is a finite set of relations.
  – C is a finite set of objects.
  – T is a set of transitions, defining the next state in terms of the current state.
  – L is a set of views, each defining a concept in terms of the facts of the current state.
• R = R_B ∪ R_I ∪ R_V
• R_V contains:
  – goal(agent, value)
  – legal(agent, action)
  – terminal
  – initial
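The four-part structure above can be sketched in Python. This is a minimal illustration only, assuming that a fact is a tuple such as ("p", "a") and that a state is the set of facts that currently hold. The class name RelationalNet, its field and method names, and the toy "advance" physics are all assumptions made for this example; they are not taken from GDL or from the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

Fact = Tuple[str, ...]          # ("p", "a") stands for p(a)
State = FrozenSet[Fact]         # a state is the set of facts that hold
JointAction = Tuple[str, ...]   # one action per agent

@dataclass
class RelationalNet:
    relations: FrozenSet[str]   # R
    objects: FrozenSet[str]     # C
    # T: one update function per relation (hypothetical encoding)
    transitions: Dict[str, Callable[[State, JointAction], FrozenSet[Fact]]]
    # L: each view derives facts from the current state
    views: Dict[str, Callable[[State], FrozenSet[Fact]]]

    def step(self, state: State, joint_action: JointAction) -> State:
        """Next state: the union of what every transition produces."""
        next_facts = set()
        for update in self.transitions.values():
            next_facts |= update(state, joint_action)
        return frozenset(next_facts)

    def derived(self, state: State) -> FrozenSet[Fact]:
        """Facts defined by the views in the given state."""
        return frozenset().union(*(view(state) for view in self.views.values()))

def update_p(state: State, joint_action: JointAction) -> FrozenSet[Fact]:
    # Assumed toy physics: "advance" turns p(a) into p(b); otherwise p persists.
    if "advance" in joint_action and ("p", "a") in state:
        return frozenset({("p", "b")})
    return frozenset(f for f in state if f[0] == "p")

def terminal_view(state: State) -> FrozenSet[Fact]:
    # Assumed toy view: the game is terminal once p(b) holds.
    return frozenset({("terminal",)}) if ("p", "b") in state else frozenset()

net = RelationalNet(
    relations=frozenset({"p", "terminal"}),
    objects=frozenset({"a", "b"}),
    transitions={"p": update_p},
    views={"terminal": terminal_view},
)
s0 = frozenset({("p", "a")})
s1 = net.step(s0, ("advance",))
```

Here step plays the role of T and derived plays the role of L; the split of R into R_B, R_I, and R_V is not modeled.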

Back to Transfer Learning
• We can use the mathematical structure of relational nets to formalize the definitions of transfer learning levels.
• Critical terminology:
  – Components
  – Configurations
  – Solutions / sub-solutions / action sequences

Components
• A relation r in R_B, a subset {c_i} ⊆ C, and the transitions {t_r} in T that define r.
• Example (FPS game):
  – Relation: inventory
  – Objects: pistol, clip, flashlight, …
  – Physics: changes in inventory caused by agent actions: drop, pickup, load, …
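One way to hold a component as data is an immutable record, so that two components with the same relation, objects, and transitions compare equal. The sketch below is illustrative: the Component class and its field names are assumptions, and each transition is identified simply by the name of the action that triggers it, which is a simplification of the slide's definition.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Component:
    relation: str                 # the relation r
    objects: FrozenSet[str]       # the subset {c_i} of C
    transitions: FrozenSet[str]   # the transitions {t_r}, named by action (assumed)

# The FPS example from the slide; the elided items ("…") are left out.
inventory = Component(
    relation="inventory",
    objects=frozenset({"pistol", "clip", "flashlight"}),
    transitions=frozenset({"drop", "pickup", "load"}),
)

# The same component, built independently, is equal and hashes identically.
inventory_again = Component(
    relation="inventory",
    objects=frozenset({"flashlight", "clip", "pistol"}),
    transitions=frozenset({"load", "pickup", "drop"}),
)
distinct_components = {inventory, inventory_again}
```

Value equality is the useful property here: it lets components from different problems be compared with ordinary set operations.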

Configurations
• Elements of L defining initial, terminal, goal, legal
• Example (FPS game):
  – Initial state: location(12, 4), inventory(gun), …
  – Goal for agent: inventory(flag)
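A configuration can be sketched as a record bundling the four elements. In the Python sketch below, only the initial facts and the goal condition come from the slide; the Configuration class, the goal value of 100, the terminal condition, and the rule that "fire" requires a gun are all assumptions added to make the example complete.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

Fact = Tuple
State = FrozenSet[Fact]

@dataclass(frozen=True)
class Configuration:
    initial: State
    goal: Callable[[State, str], int]          # goal(agent, value)
    terminal: Callable[[State], bool]
    legal: Callable[[State, str, str], bool]   # legal(agent, action)

def goal(state: State, agent: str) -> int:
    # From the slide: the agent's goal is inventory(flag). The value 100 is assumed.
    return 100 if ("inventory", "flag") in state else 0

def terminal(state: State) -> bool:
    # Assumption: the scenario ends once the flag is in the inventory.
    return ("inventory", "flag") in state

def legal(state: State, agent: str, action: str) -> bool:
    # Assumption: "fire" needs a gun; every other action is always legal.
    return action != "fire" or ("inventory", "gun") in state

fps = Configuration(
    initial=frozenset({("location", 12, 4), ("inventory", "gun")}),
    goal=goal,
    terminal=terminal,
    legal=legal,
)
won = fps.initial | {("inventory", "flag")}
```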

State Graphs
• Solutions, sub-solutions, and action sequences can all be expressed as paths, abstract paths, or sets of paths in the state graph.
• Example action sequence (FPS game): fire, reload
[Figure: state graph fragment with states ammo: 0 and ammo: 1 and edges labeled fire and reload.]
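The idea that an action sequence is a path in the state graph can be sketched as follows. The edge directions (fire takes ammo: 1 to ammo: 0, reload takes ammo: 0 back to ammo: 1) are an assumption read into the example, and the graph is single-agent for brevity, so each edge carries one action rather than a joint action. The function name follow is hypothetical.

```python
from typing import Dict, List, Tuple

State = Tuple[str, int]
Graph = Dict[State, Dict[str, State]]   # state -> {action: next state}

# Assumed edge directions for the two-state ammo example.
graph: Graph = {
    ("ammo", 1): {"fire": ("ammo", 0)},
    ("ammo", 0): {"reload": ("ammo", 1)},
}

def follow(graph: Graph, start: State, actions: List[str]) -> List[State]:
    """Return the path of states visited by an action sequence.

    Raises ValueError if an action has no outgoing edge in the current state.
    """
    path = [start]
    for action in actions:
        edges = graph[path[-1]]
        if action not in edges:
            raise ValueError(f"{action!r} is not available in state {path[-1]}")
        path.append(edges[action])
    return path

path = follow(graph, ("ammo", 1), ["fire", "reload"])
```

A solution would be a path of this kind that ends in a state satisfying the goal; a set of paths would be a collection of such lists.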

Defining Transfer Levels (I)
• Example (Level 6: Composing): New problem instances consist of combinations of components from distinct component sets encountered during training.
• We’ve defined components as distinct pieces of relational nets.
• Let C_S1 … C_Sn be the components appearing in source relational nets {S_1, …, S_m}. Then T is a Level 6 target instance iff each component C_T in T is one of C_S1 … C_Sn.
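The Level 6 condition stated above reduces to a membership test once components can be compared by value. The Python sketch below assumes each relational net is represented only by its set of components; the Component class, the function name is_level6_target, and the example components are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set

@dataclass(frozen=True)
class Component:
    relation: str
    objects: FrozenSet[str]
    transitions: FrozenSet[str]

def is_level6_target(sources: Iterable[Set[Component]], target: Set[Component]) -> bool:
    """True iff every component of the target appears in some source net."""
    seen = set().union(*sources)   # C_S1 … C_Sn pooled over all sources
    return all(component in seen for component in target)

# Hypothetical components for two source problems and two candidate targets.
inventory = Component("inventory", frozenset({"pistol", "clip"}), frozenset({"pickup", "drop"}))
location = Component("location", frozenset({"room1", "room2"}), frozenset({"move"}))
health = Component("health", frozenset({"medkit"}), frozenset({"heal"}))

source_1 = {inventory}
source_2 = {location}
composed_target = {inventory, location}   # only components seen in training
novel_target = {inventory, health}        # contains an unseen component

composed_ok = is_level6_target([source_1, source_2], composed_target)
novel_ok = is_level6_target([source_1, source_2], novel_target)
```

This checks only the iff condition as written; it does not check that the target draws on more than one source, which the informal description of composing suggests.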

Defining Transfer Levels (II)
• Dan Shapiro’s work on refining transfer levels for Urban Combat scenarios: solutions to target problems require sequential combination of solutions to source problems.
• This seems like a new definition for a transfer level, not in the PIP…
• But we already have the tools to model this: it is a condition on the topology of the state graphs of the source and target.
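One simplified reading of "sequential combination" treats each solution as an action sequence and asks whether the target solution is a concatenation of source solutions. The slide states the condition on state-graph topology, so the sketch below is a stand-in at the level of action sequences, not the authors' definition; the function name and the example sequences are hypothetical.

```python
from typing import Sequence, Tuple

def is_sequential_combination(
    target: Sequence[str], sources: Sequence[Tuple[str, ...]]
) -> bool:
    """True iff target can be split into consecutive pieces, each a source solution."""
    n = len(target)
    # reachable[i] means target[:i] is a concatenation of source solutions.
    reachable = [True] + [False] * n
    for end in range(1, n + 1):
        for solution in sources:
            start = end - len(solution)
            if (
                solution
                and start >= 0
                and reachable[start]
                and tuple(target[start:end]) == tuple(solution)
            ):
                reachable[end] = True
                break
    return reachable[n]

# Hypothetical source solutions and candidate target solutions.
source_solutions = [("fire", "pickup"), ("jump", "climb")]
combined = is_sequential_combination(
    ("jump", "climb", "fire", "pickup"), source_solutions
)
not_combined = is_sequential_combination(("jump", "fire", "pickup"), source_solutions)
```

The dynamic program allows a source solution to be reused and the sources to appear in any order; a stricter definition could require each source exactly once.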

UC Transfer Level 6 Sources
[Figure: state graphs of two source problems, Source 1 and Source 2. Recoverable labels: adversary neutralized, flag located, Inventory(flag), fire, pickup, jump, climb, break.]

UC Level 6 Target
[Figure: state graph of the target problem. Recoverable labels: adversary neutralized, flag located, Inventory(flag), fire, pickup, jump, climb, break.]

Benefits of Formalization
• Does a given source/target pair form a challenge problem for level k?
  – TL Evaluators now have a formal answer for this.
• How does a TL system know how to transfer in this setting? (Example: level 6)
  – Examine components of source problems, learn their dynamics and their relationships to goals
  – The target problem, while completely unknown in advance, is guaranteed to be composed of only these components.

Extending Across Evaluation Domains
• “Continuous” time and space?
• No declarative representation of game?
  – Scenarios can be formalized using this framework
• TL systems are already using formal, relational representations internally
  – UCB: relational MDP representation of games for relational reinforcement learning
  – ISLE: building value functions based on concepts defined in terms of the objects and relations in the game worlds
• Similarly, TL Evaluators can gain a unified perspective of our challenge problems across all domains.