Planning Graph Based Reachability Heuristics
Daniel Bryce & Subbarao Kambhampati
IJCAI'07 Tutorial T 12, January 8, 2007
http://rakaposhi.eas.asu.edu/pg-tutorial/
dan.bryce@asu.edu, rao@asu.edu
See Also: http://verde.eas.asu.edu, http://rakaposhi.eas.asu.edu
See our tutorial article in the Winter issue of AI Magazine
January 18, 2007 IJCAI'07 Tutorial T 12
What is Planning?
§ Wikipedia says: Automated planning and scheduling is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex, unknown and have to be discovered and optimized in multidimensional space.
Why Automated Planning?
§ Manual planning is slow and error prone
§ Autonomous agents must plan for themselves
§ Lots of applications
§ Manufacturing: Xerox's next generation copiers, supply chain management
§ Space: Mars Exploration Rovers, Deep Space One
§ Entertainment: non-player characters, narratives
§ Areas of CS: web service composition, database query planning
§ Security: user (hacker) modeling
§ Defense: UAV reconnaissance, logistics, tactical missions
Classical Planning Problem
§ (P, A, I, G)
§ P: set of propositions
§ A: set of actions
§ pre(a) ⊆ P: precondition propositions
§ eff+(a) ⊆ P: positive effects
§ eff-(a) ⊆ P: negative effects
§ I ⊆ P: initial state
§ G ⊆ P: goal propositions
§ Semantics
§ appl(a, s): pre(a) ⊆ s
§ s' = succ(a, s) = (s ∪ eff+(a)) \ eff-(a)
§ s' = succ({a1, a2, …, an}, s) = succ(an, … succ(a2, succ(a1, s)) …)
§ If G ⊆ succ({a1, a2, …, an}, I), then {a1, a2, …, an} is a correct plan
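These semantics can be rendered directly as code. The sketch below is our own, with hypothetical proposition and action names (not part of the slides):

```python
# Classical planning semantics: appl, succ, and plan execution.
def applicable(action, state):
    """appl(a, s): pre(a) is a subset of s."""
    return action["pre"] <= state

def succ(action, state):
    """s' = (s ∪ eff+(a)) \\ eff-(a)."""
    return (state | action["eff+"]) - action["eff-"]

def execute(plan, state):
    """succ(an, ... succ(a2, succ(a1, s)) ...); fail on unmet preconditions."""
    for a in plan:
        if not applicable(a, state):
            raise ValueError("inapplicable action")
        state = succ(a, state)
    return state

# Hypothetical one-action rover-like example.
sample = {"pre": {"at_a", "avail_soil"}, "eff+": {"have_soil"}, "eff-": set()}
s = execute([sample], {"at_a", "avail_soil"})
assert {"have_soil"} <= s  # G ⊆ succ(plan, I): the plan is correct
```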
Why is Planning Hard?
§ Subgoal interactions make planning hard
§ The effect of an action can help or hinder the plan:
§ Enable later actions
§ Support a goal
§ Undo the effect of previous actions
§ Prevent the execution of subsequent actions
§ Plan synthesis must find a sequence of actions that is executable and achieves the goals
§ Classical planning problems:
§ Finite length: NP-hard
§ Indefinite length: PSPACE-complete
§ Beyond classical planning it is more difficult:
§ Non-deterministic planning: EXP-complete to 2-EXP-complete
§ Probabilistic planning: NP^PP-complete to undecidable
Scalability of Planning
§ The problem is search control!!!
§ Before, planning algorithms could synthesize about 6-10 action plans in minutes
§ Significant scaleup in the last 6-7 years: realistic encodings of Munich airport!
§ Now, we can synthesize 100 action plans in seconds.
The primary revolution in planning in recent years has been domain-independent heuristics to scale up plan synthesis
Motivation
§ Ways to improve planner scalability
§ Problem formulation
§ Search space
§ Reachability heuristics
§ Domain (formulation) independent
§ Work for many search spaces
§ Flexible – work with most domain features
§ Complement other scalability techniques
§ Effective!!
Topics
§ Classical Planning – Rao
§ Cost Based Planning – Dan
§ Partial Satisfaction Planning
§ <Break>
§ Non-Deterministic/Probabilistic Planning
§ Resources (Continuous Quantities)
§ Temporal Planning
§ Wrap-up – Rao
Classical Planning
Rover Domain
Classical Planning
§ Relaxed Reachability Analysis
§ Types of Heuristics
§ Level-based
§ Relaxed Plans
§ Mutexes
§ Heuristic Search
§ Progression
§ Regression
§ Plan Space
§ Exploiting Heuristics
Planning Graph and Search Tree
§ Envelope of progression tree (relaxed progression)
§ Proposition lists: union of states at kth level
§ Lower bound reachability information
Level Based Heuristics
§ The distance of a proposition is the index of the first proposition layer in which it appears
§ Proposition distance changes when we propagate cost functions – described later
§ What is the distance of a set of propositions?
§ Set-Level: index of the first proposition layer where all goal propositions appear
§ Admissible
§ Gets better with mutexes, otherwise same as Max
§ Max: maximum proposition distance
§ Sum: summation of proposition distances
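Given a table of first-appearance levels, Max and Sum are one-liners. The table below is a made-up example, not the rover domain; Set-Level would additionally need mutex information (the first layer where all goals appear pairwise non-mutex):

```python
# Level-based heuristics from a proposition -> first-level index table.
lev = {"p": 1, "q": 2, "r": 3}  # hypothetical first-appearance levels

def h_max(goals):
    """Admissible; without mutexes, equals Set-Level."""
    return max(lev[g] for g in goals)

def h_sum(goals):
    """Inadmissible (ignores positive interaction), but often more informative."""
    return sum(lev[g] for g in goals)

goals = {"p", "q", "r"}
assert h_max(goals) == 3
assert h_sum(goals) == 6
```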
Example of Level Based Heuristics
set-level(sI, G) = 3
max(sI, G) = max(2, 3, 3) = 3
sum(sI, G) = 2 + 3 + 3 = 8
Distance of a Set of Literals
§ Sum: h(S) = Σ_{p∈S} lev({p}) — with variants Partition-k, Adjusted Sum, Combo
§ Set-Level: h(S) = lev(S) (admissible) — with variant Set-Level with memos
§ lev(p): index of the first level at which p comes into the planning graph
§ lev(S): index of the first level where all props in S appear non-mutexed
§ If there is no such level, then:
§ If the graph is grown to level-off, then ∞
§ Else k+1 (k is the current length of the graph)
How do Level-Based Heuristics Break?
[Planning graph in which actions B1…B100 each support one of p1…p100 at level 1, and action A needs all of them to give g at level 2]
The goal g is reached at level 2, but requires 101 actions to support it.
Relaxed Plan Heuristics
§ When level does not reflect distance well, we can find a relaxed plan.
§ A relaxed plan is a subgraph of the planning graph, where:
§ Every goal proposition is in the relaxed plan at the last level
§ Every proposition in the relaxed plan has a supporting action in the relaxed plan
§ Every action in the relaxed plan has its preconditions supported
§ Relaxed plans are not admissible, but are generally effective.
§ Finding the optimal relaxed plan is NP-hard, but finding a greedy one is easy. Later we will see how "greedy" can change.
§ Later, in over-subscription planning, we'll see that the effort of finding an optimal relaxed plan is worthwhile
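A minimal sketch of greedy backward extraction, assuming a leveled graph and a precomputed cheapest supporter per proposition (both data structures and the tiny two-step example are our own invention):

```python
# Greedy relaxed-plan extraction (delete effects ignored).
def relaxed_plan(goals, layers, supporters):
    """layers[k]: propositions reachable at level k;
    supporters[p]: (action, preconditions) for p's chosen achiever."""
    plan, needed = set(), set(goals)
    for k in range(len(layers) - 1, 0, -1):
        # support each open subgoal at the first level where it appears
        for p in [p for p in needed if p in layers[k] and p not in layers[k - 1]]:
            action, pre = supporters[p]
            plan.add(action)
            needed.discard(p)
            needed |= pre          # preconditions become new subgoals
    return plan                    # heuristic value = len(plan)

# Hypothetical drive-then-sample graph.
layers = [{"at_a"}, {"at_a", "at_b"}, {"at_a", "at_b", "have_soil"}]
supporters = {"at_b": ("drive", {"at_a"}), "have_soil": ("sample", {"at_b"})}
assert relaxed_plan({"have_soil"}, layers, supporters) == {"drive", "sample"}
```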
Example of Relaxed Plan Heuristic
Identify goal propositions → support goal propositions individually → count actions
RP(sI, G) = 8
Results
[Plots comparing Relaxed Plan and Level-Based heuristics]
Optimizations in Heuristic Computation
§ Taming space/time costs
§ Bi-level planning graph representation
§ Partial expansion of the PG (stop before level-off)
§ It is FINE to cut corners when using the PG for heuristics (instead of search)!!
§ Branching factor can still be quite high
§ Use actions appearing in the PG (complete)
§ Select actions in lev(S) vs. level-off (incomplete)
§ Consider actions appearing in the RP (incomplete)
Adjusting for Negative Interactions
§ Until now we assumed actions only positively interact. What about negative interactions?
§ Mutexes help us capture some negative interactions
§ Types
§ Actions: Interference/Competing Needs
§ Propositions: Inconsistent Support
§ Binary mutexes are the most common and practical
§ (|A| + 2|P|)-ary mutexes would allow us to solve the planning problem with backtrack-free Graphplan search
§ An action layer may have |A| actions and 2|P| noops
§ A serial planning graph assumes all non-noop actions are mutex
Binary Mutexes
[Planning graph P0–P2 for the rover domain illustrating mutex propagation]
§ Interference: sample needs at( ), drive negates at( )
§ Inconsistent Support: have(soil)'s only supporter is mutex with at( )'s only supporter
§ At level 2, have(soil) has a supporter not mutex with a supporter of at( ) – feasible together at level 2
Set-Level(sI, {at( ), have(soil)}) = 2
Max(sI, {at( ), have(soil)}) = 1
Adjusting the Relaxed Plans [AAAI 2000]
§ Start with the RP heuristic and adjust it to take subgoal interactions into account
§ Negative interactions in terms of "degree of interaction"
§ Positive interactions in terms of co-achievement links
§ Ignore negative interactions when accounting for positive interactions (and vice versa)
§ It is NP-hard to find a plan (when there are mutexes), and even NP-hard to find an optimal relaxed plan.
§ It is easier to add a penalty to the heuristic for ignored mutexes:
HAdjSum2M(S) = length(RelaxedPlan(S)) + max_{p,q ∈ S} δ(p, q)
where δ(p, q) = lev({p, q}) – max{lev(p), lev(q)} /* degree of –ve interaction */
Anatomy of a State-space Regression Planner [AAAI 2000; AIPS 2000; AIJ 2002; JAIR 2003]
Problem: Given a set of subgoals (regressed state), estimate how far they are from the initial state
Rover Example in Regression
[Regression from sG = comm(soil) through states s1…s4 via commun and sample actions]
Sum(sI, s4) = 0+0+1+1+2 = 4
Sum(sI, s3) = 0+1+2+2 = 5
Should be ∞ – s4 is inconsistent; how do we improve the heuristic?
AltAlt Performance
[Plots: Level-based vs. Adjusted RP on Logistics and Scheduling problem sets from IPC 2000]
Plan Space Search
In the beginning it was all POP. Then it was cruelly UnPOPped. The good times return with Re(vived)POP.
POP Algorithm
1. Plan Selection: Select a plan P from the search queue
2. Flaw Selection: Choose a flaw f (open condition or unsafe link)
3. Flaw Resolution:
   If f is an open condition, choose an action S that achieves f
   If f is an unsafe link, choose promotion or demotion
   Update P
   Return NULL if no resolution exists
4. If there is no flaw left, return P
§ Initial plan: S0 → Sinf with open conditions g1, g2
§ Plan refinement (flaw selection and resolution) adds steps (e.g. S1, S3) and causal links for g1, g2, p, q1
Choice points:
• Flaw selection (open condition? unsafe link? non-backtrack choice)
• Flaw resolution/plan selection (how to select (rank) a partial plan?)
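The open-condition case of the loop above can be sketched in a few lines. This is a deliberately stripped-down version: unsafe links (promotion/demotion) and backtracking are omitted, and the two-action domain is hypothetical:

```python
# Minimal POP-style flaw-resolution loop, open conditions only.
def pop(goals, init, actions):
    """Return (steps, causal links) for a threat-free toy problem."""
    steps = {"S0": init, "Sinf": set()}
    links = []
    agenda = [(g, "Sinf") for g in goals]       # open-condition flaws
    while agenda:
        cond, consumer = agenda.pop()           # flaw selection
        if cond in init:                        # the initial step achieves it
            links.append(("S0", cond, consumer))
            continue
        # flaw resolution: pick any action whose effects achieve cond
        name, (pre, eff) = next((n, a) for n, a in actions.items() if cond in a[1])
        steps[name] = eff
        links.append((name, cond, consumer))
        agenda += [(p, name) for p in pre]      # its preconditions are new flaws
    return steps, links

actions = {"sample": ({"at_b"}, {"have_soil"}), "drive": ({"at_a"}, {"at_b"})}
steps, links = pop({"have_soil"}, {"at_a"}, actions)
assert set(steps) == {"S0", "Sinf", "sample", "drive"}
```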
PG Heuristics for Partial Order Planning
§ Distance heuristics to estimate cost of partially ordered plans (and to select flaws)
§ If we ignore negative interactions, then the set of open conditions can be seen as a regression state
§ Mutexes used to detect indirect conflicts in partial plans
§ A step threatens a link if there is a mutex between the link condition and the step's effect or precondition
§ Post disjunctive precedences and use propagation to simplify
Regression and Plan Space
[Side-by-side comparison: the regression states s4…sG from the rover example, and the corresponding partial-order plans with steps S0…S4 and causal links for have(rock), comm(rock), have(image), comm(image)]
RePOP's Performance
§ RePOP implemented on top of UCPOP
§ Dramatically better than any other partial order planner before it
§ Competitive with Graphplan and AltAlt
§ VHPOP carried the torch at IPC 2002
Written in Lisp, runs on Linux, 500 MHz, 250 MB
You see, pop, it is possible to Re-use all the old POP work! [IJCAI, 2001]
Exploiting Planning Graphs
§ Restricting action choice
§ Use actions from:
§ Last level before level-off (complete)
§ Last level before goals (incomplete)
§ First level of relaxed plan (incomplete) – FF's helpful actions
§ Only action sequences in the relaxed plan (incomplete) – YAHSP
§ Reducing state representation
§ Remove static propositions: a static proposition is one that is only ever true, or only ever false, in the last proposition layer.
Classical Planning Conclusions
§ Many heuristics
§ Set-Level, Max, Sum, Relaxed Plans
§ Heuristics can be improved by adjustments
§ Mutexes
§ Useful for many types of search
§ Progression, Regression, POCL
Cost-Based Planning
Cost-based Planning
§ Propagating Cost Functions
§ Cost-based Heuristics
§ Generalized Level-based heuristics
§ Relaxed Plan heuristics
Rover Cost Model
Cost Propagation
[Leveled planning graph P0–P4 for the rover cost model, with cost-annotated actions and propositions: e.g. sample(soil, ) costs 20, the drive actions cost 10–40, commun(soil) costs 25; propagated proposition costs include have(soil) = 20, have(rock) = 35, comm(soil) = 25]
§ A proposition's cost reduces because of a different supporter at a later level: have(image) drops from 35 at level 2 to 30 at level 3, and comm(image) from 40 to 35 with 1-lookahead
Terminating Cost Propagation
§ Stop when:
§ goals are reached (no-lookahead)
§ costs stop changing (∞-lookahead)
§ k levels after goals are reached (k-lookahead)
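One way to sketch the propagation with an ∞-lookahead (run-to-fixpoint) termination test. The max-aggregation over preconditions and the two-action fragment are our own choices, not the rover model from the slides:

```python
# Cost propagation to a fixpoint: a proposition's cost is the cheapest
# supporter; an action's cost is its own cost plus the max precondition cost.
import math

def propagate(init_props, actions, max_iters=100):
    cost = {p: 0 for p in init_props}
    for _ in range(max_iters):
        changed = False
        for name, (pre, eff, c) in actions.items():
            if all(p in cost for p in pre):
                a_cost = c + max((cost[p] for p in pre), default=0)
                for p in eff:
                    if a_cost < cost.get(p, math.inf):
                        cost[p] = a_cost   # a cheaper supporter appeared
                        changed = True
        if not changed:                    # costs stopped changing: terminate
            break
    return cost

# Hypothetical fragment: drive costs 10, sample costs 20.
actions = {"drive": ({"at_a"}, {"at_b"}, 10),
           "sample": ({"at_b"}, {"have_soil"}, 20)}
cost = propagate({"at_a"}, actions)
assert cost["have_soil"] == 30
```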
Guiding Relaxed Plans with Costs
Start extraction at the last level (where the goal proposition is cheapest)
Cost-Based Planning Conclusions
§ Cost functions:
§ Remove the false assumption that level is correlated with cost
§ Improve planning with non-uniform cost actions
§ Are cheap to compute (constant overhead)
Partial Satisfaction (Over-Subscription) Planning
Partial Satisfaction Planning
§ Selecting Goal Sets
§ Estimating goal benefit
§ Anytime goal set selection
§ Adjusting for negative interactions between goals
Partial Satisfaction (Oversubscription) Planning
§ In many real world planning tasks, the agent often has more goals than it has resources to accomplish.
§ Example: Rover Mission Planning (MER)
§ Need automated support for over-subscription/partial satisfaction planning
§ Actions have execution costs, goals have utilities, and the objective is to find the plan that has the highest net benefit.
Adapting PG Heuristics for PSP
§ Challenges:
§ Need to propagate costs on the planning graph
§ The exact set of goals is not clear
§ Interactions between goals
§ The obvious approach of considering all 2^n goal subsets is infeasible
§ Idea: Select a subset of the top level goals upfront
§ Challenge: Goal interactions
§ Approach: Estimate the net benefit of each goal in terms of its utility minus the cost of its relaxed plan
§ Bias the relaxed plan extraction to (re)use the actions already chosen for other goals
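A greedy sketch of upfront goal-set selection by net benefit. The utilities and relaxed-plan cost estimates loosely echo the rover numbers used in this tutorial, but the unbiased per-goal treatment here is a simplification (a biased RP would discount actions shared with already-selected goals):

```python
# Greedy goal-set selection: keep each goal whose estimated net benefit
# (utility minus relaxed-plan cost) is positive.
def select_goals(utility, rp_cost):
    chosen, benefit = set(), 0
    for g in sorted(utility, key=lambda g: utility[g] - rp_cost[g], reverse=True):
        gain = utility[g] - rp_cost[g]
        if gain > 0:
            chosen.add(g)
            benefit += gain
    return chosen, benefit

utility = {"comm_soil": 50, "comm_rock": 60, "comm_image": 20}
rp_cost = {"comm_soil": 25, "comm_rock": 40, "comm_image": 35}
chosen, benefit = select_goals(utility, rp_cost)
assert chosen == {"comm_soil", "comm_rock"}  # comm_image: 20 - 35 < 0, dropped
assert benefit == 45
```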
Goal Set Selection in Rover Problem
[Incremental goal-set selection:]
§ Found by RP: comm(soil): 50 − 25 = 25; adding comm(rock): 60 − 40 = 20, for 110 − 65 = 45; comm(image): 20 − 35 = −15
§ Found by cost propagation: comm(image): 70 − 60 = 20
§ Found by biased RP: comm(image): 130 − 100 = 30
SapaPS (Anytime Goal Selection)
§ A* progression search
§ g-value: net benefit of plan so far
§ h-value: relaxed plan estimate of best goal set
§ Relaxed plan found for all goals
§ Iterative goal removal, until net benefit does not increase
§ Returns plans with increasing g-values.
Some Empirical Results for AltAltps [AAAI 2004]
[Plots; exact algorithms based on MDPs don't scale at all]
Adjusting for Negative Interactions (AltWlt)
§ Problem:
§ What if the a priori goal set is not achievable because of negative interactions?
§ What if the greedy algorithm gets a bad local optimum?
§ Solution:
§ Do not consider mutex goals
§ Add a penalty for goals whose relaxed plan has mutexes:
§ max_{g1,g2 ∈ G} {lev({g1, g2}) – max(lev(g1), lev(g2))}
§ Use the interaction factor to adjust cost, similar to the adjusted sum heuristic
§ Find the best goal set for each goal
Goal Utility Dependencies
[Two example plans: Cost 50/Utility 0 (BAD) vs. Cost 100/Utility 500 (GOOD)]
§ High-res photo utility: 150; soil sample utility: 100; both goals: additional 200
§ High-res photo utility: 150; low-res photo utility: 100; both goals: remove 80
Dependencies
§ Goal interactions exist as two distinct types:
§ Cost dependencies: goals share actions in the plan trajectory; defined by the plan. Exists in classical planning, but is a larger issue in PSP. Handled by all PSP planners (AltWlt, Optiplan, SapaPS)
§ Utility dependencies: goals may interact in the utility they give; explicitly defined by the user. No need to consider this in classical planning. Handled by SPUDS, our planner based on SapaPS
See talk Thurs. (11th) in session G4 (Planning and Scheduling II)
PSP Conclusions
§ Goal set selection
§ A priori for regression search
§ Anytime for progression search
§ Both types of search use greedy goal insertion/removal to optimize the net benefit of relaxed plans
Non-Deterministic Planning
Non-Deterministic Planning
§ Belief State Distance
§ Multiple Planning Graphs
§ Labelled Uncertainty Graph
§ Implicit Belief States and the CFF Heuristic
Conformant Rover Problem
Search in Belief State Space
[Progression search tree over belief states for the conformant rover problem: each node is a set of states over avail(soil, ), at( ), have(soil), expanded by sample(soil, ) and drive( , ) actions]
Belief State Distance [ICAPS 2004]
§ Compute classical planning distance measures between the member states of belief states BS1, BS2 and the goal belief BS3
§ Assume we can reach the closest goal state
§ Aggregate the state distances
[Example per-state distances 5–40, aggregated: Max = 15, 20; Sum = 20, 40; Union = 17, 20]
Belief State Distance [ICAPS 2004]
§ Estimate plans for each state pair
§ Capture positive interaction & independence
[The member states' relaxed plans share actions; their step-wise union gives Union = 17]
State Distance Aggregations [Bryce et al., 2005]
§ Sum: over-estimates
§ Max: under-estimates
§ Union: just right
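Representing each member state's relaxed plan as a set of actions makes the three aggregations concrete. The per-state plans below are invented; the point is that Sum counts shared actions twice, Max ignores the extra states, and the union captures overlap:

```python
# Aggregating the classical distances of a belief state's member states.
def agg_max(plans):
    """Under-estimates: only the hardest single state counts."""
    return max(len(p) for p in plans)

def agg_sum(plans):
    """Over-estimates: ignores positive interaction (shared actions)."""
    return sum(len(p) for p in plans)

def agg_union(plans):
    """Step-wise union: shared actions are counted once."""
    return len(set().union(*plans))

plans = [{"a1", "a3", "a4", "a5"},   # relaxed plan for state 1
         {"a1", "a3", "a7"}]         # relaxed plan for state 2
assert agg_max(plans) == 4
assert agg_sum(plans) == 7
assert agg_union(plans) == 5         # a1, a3 are shared
```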
Extract Relaxed Plans from Multiple Planning Graphs
§ Build a planning graph for each state in the belief state
§ Extract a relaxed plan from each
§ Step-wise union the relaxed plans
[Rover example over graphs P0–P3 with drive, sample(soil, ), and commun(soil) actions: h = 7]
Labelled Uncertainty Graph
§ Labels correspond to sets of states, and are represented as propositional formulas
§ Action labels are the conjunction (intersection) of their precondition labels, e.g. at( ) ∧ (avail(soil, ) ∨ avail(soil, )) ∧ ¬at( ) ∧ …
§ Effect labels are the disjunction (union) of supporter labels
§ Stop when the goal is labeled with every state
Labelled Relaxed Plan
[Relaxed plan extracted from the LUG over P0–P3: h = 6]
§ Subgoals and supporters need not be used for every state where they are reached (labeled)
§ Must pick enough supporters to cover the (sub)goals
Comparison of Planning Graph Types [JAIR, 2006]
§ The LUG saves time
§ Sampling more worlds increases the cost of MG
§ Sampling more worlds improves the effectiveness of the LUG
State Agnostic Planning Graphs (SAG)
§ The LUG represents multiple explicit planning graphs
§ SAG uses the LUG to represent a planning graph for every state
§ The SAG is built once per search episode and we can use it for relaxed plans for every search node, instead of building a LUG at every node
§ Extract relaxed plans from the SAG by ignoring planning graph components not labeled by states in our search node.
SAG [AAAI, 2005]
§ Build a LUG for all states (the union of all belief states)
§ Ignore irrelevant labels
§ The largest LUG == all LUGs
[Diagram: transition system with states 1–7 and actions o12…o67, oG reaching goal G]
Belief Space Problems vs. Classical Problems
§ Belief space problems: conformant, conditional
§ Classical problems
Non-Deterministic Planning Conclusions
§ Measure positive interaction and independence between states co-transitioning to the goal via overlap
§ Labeled planning graphs efficiently measure conformant plan distance
§ Conformant planning heuristics work for conditional planning without modification
Stochastic Planning
Stochastic Rover Example [ICAPS 2006]
Search in Probabilistic Belief State Space
[Progression search tree over probabilistic belief states for the rover example: member states carry probabilities (0.4, 0.5, 0.1, …) that are updated by sample(soil, ) and drive( , ) actions, e.g. 0.36 for have(soil) and 0.04 for a failed sample]
Handling Uncertain Actions [ICAPS 2006]
§ Extending the LUG to handle uncertain actions requires a label extension that captures:
§ State uncertainty (as before)
§ Action outcome uncertainty
§ Problem: Each action at each level may have a different outcome. The number of uncertain events grows over time – meaning the number of joint outcomes of events grows exponentially with time
§ Solution: Not all outcomes are important. Sample some of them – keep the number of joint outcomes constant.
Monte Carlo LUG (McLUG) [ICAPS 2006]
§ Use sequential Monte Carlo in the relaxed planning space
§ Build several deterministic planning graphs by sampling states and action outcomes
§ Represent the set of planning graphs using LUG techniques
§ Labels are sets of particles
§ Sample which action outcomes get labeled with particles
§ Bias the relaxed plan by picking actions labeled with the most particles, to prefer more probable support
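The key bookkeeping is labeling each particle with one sampled outcome per uncertain event, so the particle count (and hence label size) stays constant over time. A sketch with a made-up two-outcome action:

```python
# Sample which outcome each particle takes for one uncertain action.
import random

def sample_outcome_labels(outcomes, probs, n_particles, seed=0):
    """Return {outcome: set of particle ids} with a fixed particle budget."""
    rng = random.Random(seed)
    labels = {o: set() for o in outcomes}
    for particle in range(n_particles):
        outcome = rng.choices(outcomes, weights=probs)[0]
        labels[outcome].add(particle)
    return labels

# Hypothetical action outcome distribution, 16 particles.
labels = sample_outcome_labels(["succeed", "fail"], [0.75, 0.25], 16)
# Each particle is labeled with exactly one outcome...
assert sum(len(s) for s in labels.values()) == 16
# ...so the joint-outcome count never grows with the number of levels.
assert set().union(*labels.values()) == set(range(16))
```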
Relaxed Conformant GraphPlan (CGP)
§ Initial proposition layer for each possible state (probabilities 0.4, 0.5, 0.1)
§ Generate a proposition layer for each joint outcome of actions
[Planning graph branches P0–P2 over drive, sample(soil, ), and commun(soil) actions]
CGP-style Planning Graph
§ The planning graph is a tree of deterministic planning graph branches (one per initial state: 0.4, 0.5, 0.1)
Monte Carlo CGP
§ Successive sampled graphs: P(G) = 5/16 = 0.3125, P(G) = 8/16 = 0.5, P(G) = 13/16 = 0.8125
§ Problem: we have multiple planning graphs, which can still be costly
§ Solution: use a labeled planning graph
Monte Carlo LUG (McLUG) – Initial Layer (N = 4)
§ Sample a state for each particle (from the 0.4/0.5/0.1 distribution)
§ Form the initial layer
McLUG for Rover Example [ICAPS 2006]
[LUG P0–P3 over sample(soil, ), drive, and commun(soil) actions, labels shown as particle sets]
§ Sample states for the initial layer – avail(soil, ) not sampled
§ Particles in an action's label must sample the action's outcome
§ ¾ of the particles support the goal; we need at least ½, so it is okay to stop
Monte Carlo LUG (McLUG)
[LUG P0–P3 continued]
§ Support the preconditions for the particles the action supports
§ Must support the goal in all particles: commun covers some particles; persistence covers the other particles by default
Logistics Domain Results [ICAPS 2006]
[Plots: p2-2-2, p2-2-4, p4-2-2 – time (s) and plan length]
§ Scalable, with reasonable quality
Grid Domain Results [ICAPS 2006]
[Plots: Grid(0.8) and Grid(0.5) – time (s) and plan length]
§ Again, good scalability and quality!
§ Need more particles for broad beliefs
Stochastic Planning Conclusions
§ The number of joint action outcomes is too large
§ Sampling outcomes to represent in labels is much faster than an exact representation
§ SMC gives us a good way to use multiple planning graphs for heuristics, and the McLUG helps keep the representation small
Planning with Resources
Planning with Resources
§ Propagating Resource Intervals
§ Relaxed Plans
§ Handling resource subgoals
Rover with Power
[Cost model; resource usage is the same as the costs for this example]
Resource Intervals
§ The resource interval assumes independence among consumers/producers
§ power starts at [25, 25]; recharge produces 25, sample consumes 20, drive consumes 10
§ UB: UB(noop) + recharge = 25 + 25 = 50
§ LB: LB(noop) + sample + drive = 25 + (−20) + (−10) = −5
§ Next level – UB: 50 + 25 = 75; LB: −5 plus all consumers (drive, sample, commun) = −115
§ Applying every producer/consumer at each level: [25, 25] → [−5, 50] → [−115, 75] → [−250, 100] → [−390, 125]
[Leveled planning graph P0–P4 with recharge, sample, drive, and commun actions alongside the rover propositions]
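The interval update is simple once stated: the optimistic upper bound applies every producer on top of the previous upper bound, and the pessimistic lower bound applies every consumer on top of the previous lower bound. A sketch using the recharge/sample/drive deltas above:

```python
# Optimistic resource-interval propagation, assuming producers and
# consumers are independent.
def propagate_interval(lb, ub, deltas):
    """One level: best case stacks all producers, worst case all consumers."""
    new_ub = ub + sum(d for d in deltas if d > 0)
    new_lb = lb + sum(d for d in deltas if d < 0)
    return new_lb, new_ub

# power starts at [25, 25]; recharge +25, sample -20, drive -10
lb, ub = propagate_interval(25, 25, [25, -20, -10])
assert (lb, ub) == (-5, 50)   # matches the [-5, 50] interval on the slide
```

Note how the bounds widen monotonically with each level, which is why the slides warn that upper/lower bounds can get large.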
Relaxed Plan Extraction with Resources
§ Start extraction as before
§ Track "maximum" resource requirements for the actions chosen at each level
§ May need more than one supporter for a resource!!
Results
Results (cont'd)
Planning with Resources Conclusions
§ Resource intervals allow us to be optimistic about reachable values
§ Upper/lower bounds can get large
§ Relaxed plans may require multiple supporters for subgoals
§ Negative interactions are much harder to capture
Temporal Planning
Temporal Planning
§ Temporal Planning Graph
§ From Levels to Time Points
§ Delayed Effects
§ Estimating Makespan
§ Relaxed Plan Extraction
Rover with Durative Actions
Search Through Time-Stamped States
§ Goal satisfaction: S = (P, M, Π, Q, t) ⊨ G if for each <pi, ti> ∈ G either:
§ <pi, tj> ∈ P, tj < ti, and no event in Q deletes pi, or
§ there exists an event e ∈ Q that adds pi at time te < ti
§ Action application: Action A is applicable in S if:
§ All instantaneous preconditions of A are satisfied by P and M
§ A's effects do not interfere with Π and Q
§ No event in Q interferes with persistent preconditions of A
§ A does not lead to concurrent resource change
§ When A is applied to S:
§ P is updated according to A's instantaneous effects
§ Persistent preconditions of A are put in Π
§ Delayed effects of A are put in Q
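The goal-satisfaction test above can be sketched with simplified structures (P as timed facts, Q as timed add/delete events; M and Π omitted). The timed-fact example is hypothetical:

```python
# Goal test for a time-stamped state: each timed goal <p, t> must either
# already hold in P (undeleted by any queued event) or be added by a
# queued event before its time.
def satisfies(P, Q, goals):
    """P: {(prop, time_achieved)}; Q: [(time, adds, deletes)];
    goals: {(prop, time)}."""
    for g, tg in goals:
        holds = (any(p == g and tp < tg for p, tp in P)
                 and not any(g in dels for _t, _adds, dels in Q))
        added = any(te < tg and g in adds for te, adds, _dels in Q)
        if not (holds or added):
            return False
    return True

P = {("at_b", 2.0)}                       # at_b achieved at time 2
Q = [(4.0, {"have_soil"}, set())]         # delayed effect lands at time 4
assert satisfies(P, Q, {("at_b", 5.0), ("have_soil", 5.0)})
assert not satisfies(P, Q, {("comm_soil", 5.0)})
```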
Temporal Planning Graph
§ Record the first time point at which each action/proposition is first reachable
§ Assume the latest start time for actions in the RP
SAPA at IPC-2002 [JAIR 2003]
[Plots: Satellite (complex setting) and Rover (time setting)]
Temporal Planning Conclusions
§ Levels become time points
§ Makespan and plan length/cost are different objectives
§ The set-level heuristic measures makespan
§ Relaxed plans measure makespan and plan cost
Overall Conclusions
§ Relaxed reachability analysis
§ Concentrate strongly on positive interactions and independence by ignoring negative interactions
§ Estimates improve as more negative interactions are accounted for
§ Heuristics can estimate and aggregate costs of goals or find relaxed plans
§ Propagate numeric information to adjust estimates
§ Cost, resources, probability, time
§ Solving hybrid problems is hard
§ Extra approximations
§ Phased relaxation
§ Adjustments/penalties
Why Do We Love PG Heuristics?
§ They work!
§ They are "forgiving"
§ You don't like doing mutex? Okay.
§ You don't like growing the graph all the way? Okay.
§ Allow propagation of many types of information
§ Level, subgoal interaction, time, cost, world support, probability
§ Support phased relaxation
§ E.g., ignore mutexes and resources and bring them back later…
§ Graph structure supports other synergistic uses
§ E.g., action selection
§ Versatility…
Versatility of PG Heuristics
§ PG Variations: Serial, Parallel, Temporal, Labelled
§ Propagation Methods: Level, Mutex, Cost, Label
§ Planning Problems: Classical, Resource/Temporal, Conformant
§ Planners: Regression, Progression, Partial Order, Graphplan-style