Scalable Planning Under Uncertainty
Daniel Bryce
dan.bryce@asu.edu
http://verde.eas.asu.edu
April 20, 2007
Arizona State University
Ph.D. Defense
What is Planning?
§ Find an action sequence to achieve goals
§ E.g., Mars Rover plan to get soil, image, and rock data: (sample soil alpha), (communicate soil), (drive alpha gamma), (sample image gamma), (communicate image), (drive gamma beta), (sample rock beta), (communicate rock)
§ Formulate as: finding a path in the state transition graph G = (V, E)
  § V = states, E = actions
§ Problem: the graph is "huge"
  § |P| state variables means |V| = 2^|P| states
  § Many domains have 100's of variables
[Figure: state transition graph for the rover example; states are built from at, have(soil), have(image), have(rock), comm(soil), comm(image), and comm(rock), and edges are drive, sample, and commun actions]
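The path-finding formulation above can be sketched as a breadth-first search over states. This is only an illustration, not the planner from the dissertation: the action table below is a hypothetical, cut-down rover domain (soil and image only), and the names `ACTIONS`, `successors`, and `bfs_plan` are assumptions introduced here.

```python
from collections import deque

# Hypothetical mini rover domain: name -> (preconditions, adds, deletes).
ACTIONS = {
    "sample(soil, alpha)": ({"at(alpha)"}, {"have(soil)"}, set()),
    "commun(soil)": ({"have(soil)"}, {"comm(soil)"}, set()),
    "drive(alpha, gamma)": ({"at(alpha)"}, {"at(gamma)"}, {"at(alpha)"}),
    "sample(image, gamma)": ({"at(gamma)"}, {"have(image)"}, set()),
    "commun(image)": ({"have(image)"}, {"comm(image)"}, set()),
}


def successors(state):
    """Yield (action, next_state) for every action applicable in state."""
    for name, (pre, add, delete) in ACTIONS.items():
        if pre <= state:
            yield name, frozenset((state - delete) | add)


def bfs_plan(init, goal):
    """Return a shortest action sequence from init to a goal state, or None."""
    start = frozenset(init)
    frontier = deque([start])
    parent = {start: None}  # state -> (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if goal <= state:
            plan = []
            while parent[state] is not None:
                state, action = parent[state]
                plan.append(action)
            return plan[::-1]
        for action, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                frontier.append(nxt)
    return None


plan = bfs_plan({"at(alpha)"}, {"comm(soil)", "comm(image)"})
```

Even in this toy domain the search enumerates states explicitly, which is exactly what stops scaling once |P| reaches the hundreds.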
Why Automated Planning?
§ Manual planning is slow and error prone
§ Autonomous agents must plan for themselves
§ Several applications
  § Manufacturing, Space, Entertainment, Security, Defense, Biology
The Importance of Handling Uncertainty
§ Plan failure is costly
  § $265 Million Mars Pathfinder mission wasted 40-70% of time
§ Uncertainty contributing to failure
  § Incomplete Information
  § Imperfect Actions
  § Noisy Sensors
  § Uncertain Duration Actions
  § Model-Environment Mismatch
§ Handling Failure
  § Re-plan
    § Not always possible (limited CPU, too late, suboptimal)
  § Pre-plan
    § Conformant or Conditional Plans
    § Synthesis is costly (small runtime CPU, preempt failure, better quality)
Background – Planning Under Uncertainty
[Figure: grid of problem classes. Columns: Non-Deterministic (Qualitative), Probabilistic (Quantitative). Rows: Full, None, Partial. The probabilistic cells show example distributions over states and outcomes.]
Search in Belief State Space (Inc. Info)
§ |P|: State Variables
§ 2^|P|: States
§ 2^(2^|P|): Belief States
[Figure: belief-state search tree for the rover example; sample(soil, ·) and drive(·, ·) actions map one set of states (over avail(soil, ·), at(·), have(soil)) to the next]
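A belief state under incomplete information can be represented as a set of states, and an action is applied to every member. The sketch below assumes a hypothetical encoding (actions as lists of conditional effects, locations named alpha, beta, gamma); the helper names are illustrative and are not taken from the slides.

```python
def num_states(num_vars):
    """Number of states over num_vars Boolean state variables."""
    return 2 ** num_vars


def num_belief_states(num_vars):
    """Number of sets of states, i.e. belief states."""
    return 2 ** (2 ** num_vars)


def apply(state, action):
    """Apply every conditional effect whose condition holds in state."""
    adds, deletes = set(), set()
    for condition, add, delete in action:
        if condition <= state:
            adds |= add
            deletes |= delete
    return frozenset((state - deletes) | adds)


def progress(belief, plan):
    """Progress a belief state (a set of states) through a sequence of actions."""
    for action in plan:
        belief = frozenset(apply(state, action) for state in belief)
    return belief


def sample_soil(loc):
    # Hypothetical conditional effect: soil is obtained only if it is here.
    return [({f"at({loc})", f"avail(soil, {loc})"}, {"have(soil)"}, set())]


def drive(src, dst):
    return [({f"at({src})"}, {f"at({dst})"}, {f"at({src})"})]


# The rover is at alpha; soil is available at exactly one unknown location.
initial = frozenset(
    frozenset({"at(alpha)", f"avail(soil, {loc})"})
    for loc in ("alpha", "beta", "gamma")
)
conformant_plan = [
    sample_soil("alpha"), drive("alpha", "beta"),
    sample_soil("beta"), drive("beta", "gamma"),
    sample_soil("gamma"),
]
final = progress(initial, conformant_plan)
```

After the full plan every state in `final` contains have(soil), which is what makes the sequence a conformant plan for this toy problem.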
Scalability In Classical Planning
§ Reachability heuristics have invigorated planning
§ Classical planners find plans with hundreds of actions
§ Realistic encodings of Munich airport!
[Figure: State Space Search (left) compared with Planning Graph Reachability Analysis (right)]
Planning Graph and Search Tree
§ Envelope of Progression Tree (Relaxed Progression)
  § Proposition lists: union of states at kth level
  § Lowerbound reachability information
§ Approximate Logical Inference
[AI Magazine, Spring 2007]
Tutorial at: ICAPS-06 and IJCAI-07
[Figure: progression search tree for the rover example, from s_I through states s11 to s13, s21 to s25, and s31 to s33]
Synopsis of Dissertation
§ Can techniques for scaling classical planning adapt to scale planning under uncertainty?
§ Yes!
Contributions: Heuristic Search
§ Ch. 3: Planning Graph Estimates of BS Distance [ICAPS-04, JAIR 06]
§ Ch. 4: Compressing Planning Graphs [AAAI-WS-04, JAIR 06]
§ Ch. 5: Cost Sensitive Heuristics [UAI-05]
§ Ch. 6: Monte Carlo in Planning Graphs [ICAPS-06*, AIJ]
§ Ch. 7: State Agnostic Planning Graphs [AAAI-05*]
§ Ch. 8: Multi-Objective LAO* [In preparation]
*Nominated for Best Student Paper Award
Multiple Graphs (MG)
§ Build a planning graph for each state in the belief state
§ Extract relaxed plans from each
§ Step-wise union relaxed plans: h = 7
§ COSTLY!!
  § Sampling a subset can help
  § Or, need a better representation
[Figure: three planning graphs (layers P0, A0, P1, A1, P2, A2, P3), one per state in the rover belief state]
Union to Minimize Overlapping Structure
[Figure: the per-state planning graphs (layers P0, A0, P1, A1, P2, A2, P3) overlaid on one another to show their shared propositions and actions]
Labeled Planning Graph (LUG)
§ Labels correspond to sets of states, and are represented as propositional formulas
  § Example label: at( ) ∧ ( avail(soil, ) ∨ avail(soil, ) ) ∧ ¬at( ) ∧ …
§ COMPACT!!
[Figure: a single labeled planning graph (layers P0, A0, P1, A1, P2, A2, P3) for the rover example]
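The labeling idea can be sketched with explicit sets of world indices standing in for the propositional formulas. The sketch assumes a relaxed (delete-free) version of the rover domain and the usual propagation rule, where an action's label is the conjunction of its preconditions' labels and a proposition's label is the disjunction of its supporters' labels. All names below are illustrative, and a real implementation would use a compact formula representation rather than Python sets.

```python
# Relaxed actions: name -> (preconditions, add effects); deletes are ignored.
ACTIONS = {
    "sample(soil, alpha)": ({"at(alpha)", "avail(soil, alpha)"}, {"have(soil)"}),
    "sample(soil, beta)": ({"at(beta)", "avail(soil, beta)"}, {"have(soil)"}),
    "sample(soil, gamma)": ({"at(gamma)", "avail(soil, gamma)"}, {"have(soil)"}),
    "drive(alpha, beta)": ({"at(alpha)"}, {"at(beta)"}),
    "drive(alpha, gamma)": ({"at(alpha)"}, {"at(gamma)"}),
    "commun(soil)": ({"have(soil)"}, {"comm(soil)"}),
}


def next_layer(labels, worlds):
    """Build the next proposition layer from the current labels."""
    new_labels = {prop: set(label) for prop, label in labels.items()}
    for pre, add in ACTIONS.values():
        # Action label: worlds in which every precondition is reachable.
        action_label = set(worlds)
        for prop in pre:
            action_label &= labels.get(prop, set())
        # Proposition label: union over all supporting actions.
        for prop in add:
            new_labels.setdefault(prop, set()).update(action_label)
    return new_labels


def first_level_covering(goal, labels, worlds, max_levels=10):
    """First level at which the goal's label covers every world, else None."""
    for level in range(max_levels + 1):
        if labels.get(goal, set()) >= set(worlds):
            return level
        labels = next_layer(labels, worlds)
    return None


# One world per possible soil location; the rover starts at alpha in all.
WORLDS = {0, 1, 2}
P0 = {
    "at(alpha)": {0, 1, 2},
    "avail(soil, alpha)": {0},
    "avail(soil, beta)": {1},
    "avail(soil, gamma)": {2},
}
level = first_level_covering("comm(soil)", P0, WORLDS)
```

One graph with labels replaces one graph per state, which is where the compactness claimed on the slide comes from.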
Comparison of Planning Graph Types [JAIR, 2006]
§ LUG saves time
§ Sampling more worlds increases cost of MG
§ Sampling more worlds improves effectiveness of LUG
[Plot: Total Time (hours) vs. % States Sampled for Heuristic]
Propagating Cost on Planning Graphs
§ Classical
  § Actions: Aggregate Preconditions
    § Action A with preconditions p (cost 10), q (cost 15), r (cost 3)
    § Max: max(10, 15, 3) = 15
    § Sum: 10 + 15 + 3 = 28
  § Propositions: Min Cost Supporter
    § Proposition p with supporters A, B, C: min(6 + c(A), 9 + c(B), 20 + c(C)) = 6 + c(A)
§ Labeled
  § Actions: Aggregate Preconditions
  § Propositions: Min Cost Supporter
  § Too Many Costs
[Figure: labeled planning graph fragment in which each proposition and action carries a separate cost per set of worlds]
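The classical propagation rules reduce to two small functions, sketched here with the numbers from the slide. The execution costs c(A), c(B), c(C) are not given in the source, so unit costs are assumed purely for illustration, and the function names are hypothetical.

```python
def action_cost(precondition_costs, aggregate):
    """Cost to reach an action: aggregate the costs of its preconditions."""
    return aggregate(precondition_costs)


def proposition_cost(supporters):
    """Cost to reach a proposition: its cheapest supporter.

    supporters is a list of (cost to reach the action, cost to execute it).
    """
    return min(reach + execute for reach, execute in supporters)


# Action A with preconditions p, q, r at costs 10, 15, 3 (from the slide).
max_cost = action_cost([10, 15, 3], max)
sum_cost = action_cost([10, 15, 3], sum)

# Supporters A, B, C reached at costs 6, 9, 20; unit execution costs are an
# assumption, under which A remains the minimum-cost supporter.
p_cost = proposition_cost([(6, 1), (9, 1), (20, 1)])
```

Max gives an admissible-style lower bound, while sum assumes the preconditions are achieved independently; the choice of aggregate is the design decision here.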
Grouped Costs
§ Aggregating Groups (Set Cover)
  § Actions: Aggregate Preconditions
  § Propositions: Min Cost Supporter
§ Defining Groups (based on Label Change)
  § New worlds are grouped at each level
[Figure: propositions p, q, r and actions A, B, C annotated with one cost per group of worlds]
Relaxed Plan Extraction
§ Coverage: pick action that covers most new worlds
§ Cost: pick action that covers a new world at least cost
§ Residual cost decreases
[Figure: proposition p with supporting actions A, B, C under each selection rule]
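The two selection rules can be read as two greedy covering strategies. The sketch below is one simple interpretation, not the extraction procedure from the dissertation; the supporter table (worlds covered and cost per action) is made-up data, and `extract_supporters` is a hypothetical name.

```python
def extract_supporters(worlds, supporters, rule):
    """Greedily choose supporters until every world of the label is covered.

    supporters maps an action name to (worlds it covers, its cost).
    rule is "coverage" (most new worlds) or "cost" (cheapest useful action).
    """
    uncovered = set(worlds)
    chosen = []
    while uncovered:
        useful = [name for name, (covers, _) in supporters.items()
                  if covers & uncovered]
        if not useful:
            raise ValueError("some worlds cannot be covered")
        if rule == "coverage":
            best = max(useful, key=lambda n: len(supporters[n][0] & uncovered))
        else:
            best = min(useful, key=lambda n: supporters[n][1])
        chosen.append(best)
        uncovered -= supporters[best][0]
    return chosen


# Hypothetical supporters of a proposition p whose label has four worlds.
SUPPORTERS = {
    "A": ({1, 2, 3}, 10),
    "B": ({1, 2}, 2),
    "C": ({3, 4}, 3),
}
by_coverage = extract_supporters({1, 2, 3, 4}, SUPPORTERS, "coverage")
by_cost = extract_supporters({1, 2, 3, 4}, SUPPORTERS, "cost")
```

On this data the coverage rule picks A then C (total cost 13) while the cost rule picks B then C (total cost 5), which illustrates why the two rules can yield different relaxed plans.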
The Medical Specialist
§ Cost gives better plans, with good run time
[Plots: Average Path Cost and Total Time for Low Cost Sensors and High Cost Sensors, comparing the Cost and Coverage heuristics]
Search in Probabilistic Belief State Space
§ A belief state is a probability distribution over states
[Figure: probabilistic belief-state search for the rover example; the initial belief assigns probabilities 0.4, 0.5, and 0.1 to three states, and sample(soil, ·) and drive(·, ·) redistribute the mass (e.g. 0.04 / 0.36 and 0.05 / 0.45)]
An Abstract View (CGP-style)
§ Planning graph is a tree of deterministic planning graph branches
§ Wait! Exact probabilistic inference with approximate logical inference?! For a heuristic?
[Figure: initial distribution 0.4, 0.5, 0.1; actions A1 and A2 branch on outcomes o1(A1), o2(A1), o1(A2), o2(A2) over propositions p, q, g]
Monte Carlo
§ Pr(G) = 5/16 = 0.3125
§ Pr(G) = 8/16 = 0.5
§ Pr(G) = 13/16 = 0.8125
§ Problem: have multiple planning graphs, which can still be costly
§ Solution: use labeled planning graph
[Figure: sampled planning graphs drawn from the initial distribution 0.4, 0.5, 0.1]
Monte Carlo LUG (McLUG) -- Initial Layer
§ N = 4
§ Pick a state for each sample
§ Form initial layer
[Figure: four states sampled from the belief state with probabilities 0.4, 0.5, 0.1, each over avail(soil, ·) and at(·)]
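Forming the initial layer can be sketched as weighted sampling followed by labeling each proposition with the particles (sample indices) whose state contains it. The belief state below reuses the 0.4 / 0.5 / 0.1 distribution from the slide, but which state gets which probability, the location names, and the function name are assumptions made for illustration.

```python
import random


def sample_initial_layer(belief, n, rng):
    """Draw n particles from the belief state and label the initial layer.

    belief maps each state (a frozenset of propositions) to its probability.
    Returns the particles and a dict: proposition -> set of particle indices.
    """
    states = list(belief)
    weights = [belief[s] for s in states]
    particles = rng.choices(states, weights=weights, k=n)
    layer = {}
    for index, state in enumerate(particles):
        for prop in state:
            layer.setdefault(prop, set()).add(index)
    return particles, layer


# Assumed assignment of the slide's probabilities to soil locations.
BELIEF = {
    frozenset({"at(alpha)", "avail(soil, alpha)"}): 0.4,
    frozenset({"at(alpha)", "avail(soil, beta)"}): 0.5,
    frozenset({"at(alpha)", "avail(soil, gamma)"}): 0.1,
}
particles, layer = sample_initial_layer(BELIEF, 4, random.Random(0))
```

Labels are now sets of particle indices rather than sets of states, so the label machinery of the LUG can be reused unchanged.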
McLUG for Rover example [ICAPS 2006]
§ Each sample must simulate action outcome
§ ¼ of particles support goal, need at least ½
§ ¾ of particles support goal, okay to stop
[Figure: McLUG layers P0, A0, P1, A1, P2, A2, P3 for the rover example]
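The stopping test on this slide amounts to comparing the fraction of particles that support every goal against a threshold. The following is a minimal sketch under the assumption that labels are sets of particle indices; the layer contents are made up to mirror the ¼ and ¾ cases, and the function names are hypothetical.

```python
def goal_support(layer, goals, n):
    """Fraction of the n particles whose label supports every goal."""
    supporting = set(range(n))
    for goal in goals:
        supporting &= layer.get(goal, set())
    return len(supporting) / n


def can_stop(layer, goals, n, threshold):
    """Stop growing the graph once enough particles support the goal."""
    return goal_support(layer, goals, n) >= threshold


# Hypothetical labels for comm(soil) at two successive levels, N = 4.
earlier_layer = {"comm(soil)": {0}}
later_layer = {"comm(soil)": {0, 1, 3}}
stop_early = can_stop(earlier_layer, ["comm(soil)"], 4, 0.5)
stop_late = can_stop(later_layer, ["comm(soil)"], 4, 0.5)
```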
Grid Domain Results [ICAPS 2006]
§ Good scalability and quality!
§ Need more particles when more stochastic
§ Previous state of the art (based on CSP) and POMDP algorithms are not competitive
[Plots: Total Time (s) and Plan Length vs. Pr(Goal), for Grid(0.8) and Grid(0.5)]
Example Plan
[Figure: planning graphs (layers P0, A0, P1, A1, P2, A2, P3) for the rover image task, built from at(·), have(image), comm(image) and the actions drive, sample(image, ·), commun(image)]
Classical Planning Results
[Plot: comparison of SAG and PG, with regions marked "SAG better" and "PG better"]
Non-Deterministic Planning
[Figure: planning graphs (layers P0, A0, P1, A1, P2, A2, P3) for the rover image task starting from a belief state over at(·) and have(image)]
Fifth International Planning Competition
§ Versatile, competitive heuristics
Stochastic Planning
[Figure: planning graphs (layers P0, A0, P1, A1, P2, A2, P3) for the rover image task with probabilistic action outcomes: 0.1 / 0.9 and 0.01 / 0.99 on commun(image), drive(·, ·), and sample(image, ·)]
Stochastic Planning -- McSAG
§ Sample N planning graphs for each state to create a pool of planning graphs
§ Each belief state samples from the pool
§ S: states, N: samples per search node, A: actions, k: levels
§ O(SNA^k) samples drawn to build McSAG
[Figure: search tree over belief states with outcome probabilities 0.1 / 0.9 and 0.01 / 0.99]
Common Sample SAG
§ kth sample of each state uses the same action outcomes
§ N: samples per search node, A: actions, k: layers
§ O(NA^k) samples
[Figure: a single planning graph (layers P0, A0, P1, A1, P2, A2, P3) shared across states]
CSSAG vs. McLUG – Total Time (s)
§ Greatly improved scalability
[Plot: total time in seconds, McLUG vs. CSSAG]
Multi-Objective Probabilistic Planning
§ Pr(G | sub-plan 2) = 0.5, E[Cost(sub-plan 2)] = 10
§ Pr(G | sub-plan 3) = 0.75, E[Cost(sub-plan 3)] = 30
§ Pr(G | sub-plan 4) = 0.0, E[Cost(sub-plan 4)] = 0
§ Not obvious which sub-plan is better here; it depends on the other branch
[Figure: action a with outcome probabilities 0.2 and 0.8 leading to sub-plans 1, 2 and 3, 4; the combined plans a,[1|3], a,[2|3], a,[1|4], a,[2|4] are plotted as Pr(G) vs. E[Cost]]
Multi-objective Dynamic Programming
§ Pareto set: $J(b) = \{\, q(b,a) \mid \neg\exists\, q'(b,a') \in J(b) : q'(b,a') \prec q(b,a) \,\}$
§ Dominance: $q'(b,a') \prec q(b,a) \Leftrightarrow \forall i\; q'_i(b,a') \le q_i(b,a) \,\wedge\, \exists i\; q'_i(b,a') < q_i(b,a)$
§ Expected Cost
  § $q_0(b,a) = c(a) + \sum_{o} T(b,a,o,b_a^o)\, q'_0(b_a^o, a')$
  § $q_0(b,\bot) = 0$
§ Pr(G)
  § $q_1(b,a) = \sum_{o} T(b,a,o,b_a^o)\, q'_1(b_a^o, a')$
  § $q_1(b,\bot) = \sum_{s \in b \,:\, G \subseteq s} b(s)$
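The dominance test and the two backups can be sketched in a few lines. This is an illustration, not the dissertation's implementation: points are stored as (expected cost, 1 − Pr(G)) so that both objectives are minimized, and the example numbers (outcome probabilities 0.8 and 0.2, children with Pr(G) of 0.7 and 0.5, unit action costs) are assumptions chosen so that the backup yields the 0.66 probability and cost 2 quoted in the search example.

```python
def dominates(q1, q2):
    """q1 dominates q2: no worse in every objective, strictly better in one."""
    return (all(a <= b for a, b in zip(q1, q2))
            and any(a < b for a, b in zip(q1, q2)))


def pareto_set(points):
    """Keep only the points that no other point dominates."""
    return [q for q in points if not any(dominates(p, q) for p in points)]


def backup(action_cost, outcomes):
    """One-step backup of (expected cost, Pr(G)) through an action.

    outcomes is a list of (probability, (child expected cost, child Pr(G))).
    """
    cost = action_cost + sum(p * child[0] for p, child in outcomes)
    pr_goal = sum(p * child[1] for p, child in outcomes)
    return cost, pr_goal


def minimized(point):
    """Convert (cost, Pr(G)) to (cost, 1 - Pr(G)) so both are minimized."""
    cost, pr_goal = point
    return cost, 1.0 - pr_goal


# Assumed numbers: two children, each reached after one unit-cost action.
cost, pr_goal = backup(1, [(0.8, (1, 0.7)), (0.2, (1, 0.5))])

# A small set of (cost, 1 - Pr(G)) points; two of them are dominated.
front = pareto_set([(2, 0.34), (1, 1.0), (3, 0.34), (2, 0.5)])
```

Keeping a set of non-dominated points per belief state, rather than a single value, is what distinguishes this from ordinary single-objective dynamic programming.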
Search Example -- Initially
§ Initialize root Pareto set with null plan and heuristic estimate
[Figure: root node with an empty Pr(G) vs. C plot]
Search Example – 1st Expansion
§ Expand root node and initialize Pareto sets of children with null plan and heuristic estimate
[Figure: root node with actions a1 and a2; each node carries a Pr(G) vs. C plot]
Search Example – 1st Revision
§ Recompute Pareto set for root; the best heuristic point is through a1
[Figure: root Pareto set now labeled with a1]
Search Example – 2nd Expansion
§ Expand children of a1 and initialize their Pareto sets with null plan and heuristic estimate
§ Both children satisfy the goal with non-zero probability
[Figure: the children of a1, reached with probabilities 0.8 and 0.2, have Pr(G) of 0.7 and 0.5; actions a4 and a3 are applicable there]
Search Example – 2nd Revision
§ Recompute Pareto set of both expanded nodes and the root node
§ There is a feasible plan a1, [a4|a3] that satisfies the goal with 0.66 probability and cost 2
§ The heuristic estimate indicates extending a1, [a4|a3] will lead to a plan that satisfies the goal with 1.0 probability
[Figure: root Pareto set labeled with a1, [a4|a3]]
Search Example – 3rd Expansion
§ Expand plan to include a7
§ There is no applicable action after a3
[Figure: new node after a7 with Pr(G) of 0.9]
Search Example – 3rd Revision
§ Recompute all Pareto sets that are ancestors of expanded nodes
§ Heuristic for plans extended through a3 is higher because of no applicable action
§ Heuristic at root node changes to plans extended through a2
[Figure: root Pareto set labeled with a2; a1, [a4, a7|a3]; and a1, [a4|a3]]
Search Example – 4th Expansion
§ Expand plan through a2; one expanded child satisfies the goal with 0.1 probability
[Figure: children of a2 with actions a5 and a6]
Search Example – 4th Revision
§ Recompute Pareto sets of expanded ancestors
§ Plan a2, a5 is dominated at the root
[Figure: root Pareto set labeled with a2, a6; a1, [a4, a7|a3]; a1, [a4|a3]; and the dominated a2, a5]
Search Example – 5th Expansion
§ Expand plan through a6
[Figure: new node after a8 with Pr(G) of 0.6; root Pareto set labeled with a2, a6; a1, [a4, a7|a3]; and a1, [a4|a3]]
Search Example – 5th Revision
§ Recompute Pareto sets
§ Plans a2, a6, a8 and a2, a5 are dominated at root
[Figure: root Pareto set labeled with a1, [a4, a7|a3] and a1, [a4|a3], plus the dominated a2, a6, a8 and a2, a5]
Search Example – Final
[Figure: final search graph; the root Pr(G) vs. C plot is labeled with a2, a6, a8; a1, [a4, a7|a3]; and a1, [a4|a3]]
Speed-ups
§ ε-domination [Papadimitriou & Yannakakis, 2003]
  § $q'(b,a') \prec_\varepsilon q(b,a) \Leftrightarrow \forall i\; q'_i(b,a') \le (1+\varepsilon)\, q_i(b,a)$
§ Randomized Node Expansions
  § Simulate partial plan to expand a single node
§ Reachability Heuristics
  § Use the CSSAG
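The ε-domination test above can be sketched directly from its definition. The pruning loop around it is an assumption (a simple one-pass filter over sorted points, not necessarily how the dissertation maintains its Pareto sets), and the points are made-up (cost, 1 − Pr(G)) pairs, both minimized.

```python
def eps_dominates(q1, q2, eps):
    """q1 eps-dominates q2: q1 is within a factor (1 + eps) in every objective."""
    return all(a <= (1 + eps) * b for a, b in zip(q1, q2))


def eps_pareto_set(points, eps):
    """One-pass filter: keep a point only if no kept point eps-dominates it."""
    kept = []
    for q in sorted(points):
        if not any(eps_dominates(p, q, eps) for p in kept):
            kept.append(q)
    return kept


# Hypothetical (expected cost, 1 - Pr(G)) points.
POINTS = [(10, 0.5), (9, 0.6), (30, 0.25)]
exact = eps_pareto_set(POINTS, 0.0)
coarse = eps_pareto_set(POINTS, 0.25)
```

Larger ε merges nearby points, so each Pareto set stays smaller and each revision is cheaper, at the price of a coarser trade-off curve.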
ε-domination
§ Check domination: multiply each objective by (1 + ε)
§ Each hyper-rectangle has a single point
§ y/x = 1 + ε
[Plot: 1 − Pr(G) vs. Cost, with non-dominated and dominated points marked]
ε-domination
§ Planning time shrinks with increase in epsilon
§ Heterogeneous branch lengths improve expected cost, if possible
With Heuristics
§ Heuristics help scale
§ Plans are non-trivial
§ Randomized expansions help
Limited Contingency Planning
§ Contingencies
  § $q_2(b,a) = \sum_{o} q'_2(b_a^o, a')$
  § $q_2(b,\bot) = 1$
Summary & Future Work
§ Summary
  § Planning Graph Heuristics in Belief Space
    § Distance Measures: Uniform Cost, Non-Uniform Cost
    § Heuristic Estimates of Distance Measures: Non-Deterministic, Probabilistic
    § Efficient Computation of Heuristics: Label Propagation, Grouped Costs, Monte Carlo, State Agnosticism
  § Multi-objective Heuristic Search in Belief Space
§ Future Work
  § Planning and Execution
  § Incomplete Domain Theory
  § Observations in Reachability Heuristics
  § Decision Theoretic Heuristics
  § Diverse Plan Synthesis
Selected Relevant Publications
§ D. Bryce, S. Kambhampati, and D. E. Smith. “Sequential Monte Carlo in Probabilistic Planning Reachability Heuristics”, Artificial Intelligence (accepted conditionally), 2007.
§ D. Bryce, S. Kambhampati, and D. E. Smith. “Planning Graph Heuristics for Belief Space Search”, Journal of Artificial Intelligence Research, Volume 26, pages 35-99, 2006.
§ D. Bryce and S. Kambhampati. “A Tutorial on Planning Graph Based Reachability Heuristics”, AI Magazine, Spring 2007.
§ D. Bryce, S. Kambhampati, and D. E. Smith. “Sequential Monte Carlo in Probabilistic Planning Reachability Heuristics”, In the Proceedings of the 16th International Conference on Automated Planning and Scheduling (ICAPS-06), 2006. (Short-listed for Best Paper Award)
§ W. Cushing and D. Bryce. “State Agnostic Planning Graphs and their application to belief-space planning”, In the Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), 2005. (Short-listed for Best Paper Award)
§ D. Bryce and S. Kambhampati. “Cost Sensitive Reachability Heuristics for Handling State Uncertainty”, In the Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI-05), 2005.
§ D. Bryce and S. Kambhampati. “Heuristic Guidance Measures for Conformant Planning”, In the Proceedings of the 14th International Conference on Automated Planning and Scheduling (ICAPS-04), 2004.
§ D. Bryce, W. Cushing, and S. Kambhampati. “Probabilistic Planning is Multi-objective!”, Submitted to the Proceedings of the 17th International Conference on Automated Planning and Scheduling (ICAPS-07), 2007.
Extras
Future Work: Planning and Execution
§ Need to Plan-to-Plan
§ Allocating planning time requires an estimate of the effort needed
§ Planning Graph Heuristics should be able to help
§ Heuristics can also help decide what type of plan to generate
  § Conformant
  § Conditional
  § k-Contingency
Future Work: Incomplete Domain Theory
§ Previously, incompleteness was only in state knowledge
§ Action descriptions can also be incomplete
  § Unknown preconditions
  § Unknown outcomes
  § Unknown effects
§ Belief Space includes beliefs over actions [Cheng & Amir, 06]
§ Heuristics must include the cost of experimentation needed to learn enough about action descriptions
Future Work: Observations in Heuristics
§ Described Conformant reachability heuristics
§ Many of the actions could be divided into branches, given the right observations
§ Could post-process relaxed plans to insert observations
  § Determine which actions go into which branches
  § Compute expected cost for the heuristic
Future Work: Decision Theoretic Heuristics
§ Estimate the best plan via action costs and goal reward
§ Requires deciding the right probability with which to satisfy the goal
§ Connected with Partial Satisfaction Planning
§ Planning graph must propagate cost and probability information, which mutually interact
§ May need multi-objective propagation
Future Work: Diverse Plans
§ Diverse plans optimize different objectives
§ Equidistant Pareto optimal solutions are diverse
§ Focus Heuristic Search toward diverse solutions
§ Increase heuristic estimates of partial plans lying near complete plans
[Figure: two solution sets, labeled "Diverse" and "Not Diverse"]
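The notion of Pareto optimal plans used above can be illustrated with a minimal sketch. It assumes each plan is summarized by two objectives, Pr(G) (to maximize) and E[Cost] (to minimize). The names `Plan`, `dominates`, and `pareto_front` are hypothetical and not from the thesis. The values for `p2`, `p3`, and `p4` are the ones shown on the Contributions slide; `p5` is an invented dominated plan added only to show the filter at work.

```python
from typing import List, Tuple

# A plan summary: (name, Pr(G), E[Cost]). This representation is an assumption.
Plan = Tuple[str, float, float]


def dominates(a: Plan, b: Plan) -> bool:
    """Return True if plan a is no worse than b on both objectives
    (higher Pr(G), lower E[Cost]) and strictly better on at least one."""
    _, prob_a, cost_a = a
    _, prob_b, cost_b = b
    no_worse = prob_a >= prob_b and cost_a <= cost_b
    strictly_better = prob_a > prob_b or cost_a < cost_b
    return no_worse and strictly_better


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


plans = [
    ("p2", 0.5, 10.0),   # from the slides: Pr(G) = 0.5,  E[Cost] = 10
    ("p3", 0.75, 30.0),  # from the slides: Pr(G) = 0.75, E[Cost] = 30
    ("p4", 0.0, 0.0),    # from the slides: Pr(G) = 0.0,  E[Cost] = 0
    ("p5", 0.5, 20.0),   # hypothetical: same Pr(G) as p2 but more expensive
]
front = pareto_front(plans)
print([name for name, _, _ in front])
```

Here `p5` is filtered out because `p2` reaches the goal with the same probability at lower expected cost, while `p2`, `p3`, and `p4` are mutually non-dominated trade-offs. A diversity criterion such as the equidistance mentioned above would then be applied to this front; that step is not sketched here.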
Anytime Behavior
Conformant Rover Problem (Incomplete Information)
[Figure: rover map with waypoints g and b; the location of the Microscopic Soil Sample is unknown (marked "?")]
Classical Planning Rover Domain
[Figure: rover map with waypoints g and b, showing the Pancam Image Sample, Microscopic Soil Sample, and Rock Core Sample. Images courtesy: Cornell]
Stochastic Rover Example
[Figure: rover map with waypoints g and b; the Microscopic Soil Sample is annotated with probabilities 0.1, 0.4, 0.9, and 0.5]
Conformant Rover Problem w/ Costs
[Figure: rover map with waypoints g and b; the Microscopic Soil Sample location is unknown (marked "?") and the map is annotated with costs ranging from 5 to 30]
Classical Planning Rover Domain
[Figure: rover map with waypoint g, showing the Pancam Image Sample and Rock Core Sample. Images courtesy: Cornell]
Conformant Rover Domain
[Figure: rover map with waypoint g, showing the Pancam Image Sample and Rock Core Sample. Images courtesy: Cornell]
Stochastic Rover Domain
[Figure: rover map with waypoint g, showing the Pancam Image Sample and Rock Core Sample, annotated with probability 0.9. Images courtesy: Cornell]
Single Objective Belief Space Search
§ Pr(G) > 0 terminal
  § Ideal: Pr(G) = 0.5, E[Cost] = 1.5
  § LAO*: Pr(G) = 0.5, E[Cost] = 2.0
§ Pr(G) = 1 terminal
  § Ideal: Pr(G) = 0.5, E[Cost] = 1.5
  § LAO*: Pr(G) = 0.0, E[Cost] = 1 (No Applicable Action)
§ Using node properties to ensure plan feasibility is incomplete
[Figure: belief-state search graphs over states s1, s2, s3, and sG for the two terminal tests]
Uncertain Actions in the Planning Graph
§ Initial proposition layer for each possible state
§ Generate a proposition layer for each joint outcome of actions
[Figure: planning graph with layers P0, A0, P1, A1, P2 for the rover example; the initial layers are weighted 0.4, 0.5, and 0.1]
Planning Under Uncertainty
§ Classical Planning
  § A* Search in State Space
  § PSPACE-Complete
§ Conformant Planning
  § A* Search in Belief State Space
  § Non-Deterministic: EXPSPACE-Complete
  § Probabilistic: Undecidable
§ Conditional Planning
  § LAO* Search in Belief State Space
  § Non-Deterministic: 2-EXP-Complete
  § Probabilistic: Undecidable
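To make "search in belief state space" concrete, the following is a minimal sketch of non-deterministic conformant planning on a toy version of the rover problem. Several things here are assumptions rather than the thesis's method: a belief state is modeled as a set of possible states, the waypoint names `alpha`, `beta`, `gamma` are taken from the rover example, the state encoding and the helper names (`actions`, `progress`, `conformant_plan`) are hypothetical, and plain breadth-first search stands in for the heuristic A* search named on the slide.

```python
from collections import deque
from itertools import permutations

LOCS = ("alpha", "beta", "gamma")

# A state is (rover location, soil location, have_soil). Assumed encoding.


def actions():
    """Build a dict: action name -> (precondition, effect), both functions of a state."""
    acts = {}
    for x, y in permutations(LOCS, 2):
        acts[f"drive({x},{y})"] = (
            lambda s, x=x: s[0] == x,        # rover must be at x
            lambda s, y=y: (y, s[1], s[2]),  # rover moves to y
        )
    for x in LOCS:
        acts[f"sample(soil,{x})"] = (
            lambda s, x=x: s[0] == x,        # rover must be at x
            # conditional effect: soil is obtained only if it is actually at x
            lambda s, x=x: (s[0], s[1], s[2] or s[1] == x),
        )
    return acts


def progress(belief, pre, eff):
    """Apply an action to every state in the belief.
    Returns None if the action is not applicable in some possible state."""
    if not all(pre(s) for s in belief):
        return None
    return frozenset(eff(s) for s in belief)


def conformant_plan(init, goal):
    """Breadth-first search over belief states; the goal must hold in every state."""
    acts = actions()
    frontier = deque([(init, [])])
    seen = {init}
    while frontier:
        belief, plan = frontier.popleft()
        if all(goal(s) for s in belief):
            return plan
        for name, (pre, eff) in acts.items():
            nxt = progress(belief, pre, eff)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [name]))
    return None


# The rover starts at alpha; the soil could be at any of the three waypoints.
init = frozenset(("alpha", loc, False) for loc in LOCS)
plan = conformant_plan(init, lambda s: s[2])
print(plan)
```

Because the rover cannot observe where the soil is, the plan has to sample at every waypoint. With n state variables there are up to 2^n states and therefore up to 2^(2^n) such belief states, which is why uninformed search like this does not scale and heuristic guidance is needed.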
Cost Propagation on LUG
§ Assign costs to groups of worlds
[Figure: labeled planning graph with layers P0, A0, E0, P1, A1, E1, P2, A2, E2, P3 for the rover example; each proposition, action, and effect is annotated with the costs propagated for its groups of worlds]