
Automated Planning

Planning – The planning problem – Planning with state-space search – Partial-order planning – Planning graphs – Planning with propositional logic – Analysis of planning approaches

What is Planning? Generate sequences of actions to perform tasks and achieve objectives. – States, actions and goals Search for a solution over an abstract space of plans. Classical planning environment: fully observable, deterministic, finite, static and discrete. Assists humans in practical applications – design and manufacturing – military operations – games – space exploration

Difficulty of real-world problems Assume a problem-solving agent using some search method … – Which actions are relevant? – Exhaustive search vs. backward search – What is a good heuristic function? – A good estimate of the cost from a state to the goal? – Problem-dependent vs. problem-independent – How to decompose the problem? – Most real-world problems are nearly decomposable.

Planning language What is a good language? – Expressive enough to describe a wide variety of problems. – Restrictive enough to allow efficient algorithms to operate on it. – Planning algorithm should be able to take advantage of the logical structure of the problem. STRIPS and ADL

General language features Representation of states – Decompose the world into logical conditions and represent a state as a conjunction of positive literals. – Propositional literals: Poor ∧ Unknown – FO literals (ground and function-free): At(Plane1, Melbourne) ∧ At(Plane2, Sydney) – Closed-world assumption Representation of goals – A partially specified state, represented as a conjunction of positive ground literals – A goal is satisfied if the state contains all the literals in the goal.

General language features Representation of actions – Action = PRECOND + EFFECT Action(Fly(p, from, to), PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to) EFFECT: ¬At(p, from) ∧ At(p, to)) = action schema (p, from, to need to be instantiated) – Action name and parameter list – Precondition (conjunction of function-free literals) – Effect (conjunction of function-free literals; P means P becomes true, ¬P means P becomes false) – Add list vs. delete list in the effect
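For concreteness, a minimal sketch (written for these notes, not part of the original slides; class and field names are invented) of how such an action schema could be represented and instantiated:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ActionSchema:
        name: str
        params: tuple        # variable names, e.g. ('p', 'from', 'to')
        precond: tuple       # literals over the parameters, e.g. ('At', 'p', 'from')
        add: tuple           # positive effects (add list)
        delete: tuple        # effects that become false (delete list)

        def ground(self, binding):
            """Instantiate the schema with a dict such as {'p': 'P1', 'from': 'SFO', 'to': 'JFK'}."""
            sub = lambda lit: (lit[0],) + tuple(binding.get(a, a) for a in lit[1:])
            args = ', '.join(binding[v] for v in self.params)
            return (self.name + '(' + args + ')',
                    tuple(map(sub, self.precond)),
                    tuple(map(sub, self.add)),
                    tuple(map(sub, self.delete)))

    fly = ActionSchema(
        name='Fly', params=('p', 'from', 'to'),
        precond=(('At', 'p', 'from'), ('Plane', 'p'), ('Airport', 'from'), ('Airport', 'to')),
        add=(('At', 'p', 'to'),),
        delete=(('At', 'p', 'from'),))

    print(fly.ground({'p': 'P1', 'from': 'SFO', 'to': 'JFK'}))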

Language semantics? How do actions affect states? – An action is applicable in any state that satisfies the precondition. – For an FO action schema, applicability involves a substitution θ for the variables in the PRECOND. At(P1, JFK) ∧ At(P2, SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO) satisfies At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to) with θ = {p/P1, from/JFK, to/SFO} Thus the action is applicable.

Language semantics? The result of executing action a in state s is the state s’ – s’ is the same as s except – any positive literal P in the effect of a is added to s’ – any negative literal ¬P is removed from s’ EFFECT: ¬At(p, from) ∧ At(p, to): At(P1, SFO) ∧ At(P2, SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO) – STRIPS assumption (avoids the representational frame problem): every literal NOT mentioned in the effect remains unchanged
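A short illustrative sketch (names invented for these notes) of the two operations just described: an action is applicable when its preconditions are a subset of the state, and the successor state is (s − delete list) ∪ add list.

    # Ground actions as tuples: (name, preconditions, add list, delete list), each a set of literals.
    fly_p1 = ('Fly(P1, SFO, JFK)',
              {('At', 'P1', 'SFO'), ('Plane', 'P1'), ('Airport', 'SFO'), ('Airport', 'JFK')},
              {('At', 'P1', 'JFK')},
              {('At', 'P1', 'SFO')})

    def applicable(state, action):
        _, precond, _, _ = action
        return precond <= state                  # every precondition literal holds in the state

    def result(state, action):
        _, _, add, delete = action
        return (state - delete) | add            # STRIPS assumption: everything else persists

    s0 = frozenset({('At', 'P1', 'SFO'), ('At', 'P2', 'JFK'),
                    ('Plane', 'P1'), ('Plane', 'P2'),
                    ('Airport', 'SFO'), ('Airport', 'JFK')})
    if applicable(s0, fly_p1):
        print(result(s0, fly_p1))                # P1 is now at JFK, no longer at SFO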

Expressiveness and extensions STRIPS is simplified – Important limit: function-free literals – Allows a propositional representation – Function symbols lead to infinitely many states and actions Recent extension: Action Description Language (ADL) Action(Fly(p: Plane, from: Airport, to: Airport), PRECOND: At(p, from) ∧ (from ≠ to) EFFECT: ¬At(p, from) ∧ At(p, to)) Standardization: Planning Domain Definition Language (PDDL)

Example: air cargo transport Init(At(C1, SFO) ∧ At(C2, JFK) ∧ At(P1, SFO) ∧ At(P2, JFK) ∧ Cargo(C1) ∧ Cargo(C2) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO)) Goal(At(C1, JFK) ∧ At(C2, SFO)) Action(Load(c, p, a) PRECOND: At(c, a) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a) EFFECT: ¬At(c, a) ∧ In(c, p)) Action(Unload(c, p, a) PRECOND: In(c, p) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a) EFFECT: At(c, a) ∧ ¬In(c, p)) Action(Fly(p, from, to) PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to) EFFECT: ¬At(p, from) ∧ At(p, to)) Solution: [Load(C1, P1, SFO), Fly(P1, SFO, JFK), Unload(C1, P1, JFK), Load(C2, P2, JFK), Fly(P2, JFK, SFO), Unload(C2, P2, SFO)]

Example: Spare tire problem Init(At(Flat, Axle) ∧ At(Spare, Trunk)) Goal(At(Spare, Axle)) Action(Remove(Spare, Trunk) PRECOND: At(Spare, Trunk) EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground)) Action(Remove(Flat, Axle) PRECOND: At(Flat, Axle) EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground)) Action(PutOn(Spare, Axle) PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle) EFFECT: At(Spare, Axle) ∧ ¬At(Spare, Ground)) Action(LeaveOvernight PRECOND: EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle)) This example goes beyond STRIPS: negative literal in a precondition (ADL description)

Example: Blocks world Init(On(A, Table) ∧ On(B, Table) ∧ On(C, Table) ∧ Block(A) ∧ Block(B) ∧ Block(C) ∧ Clear(A) ∧ Clear(B) ∧ Clear(C)) Goal(On(A, B) ∧ On(B, C)) Action(Move(b, x, y) PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b) ∧ (b ≠ x) ∧ (b ≠ y) ∧ (x ≠ y) EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)) Action(MoveToTable(b, x) PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x) EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)) Spurious actions are possible: Move(B, C, C)

Planning with state-space search Both forward and backward search possible Progression planners – forward state-space search – Consider the effect of all possible actions in a given state Regression planners – backward state-space search – To achieve a goal, what must have been true in the previous state.

Progression and regression

Progression algorithm Formulation as a state-space search problem: – Initial state = initial state of the planning problem – Literals not appearing are false – Actions = those whose preconditions are satisfied – Add positive effects, delete negative ones – Goal test = does the state satisfy the goal? – Step cost = each action costs 1 With no function symbols, any graph search that is complete is a complete planning algorithm. – E.g. A* Inefficient: – (1) irrelevant action problem – (2) a good heuristic is required for efficient search
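A compact illustration (written for these notes, not from the slides) of a progression planner: breadth-first search over the states reached by applicable ground actions. The domain is a simplified spare-tire problem in which PutOn's negative precondition ¬At(Flat, Axle) is dropped so everything stays within pure STRIPS.

    from collections import deque

    # Ground actions: name -> (preconditions, add list, delete list), all sets of literals.
    ACTIONS = {
        'Remove(Spare, Trunk)': ({'At(Spare, Trunk)'}, {'At(Spare, Ground)'}, {'At(Spare, Trunk)'}),
        'Remove(Flat, Axle)':   ({'At(Flat, Axle)'},   {'At(Flat, Ground)'},  {'At(Flat, Axle)'}),
        'PutOn(Spare, Axle)':   ({'At(Spare, Ground)'},{'At(Spare, Axle)'},   {'At(Spare, Ground)'}),
    }

    def progression_search(init, goal):
        """Breadth-first forward search; returns a list of action names or None."""
        frontier = deque([(frozenset(init), [])])
        explored = {frozenset(init)}
        while frontier:
            state, plan = frontier.popleft()
            if goal <= state:                          # goal test: all goal literals hold
                return plan
            for name, (pre, add, dele) in ACTIONS.items():
                if pre <= state:                       # action is applicable
                    nxt = frozenset((state - dele) | add)
                    if nxt not in explored:
                        explored.add(nxt)
                        frontier.append((nxt, plan + [name]))
        return None

    print(progression_search({'At(Flat, Axle)', 'At(Spare, Trunk)'}, {'At(Spare, Axle)'}))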

Regression algorithm How to determine predecessors? – What are the states from which applying a given action leads to the goal? Goal state = At(C1, B) ∧ At(C2, B) ∧ … ∧ At(C20, B) Relevant action for the first conjunct: Unload(C1, p, B) It works only if its preconditions are satisfied. Previous state = In(C1, p) ∧ At(p, B) ∧ At(C2, B) ∧ … ∧ At(C20, B) The subgoal At(C1, B) should not be present in this state. Actions must not undo desired literals (consistency) Main advantage: only relevant actions are considered. – Often a much lower branching factor than forward search.

Regression algorithm General process for predecessor construction – Given a goal description G – Let A be an action that is relevant and consistent – The predecessor is constructed as follows: – Any positive effects of A that appear in G are deleted. – Each precondition literal of A is added, unless it already appears. Any standard search algorithm can be used to perform the search. Terminate when the predecessor is satisfied by the initial state. – In the FO case, satisfaction might require a substitution.
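A minimal sketch (invented for these notes) of the predecessor construction just described: an action is relevant if it achieves some goal literal, consistent if it deletes none, and the regressed goal is (G − effects of A) ∪ preconditions of A.

    def relevant(goal, action):
        """The action achieves at least one goal literal."""
        _, _, add, dele = action
        return bool(add & goal)

    def consistent(goal, action):
        """The action does not undo any desired literal."""
        _, _, add, dele = action
        return not (dele & goal)

    def regress(goal, action):
        """Predecessor goal description: drop achieved literals, add preconditions."""
        _, pre, add, dele = action
        return (goal - add) | pre

    unload = ('Unload(C1, p, B)',
              {'In(C1, p)', 'At(p, B)'},     # preconditions (static type literals omitted)
              {'At(C1, B)'},                 # add list
              {'In(C1, p)'})                 # delete list

    goal = {'At(C1, B)', 'At(C2, B)'}
    if relevant(goal, unload) and consistent(goal, unload):
        print(regress(goal, unload))         # In(C1, p), At(p, B), At(C2, B)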

Heuristics for state-space search Neither progression nor regression is very efficient without a good heuristic. – How many actions are needed to achieve the goal? – The exact answer is NP-hard; find a good estimate Two approaches to finding an admissible heuristic: – The optimal solution to a relaxed problem. – Remove all preconditions from actions – The subgoal independence assumption: the cost of solving a conjunction of subgoals is approximated by the sum of the costs of solving the subproblems independently.
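For concreteness, one simple estimate in this spirit (illustrative only, not from the slides): count the goal literals not yet satisfied. It follows the subgoal-independence idea and can overestimate when a single action achieves several subgoals, so it is not admissible in general.

    def unsatisfied_goals_heuristic(state, goal):
        """Number of goal literals not yet true in the state."""
        return len(goal - state)

    state = {'At(Spare, Ground)', 'At(Flat, Axle)'}
    goal  = {'At(Spare, Axle)', 'At(Flat, Ground)'}
    print(unsatisfied_goals_heuristic(state, goal))   # 2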

Partial-order planning Progression and regression planning search through totally ordered plans. – They cannot take advantage of problem decomposition. – Decisions must be made on how to sequence actions across all the subproblems Least-commitment strategy: – Delay ordering choices during search

Shoe example Goal(RightShoeOn ∧ LeftShoeOn) Init() Action(RightShoe, PRECOND: RightSockOn EFFECT: RightShoeOn) Action(RightSock, PRECOND: EFFECT: RightSockOn) Action(LeftShoe, PRECOND: LeftSockOn EFFECT: LeftShoeOn) Action(LeftSock, PRECOND: EFFECT: LeftSockOn) Planner: combine two action sequences (1) LeftSock, LeftShoe (2) RightSock, RightShoe

Partial-order planning (POP) Any planning algorithm that can place two actions into a plan without specifying which comes first is a partial-order (PO) planner.

POP as a search problem States are (mostly unfinished) plans. – The empty plan contains only the Start and Finish actions. Each plan has 4 components: – A set of actions (the steps of the plan) – A set of ordering constraints: A < B (A before B) – Cycles represent contradictions. – A set of causal links A →p B (A achieves precondition p for B) – The plan may not be extended by adding a new action C that conflicts with a causal link (if an effect of C is ¬p and C could come after A and before B) – A set of open preconditions – A precondition is open if it is not achieved by any action in the plan.
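A small illustrative sketch (invented names, not the slides' own notation) of the four plan components, with a check that the orderings are acyclic and that no preconditions remain open (threat resolution is omitted for brevity).

    from dataclasses import dataclass, field

    @dataclass
    class PartialPlan:
        actions: set = field(default_factory=set)        # step names
        orderings: set = field(default_factory=set)      # pairs (A, B) meaning A < B
        causal_links: set = field(default_factory=set)   # triples (A, p, B): A achieves p for B
        open_preconds: set = field(default_factory=set)  # pairs (p, B): B still needs p

        def orderings_acyclic(self):
            """Detect cycles in the ordering constraints (simple DFS)."""
            succ = {}
            for a, b in self.orderings:
                succ.setdefault(a, set()).add(b)
            def reachable(x, target, seen):
                if x == target:
                    return True
                seen.add(x)
                return any(reachable(y, target, seen)
                           for y in succ.get(x, ()) if y not in seen)
            return not any(reachable(b, a, set()) for a, b in self.orderings)

        def is_solution(self):
            return self.orderings_acyclic() and not self.open_preconds

    plan = PartialPlan(
        actions={'Start', 'RightSock', 'RightShoe', 'LeftSock', 'LeftShoe', 'Finish'},
        orderings={('RightSock', 'RightShoe'), ('LeftSock', 'LeftShoe')},
        causal_links={('RightSock', 'RightSockOn', 'RightShoe'),
                      ('LeftSock', 'LeftSockOn', 'LeftShoe'),
                      ('RightShoe', 'RightShoeOn', 'Finish'),
                      ('LeftShoe', 'LeftShoeOn', 'Finish')},
        open_preconds=set())
    print(plan.is_solution())    # True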

Example of a final plan Actions = {RightSock, RightShoe, LeftSock, LeftShoe, Start, Finish} Orderings = {RightSock < RightShoe; LeftSock < LeftShoe} Links = {RightSock →RightSockOn→ RightShoe, LeftSock →LeftSockOn→ LeftShoe, RightShoe →RightShoeOn→ Finish, …} Open preconditions = {}

POP as a search problem A plan is consistent iff there are no cycles in the ordering constraints and no conflicts with the causal links. A consistent plan with no open preconditions is a solution. A partial order plan is executed by repeatedly choosing any of the possible next actions. – This flexibility is a benefit in non-cooperative environments.

Solving POP Assume propositional planning problems: – The initial plan contains Start and Finish, the ordering constraint Start < Finish, no causal links, all the preconditions in Finish are open. – Successor function : – picks one open precondition p on an action B and – generates a successor plan for every possible consistent way of choosing action A that achieves p. – Test goal

Enforcing consistency When generating a successor plan: – The causal link A →p B and the ordering constraint A < B are added to the plan. – If A is new, also add Start < A and A < Finish to the plan – Resolve conflicts between the new causal link and all existing actions – Resolve conflicts between action A (if new) and all existing causal links.

Process summary Operators on partial plans – Add a link from an existing action to an open precondition. – Add a step to fulfill an open condition. – Order one step w.r.t. another to remove possible conflicts Gradually move from incomplete/vague plans to complete/correct plans Backtrack if an open condition is unachievable or if a conflict is irresolvable.

Example: Spare tire problem Init(At(Flat, Axle) ∧ At(Spare, Trunk)) Goal(At(Spare, Axle)) Action(Remove(Spare, Trunk) PRECOND: At(Spare, Trunk) EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground)) Action(Remove(Flat, Axle) PRECOND: At(Flat, Axle) EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground)) Action(PutOn(Spare, Axle) PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle) EFFECT: At(Spare, Axle) ∧ ¬At(Spare, Ground)) Action(LeaveOvernight PRECOND: EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))

Solving the problem Initial plan: Start with EFFECTS and Finish with PRECOND.

Solving the problem Initial plan: Start with EFFECTS and Finish with PRECOND. Pick an open precondition: At(Spare, Axle) Only PutOn(Spare, Axle) is applicable Add causal link: PutOn(Spare, Axle) →At(Spare, Axle)→ Finish Add constraint: PutOn(Spare, Axle) < Finish

Solving the problem Pick an open precondition: At(Spare, Ground) Only Remove(Spare, Trunk) is applicable Add causal link: Remove(Spare, Trunk) →At(Spare, Ground)→ PutOn(Spare, Axle) Add constraint: Remove(Spare, Trunk) < PutOn(Spare, Axle)

Solving the problem Pick an open precondition: ¬At(Flat, Axle) LeaveOvernight is applicable Conflict: LeaveOvernight also has the effect ¬At(Spare, Ground) To resolve, add constraint: LeaveOvernight < Remove(Spare, Trunk)

Solving the problem Pick an open precondition: At(Spare, Ground) LeaveOvernight is applicable Conflict: to resolve, add constraint: LeaveOvernight < Remove(Spare, Trunk) Add causal link:

Solving the problem Pick an open precondition: At(Spare, Trunk) Only Start is applicable Add causal link: Conflict: the causal link conflicts with the effect ¬At(Spare, Trunk) of LeaveOvernight – No re-ordering resolves it: backtrack

Solving the problem Remove LeaveOvernight, Remove(Spare, Trunk) and their causal links Repeat the step with Remove(Spare, Trunk) Also add Remove(Flat, Axle) and finish

Some details … What happens when a first-order representation that includes variables is used? – It complicates the process of detecting and resolving conflicts. – Can be resolved by introducing inequality constraints. The CSP most-constrained-variable heuristic can be used by planning algorithms to select which open PRECOND to work on.

Planning graphs Used to achieve better heuristic estimates. – A solution can also be extracted directly using GRAPHPLAN. Consists of a sequence of levels that correspond to time steps in the plan. – Level 0 is the initial state. – Each level consists of a set of literals and a set of actions. – Literals = all those that could be true at that time step, depending on the actions executed at the preceding time step. – Actions = all those actions that could have their preconditions satisfied at that time step, depending on which of the literals actually hold.

Planning graphs “Could”? – The graph records only a restricted subset of the possible negative interactions among actions. Planning graphs work only for propositional problems. Example: Init(Have(Cake)) Goal(Have(Cake) ∧ Eaten(Cake)) Action(Eat(Cake), PRECOND: Have(Cake) EFFECT: ¬Have(Cake) ∧ Eaten(Cake)) Action(Bake(Cake), PRECOND: ¬Have(Cake) EFFECT: Have(Cake))

Cake example Start at level S0 and determine action level A0 and the next level S1. – A0 contains all actions whose preconditions are satisfied in the previous level. – Connect the preconditions and effects of actions between S0 and S1 – Inaction is represented by persistence actions. Level A0 contains the actions that could occur – Conflicts between actions are represented by mutex links
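The sketch below (written for these notes, with invented names) expands one level of a planning graph for the cake example: the action level holds every applicable action plus one persistence action per literal, and the next literal level is the union of their effects. Negated literals are treated as ordinary literals, as in the planning graph itself; mutex links are not computed here.

    # Literals are plain strings; a negated literal is written with a leading '¬'.
    S0 = {'Have(Cake)', '¬Eaten(Cake)'}

    # name -> (preconditions, effects), both sets of literals
    ACTIONS = {
        'Eat(Cake)':  ({'Have(Cake)'},  {'¬Have(Cake)', 'Eaten(Cake)'}),
        'Bake(Cake)': ({'¬Have(Cake)'}, {'Have(Cake)'}),
    }

    def expand_level(literals, actions):
        """One planning-graph step: action level A_i and literal level S_(i+1)."""
        level_actions = {}
        for name, (pre, eff) in actions.items():
            if pre <= literals:                      # preconditions present at this level
                level_actions[name] = (pre, eff)
        for lit in literals:                         # persistence ("no-op") actions
            level_actions['Persist(' + lit + ')'] = ({lit}, {lit})
        next_literals = set()
        for _, eff in level_actions.values():
            next_literals |= eff
        return level_actions, next_literals

    A0, S1 = expand_level(S0, ACTIONS)
    print(sorted(A0))     # Eat(Cake) plus the persistence actions (Bake is not yet applicable)
    print(sorted(S1))     # Have(Cake), ¬Have(Cake), Eaten(Cake), ¬Eaten(Cake)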

Cake example Level S1 contains all literals that could result from picking any subset of actions in A0 – Conflicts between literals that cannot occur together (as a consequence of the action selection) are represented by mutex links. – S1 defines multiple possible states, and the mutex links are the constraints that define this set of states. Continue until two consecutive levels are identical: the graph has leveled off – Or contain the same number of literals (explanation follows later)

Cake example A mutex relation holds between two actions when: – Inconsistent effects: one action negates an effect of the other. – Interference: one of the effects of one action is the negation of a precondition of the other. – Competing needs: one of the preconditions of one action is mutually exclusive with a precondition of the other. A mutex relation holds between two literals (inconsistent support) when: – one is the negation of the other, OR – every possible pair of actions that could achieve the two literals is mutex.
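An illustrative check for the three action-mutex conditions above (simplified and invented for these notes; literals are strings with a leading ¬ for negation, and the competing-needs test takes the literal mutexes of the previous level as input).

    def neg(lit):
        return lit[1:] if lit.startswith('¬') else '¬' + lit

    def actions_mutex(a1, a2, literal_mutexes):
        """a1, a2 = (preconditions, effects); literal_mutexes = set of frozensets of size 2."""
        pre1, eff1 = a1
        pre2, eff2 = a2
        # Inconsistent effects: one action negates an effect of the other.
        if any(neg(e) in eff2 for e in eff1):
            return True
        # Interference: an effect of one negates a precondition of the other.
        if any(neg(e) in pre2 for e in eff1) or any(neg(e) in pre1 for e in eff2):
            return True
        # Competing needs: some pair of preconditions is mutex at the previous level.
        if any(frozenset((p, q)) in literal_mutexes for p in pre1 for q in pre2):
            return True
        return False

    eat     = ({'Have(Cake)'}, {'¬Have(Cake)', 'Eaten(Cake)'})
    persist = ({'Have(Cake)'}, {'Have(Cake)'})
    print(actions_mutex(eat, persist, set()))   # True: inconsistent effects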

PG and heuristic estimation PGs provide information about the problem – A literal that does not appear in the final level of the graph cannot be achieved by any plan. – Useful for backward search (cost = ∞). – The level of first appearance can be used as a cost estimate for achieving a goal literal = level cost. – Small problem: several actions can occur at each level – Restrict to one action per level using a serial PG (add mutex links between every pair of actions, except persistence actions). – Cost of a conjunction of goals? Max-level, sum-level and set-level heuristics. The PG is a relaxed problem.

The GRAPHPLAN Algorithm How to extract a solution directly from the PG:
function GRAPHPLAN(problem) returns solution or failure
  graph ← INITIAL-PLANNING-GRAPH(problem)
  goals ← GOALS[problem]
  loop do
    if goals are all non-mutex in the last level of graph then do
      solution ← EXTRACT-SOLUTION(graph, goals, LENGTH(graph))
      if solution ≠ failure then return solution
      else if NO-SOLUTION-POSSIBLE(graph) then return failure
    graph ← EXPAND-GRAPH(graph, problem)

Example: Spare tire problem Init(At(Flat, Axle) ∧ At(Spare, Trunk)) Goal(At(Spare, Axle)) Action(Remove(Spare, Trunk) PRECOND: At(Spare, Trunk) EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground)) Action(Remove(Flat, Axle) PRECOND: At(Flat, Axle) EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground)) Action(PutOn(Spare, Axle) PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle) EFFECT: At(Spare, Axle) ∧ ¬At(Spare, Ground)) Action(LeaveOvernight PRECOND: EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle)) This example goes beyond STRIPS: negative literal in a precondition (ADL description)

GRAPHPLAN example Initially the graph consists of 5 literals (S0): those from the initial state plus the CWA (negative) literals. EXPAND-GRAPH adds the actions whose preconditions are satisfied (A0) Also add persistence actions and mutex relations. Add the effects at level S1 Repeat until the goal appears in a level Si

GRAPHPLAN example EXPAND-GRAPH also looks for mutex relations – Inconsistent effects – E.g. Remove(Spare, Trunk) and LeaveOvernight, due to At(Spare, Ground) and ¬At(Spare, Ground) – Interference – E.g. Remove(Flat, Axle) and LeaveOvernight: At(Flat, Axle) as PRECOND and ¬At(Flat, Axle) as EFFECT – Competing needs – E.g. PutOn(Spare, Axle) and Remove(Flat, Axle), due to At(Flat, Axle) and ¬At(Flat, Axle) – Inconsistent support – E.g. in S2, At(Spare, Axle) and At(Flat, Axle)

GRAPHPLAN example In S2, the goal literals exist and are not mutex with any others – A solution might exist, and EXTRACT-SOLUTION will try to find it EXTRACT-SOLUTION can solve the problem as a Boolean CSP or as a search process: – Initial state = the last level of the PG plus the goals of the planning problem – Actions = select any set of non-conflicting actions that cover the goals in the state – Goal = reach level S0 such that all goals are satisfied – Cost = 1 for each action.

GRAPHPLAN example Termination? YES. PGs grow monotonically: – Literals increase monotonically – Actions increase monotonically – Mutexes decrease monotonically Because of these properties, and because there is a finite number of actions and literals, every PG will eventually level off!

Planning with propositional logic Planning can be done by proving theorems in situation calculus. Here: test the satisfiability of a logical sentence. The sentence contains propositions for every action occurrence. – A model will assign true to the actions that are part of a correct plan and false to the others – An assignment that corresponds to an incorrect plan will not be a model, because it is inconsistent with the assertion that the goal is true. – If the planning problem is unsolvable, the sentence will be unsatisfiable.

SATPLAN algorithm
function SATPLAN(problem, Tmax) returns solution or failure
  inputs: problem, a planning problem
          Tmax, an upper limit to the plan length
  for T = 0 to Tmax do
    cnf, mapping ← TRANSLATE-TO-SAT(problem, T)
    assignment ← SAT-SOLVER(cnf)
    if assignment is not null then return EXTRACT-SOLUTION(assignment, mapping)
  return failure

cnf, mapping ← TRANSLATE-TO-SAT(problem, T) Distinct propositions for assertions about each time step. – Superscripts denote the time step: At(P1, SFO)^0 ∧ At(P2, JFK)^0 – No CWA, thus specify which propositions are not true: ¬At(P1, JFK)^0 ∧ ¬At(P2, SFO)^0 – Unknown propositions are left unspecified. The goal is associated with a particular time step – But which one?

cnf, mapping ← TRANSLATE-TO-SAT(problem, T) How to determine the time step where the goal will be reached? – Start at T = 0 – Assert the goal at that step: At(P1, JFK)^0 ∧ At(P2, SFO)^0 – Failure … try T = 1 – Assert At(P1, JFK)^1 ∧ At(P2, SFO)^1 – … – Repeat this until some minimal plan length is reached. – Termination is ensured by Tmax

cnf, mapping ← TRANSLATE-TO-SAT(problem, T) How to encode actions in PL? – Propositional versions of successor-state axioms: At(P1, JFK)^1 ⇔ (At(P1, JFK)^0 ∧ ¬(Fly(P1, JFK, SFO)^0 ∧ At(P1, JFK)^0)) ∨ (Fly(P1, SFO, JFK)^0 ∧ At(P1, SFO)^0) – Such an axiom is required for each plane, airport and time step – If there are more airports, additional disjuncts (one per way of travelling there) are required Once all these axioms are in place, the satisfiability algorithm can start to find a plan.

assignment ← SAT-SOLVER(cnf) Multiple models can be found They are NOT all satisfactory: (for T = 1) Fly(P1, SFO, JFK)^0 ∧ Fly(P1, JFK, SFO)^0 ∧ Fly(P2, JFK, SFO)^0 The second action is infeasible Yet the assignment IS a model of the sentence Avoiding illegal actions: precondition axioms, e.g. Fly(P1, JFK, SFO)^0 ⇒ At(P1, JFK)^0 Exactly one model now satisfies all the axioms in which the goal is achieved at T = 1.

assignment ← SAT-SOLVER(cnf) A plane can fly to two destinations at once Such models are NOT satisfactory: (for T = 1) Fly(P1, SFO, JFK)^0 ∧ Fly(P2, JFK, SFO)^0 ∧ Fly(P2, JFK, LAX)^0 P2 flies to two destinations at once, yet the assignment IS a model of the sentence Avoid spurious solutions: action-exclusion axioms ¬(Fly(P2, JFK, SFO)^0 ∧ Fly(P2, JFK, LAX)^0) Prevents simultaneous actions Loss of flexibility, since the plan becomes totally ordered: no actions are allowed to occur at the same time. – Better: restrict exclusion to actions with conflicting preconditions

Analysis of planning approaches Planning is an area of great interest within AI – Search for a solution – Constructively prove the existence of a solution The biggest problem is the combinatorial explosion in states. Efficient methods are under research – E.g. divide-and-conquer

Planning in the real world

Outline Time, schedules and resources Hierarchical task network planning Non-deterministic domains – Conditional planning – Execution monitoring and replanning – Continuous planning Multi-agent planning

Time, schedules and resources Until now: – what actions to do Real world: – actions occur at certain moments in time. – actions have a beginning and an end. – actions take a certain amount of time. Job-shop scheduling: – Complete a set of jobs, each of which consists of a sequence of actions, – where each action has a given duration and might require resources. – Determine a schedule that minimizes the total time required to complete all jobs (respecting resource constraints).

Car construction example Init(Chassis(C1) ∧ Chassis(C2) ∧ Engine(E1, C1, 30) ∧ Engine(E2, C2, 60) ∧ Wheels(W1, C1, 30) ∧ Wheels(W2, C2, 15)) Goal(Done(C1) ∧ Done(C2)) Action(AddEngine(e, c, m) PRECOND: Engine(e, c, d) ∧ Chassis(c) ∧ ¬EngineIn(c) EFFECT: EngineIn(c) ∧ Duration(d)) Action(AddWheels(w, c) PRECOND: Wheels(w, c, d) ∧ Chassis(c) EFFECT: WheelsOn(c) ∧ Duration(d)) Action(Inspect(c) PRECOND: EngineIn(c) ∧ WheelsOn(c) ∧ Chassis(c) EFFECT: Done(c) ∧ Duration(10))

Solution found by POP (figure: the schedule, with the critical path highlighted and a slack of 15 on the non-critical actions)

Planning vs. scheduling How does the problem differ from a standard planning problem? When does an action start and when does it end? – So next to the ordering (planning), duration is also considered: Duration(d) The critical path method is used to determine start and end times: – Path = linear sequence from start to end – Critical path = the path with the longest total duration – It determines the duration of the entire plan – Actions on the critical path must be executed without delay

ES and LS Earliest possible (ES) and latest possible (LS) start times. LS − ES = the slack of an action; computing ES and LS for all actions determines the schedule for the entire problem. ES(Start) = 0 ES(B) = max over A<B of [ES(A) + Duration(A)] LS(Finish) = ES(Finish) LS(A) = min over A<B of [LS(B) − Duration(A)] Complexity is O(Nb) (given a partial order), with N actions and branching factor b
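A small sketch (written for these notes) of these recurrences on the car example's partial order; the durations and orderings come from the slides, the action names are informal. Running it reproduces the 85-minute schedule with a slack of 15 on the C1 chain.

    DUR = {'Start': 0, 'AddEngine1': 30, 'AddWheels1': 30, 'Inspect1': 10,
           'AddEngine2': 60, 'AddWheels2': 15, 'Inspect2': 10, 'Finish': 0}

    # A < B pairs (the partial order of the plan)
    ORDER = [('Start', 'AddEngine1'), ('AddEngine1', 'AddWheels1'), ('AddWheels1', 'Inspect1'),
             ('Start', 'AddEngine2'), ('AddEngine2', 'AddWheels2'), ('AddWheels2', 'Inspect2'),
             ('Inspect1', 'Finish'), ('Inspect2', 'Finish')]

    preds = {a: [] for a in DUR}
    succs = {a: [] for a in DUR}
    for a, b in ORDER:
        preds[b].append(a)
        succs[a].append(b)

    ES, LS = {}, {}

    def es(a):
        """ES(Start) = 0; ES(B) = max over A<B of ES(A) + Duration(A)."""
        if a not in ES:
            ES[a] = max((es(p) + DUR[p] for p in preds[a]), default=0)
        return ES[a]

    def ls(a):
        """LS(Finish) = ES(Finish); LS(A) = min over A<B of LS(B) - Duration(A)."""
        if a not in LS:
            LS[a] = es(a) if not succs[a] else min(ls(b) for b in succs[a]) - DUR[a]
        return LS[a]

    for action in DUR:
        print(action, 'ES =', es(action), 'LS =', ls(action), 'slack =', ls(action) - es(action))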

Scheduling with resources Resource constraints = material or objects required to perform the task – Reusable resources – A resource that is occupied during an action but becomes available again when the action is finished. – Requires an extension of the action syntax: RESOURCE: R(k) – k units of the resource are required by the action. – Their availability is a prerequisite before the action can be performed. – Those k units cannot be used by other actions for the duration of this action.

Car example with resources Init(Chassis(C1) ∧ Chassis(C2) ∧ Engine(E1, C1, 30) ∧ Engine(E2, C2, 60) ∧ Wheels(W1, C1, 30) ∧ Wheels(W2, C2, 15) ∧ EngineHoists(1) ∧ WheelStations(1) ∧ Inspectors(2)) Goal(Done(C1) ∧ Done(C2)) Action(AddEngine(e, c, m) PRECOND: Engine(e, c, d) ∧ Chassis(c) ∧ ¬EngineIn(c) EFFECT: EngineIn(c) ∧ Duration(d), RESOURCE: EngineHoists(1)) Action(AddWheels(w, c) PRECOND: Wheels(w, c, d) ∧ Chassis(c) EFFECT: WheelsOn(c) ∧ Duration(d), RESOURCE: WheelStations(1)) Action(Inspect(c) PRECOND: EngineIn(c) ∧ WheelsOn(c) ∧ Chassis(c) EFFECT: Done(c) ∧ Duration(10), RESOURCE: Inspectors(1))

Car example with resources

Scheduling with resources Aggregation = group individual objects into quantities when the objects are indistinguishable with respect to their purpose. – Reduces complexity Resource constraints make scheduling problems more complicated. – Additional interactions among actions Heuristic: minimum-slack algorithm – Select the action with all predecessors scheduled that has the least slack, and schedule it at its earliest possible start.

Hierarchical task network planning Reduce complexity through hierarchical decomposition – At each level of the hierarchy a computational task is reduced to a small number of activities at the next lower level. – The computational cost of arranging these activities is low. Hierarchical task network (HTN) planning uses a refinement of actions through decomposition. – E.g. building a house = getting a permit + hiring a contractor + doing the construction + paying the contractor. – Refined until only primitive actions remain. Pure and hybrid HTN planning.

Representation of decompositions General descriptions are stored in a plan library. – Each method = Decompose(a, d); a = action and d = PO plan. See the build-house example. The Start action supplies all preconditions of actions not supplied by other actions = external preconditions The Finish action has all effects of actions not consumed by other actions = external effects – Primary effects (used to achieve the goal) vs. secondary effects

Buildhouse example (figure: the decomposition of BuildHouse, with its external preconditions and external effects marked)

Buildhouse example Action(BuyLand, PRECOND: Money, EFFECT: Land ∧ ¬Money) Action(GetLoan, PRECOND: GoodCredit, EFFECT: Money ∧ Mortgage) Action(BuildHouse, PRECOND: Land, EFFECT: House) Action(GetPermit, PRECOND: Land, EFFECT: Permit) Action(HireBuilder, EFFECT: Contract) Action(Construction, PRECOND: Permit ∧ Contract, EFFECT: HouseBuilt ∧ ¬Permit) Action(PayBuilder, PRECOND: Money ∧ HouseBuilt, EFFECT: ¬Money ∧ House ∧ ¬Contract) Decompose(BuildHouse, Plan: STEPS: {S1: GetPermit, S2: HireBuilder, S3: Construction, S4: PayBuilder} ORDERINGS: {Start < S1 < S3 < S4 < Finish, Start < S2 < S3} LINKS: …)
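An illustrative sketch (invented structure, not the slides' own notation) of how such a decomposition method could be stored in a plan library and used to refine a non-primitive task into primitive steps.

    # A method maps a non-primitive action to (labelled sub-steps, ordering constraints).
    PLAN_LIBRARY = {
        'BuildHouse': (
            {'S1': 'GetPermit', 'S2': 'HireBuilder', 'S3': 'Construction', 'S4': 'PayBuilder'},
            [('S1', 'S3'), ('S2', 'S3'), ('S3', 'S4')],
        ),
    }

    PRIMITIVE = {'GetPermit', 'HireBuilder', 'Construction', 'PayBuilder', 'BuyLand', 'GetLoan'}

    def decompose(task):
        """Recursively refine a task until only primitive actions remain (orderings kept implicit)."""
        if task in PRIMITIVE:
            return [task]
        steps, _orderings = PLAN_LIBRARY[task]
        result = []
        for label in sorted(steps):      # a real planner would follow a topological order of the steps
            result.extend(decompose(steps[label]))
        return result

    print(decompose('BuildHouse'))   # ['GetPermit', 'HireBuilder', 'Construction', 'PayBuilder']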

Properties of decomposition Should be a correct implementation of action a – Correct if plan d is a complete and consistent PO plan for the problem of achieving the effects of a given the preconditions of a. A decomposition is not necessarily unique. Performs information hiding: – The STRIPS action description of the higher-level action hides some preconditions and effects – Ignores all internal effects of the decomposition – Does not specify the intervals inside the activity during which preconditions and effects must hold. Information hiding is essential to HTN planning.

Recapitulation of POP (1) Assume propositional planning problems: – The initial plan contains Start and Finish, the ordering constraint Start < Finish, no causal links, all the preconditions in Finish are open. – Successor function: – picks one open precondition p on an action B and – generates a successor plan for every possible consistent way of choosing action A that achieves p. – Test goal

Recapitulation of POP (2) When generating a successor plan: – The causal link A →p B and the ordering constraint A < B are added to the plan. – If A is new, also add Start < A and A < Finish to the plan – Resolve conflicts between the new causal link and all existing actions – Resolve conflicts between action A (if new) and all existing causal links.

Adapting POP to HTN planning Remember POP? – Modify the successor function: apply decomposition to the current plan NEW successor function: – Select a non-primitive action a’ in P – For any Decompose(a, d) method in the library where a and a’ unify with substitution θ – Replace a’ with d’ = SUBST(θ, d)

POP+HTN example a’ AI 1 Pag. 77

POP+HTN example a’ d AI 1 Pag. 78

How to hook up d in a’? Remove action a’ from P and replace with d – For each step s in d’ select an action that will play the role of s (either new s or existing s’ from P) – Possibility of subtask sharing Connect ordering steps for a’ to the steps in d’ – Put all constraints so that constraints of the form B < a’ are maintained. – Watch out for too strict orderings ! Connect the causal links – If B -p-> a’ is a causal link in P, replace it by a set of causal links from B to all steps in d’ with preconditions p that were supplied by the start step – Idem for a’ -p-> C AI 1 Pag. 79

What about HTN? Additional modifications to POP are necessary BAD news: pure HTN planning is undecidable due to recursive decomposition actions. – Walk = make one step and Walk Resolve problems by – ruling out recursion, – bounding the length of relevant solutions, or – hybridizing HTN with POP Yet HTN can be efficient (see the motivations in the book)

The Gift of the Magi

Non-deterministic domains So far: fully observable, static and deterministic domains. – The agent can plan first and then execute the plan with its eyes closed Uncertain environments: incomplete information (partially observable and/or nondeterministic) and incorrect information (differences between world and model) – Use percepts – Adapt the plan when necessary Degree of uncertainty defined by indeterminacy – Bounded: actions can have unpredictable effects, yet the possible effects can be listed in the action description axioms. – Unbounded: preconditions and effects are unknown or too large to enumerate.

Handling indeterminacy Sensorless planning (conformant planning) – Find a plan that achieves the goal in all possible circumstances (regardless of the initial state and the action effects). Conditional planning (contingency planning) – Construct a conditional plan with different branches for the possible contingencies. Execution monitoring and replanning – While executing the plan, judge whether it requires revision. Continuous planning – The planner is active for a lifetime: adapt to changed circumstances and reformulate goals if necessary.

Sensorless planning

Abstract example Initial state = <chair, table, cans of paint, unknown colors>; goal state = <color(table) = color(chair)> Sensorless planning (conformant planning) – Open any can of paint and apply it to both chair and table. Conditional planning (contingency planning) – Sense the color of table and chair; if they are the same then finish, else sense the labels of the paint cans; if color(label) = color(furniture) then apply that color to the other piece, else apply it to both Execution monitoring and replanning – Same as conditional planning, and can fix errors (missed spots) Continuous planning – Can revise the goal, e.g. when we want to eat first before painting the table and the chair.

Conditional planning Deal with uncertainty by checking the environment to see what is really happening. Used in fully observable and nondeterministic environments: – The outcome of an action is unknown. – Conditional steps will check the state of the environment. – How to construct a conditional plan?

Example: the vacuum world

Conditional planning Actions: Left, Right, Suck Propositions to define states: AtL, AtR, CleanL, CleanR How to include indeterminism? – Actions can have more than one effect – E.g. moving left sometimes fails: Action(Left, PRECOND: AtR, EFFECT: AtL) becomes Action(Left, PRECOND: AtR, EFFECT: AtL ∨ AtR) – Actions can have conditional effects: Action(Left, PRECOND: AtR, EFFECT: AtL ∨ (AtL ∧ when CleanL: ¬CleanL)) Both disjunctive and conditional effects can be combined

Conditional planning Conditional plans require conditional steps: – if <test> then plan_A else plan_B e.g. if AtL ∧ CleanL then Right else Suck – Plans become trees Games against nature: – Find conditional plans that work regardless of which action outcomes actually occur. – Assume the vacuum world Initial state = AtR ∧ CleanL ∧ CleanR Double Murphy: the vacuum may deposit dirt when moving to the other square and may deposit dirt when the action is Suck.

Game tree (figure: alternating state nodes and chance nodes)

Solution of games against nature A solution is a subtree that – has a goal node at every leaf – specifies one action at each of its state nodes – includes every outcome branch at each of the chance nodes. In the previous example: [Left, if AtL ∧ CleanL then [] else Suck] For exact solutions: use the minimax algorithm with 2 modifications: – Max and Min nodes become OR and AND nodes – The algorithm returns a conditional plan instead of a single move
And-Or search algorithm
function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan or failure
  return OR-SEARCH(INITIAL-STATE[problem], problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan or failure
  if GOAL-TEST[problem](state) then return the empty plan
  if state is on path then return failure
  for action, state_set in SUCCESSORS[problem](state) do
    plan ← AND-SEARCH(state_set, problem, [state | path])
    if plan ≠ failure then return [action | plan]
  return failure

function AND-SEARCH(state_set, problem, path) returns a conditional plan or failure
  for each s_i in state_set do
    plan_i ← OR-SEARCH(s_i, problem, path)
    if plan_i = failure then return failure
  return [if s_1 then plan_1 else if s_2 then plan_2 else … if s_(n-1) then plan_(n-1) else plan_n]

And-Or search algorithm How does it deal with cycles? – When a state that is already on the path appears, return failure – There is then no non-cyclic solution from that state – This ensures termination of the algorithm – The algorithm does not check whether some state is already on some other path from the root.

And-Or search algorithm Sometimes only a cyclic solution exists – e.g. triple Murphy: sometimes the move is not performed [Left, if CleanL then [] else Suck] is not a solution – Use labels to repeat parts of the plan (risking infinite loops) [L1: Left, if AtR then L1 else if CleanL then [] else Suck]

CP and partially observable environments Fully observable: conditional tests can ask any question and get an answer Partially observable? – The agent has limited information about the environment. – Modeled by a state set = belief state – E.g. assume a vacuum agent which cannot sense the presence or absence of dirt in squares other than the one it is on. – + alternate Murphy: dirt can be left behind when moving to the other square. – Solution in the fully observable world: keep moving left and right, sucking dirt whenever it appears, until both squares are clean and the agent is in the left square.

PO: alternate double Murphy

Belief states Representation? – Sets of full state descriptions: {(AtR ∧ CleanR ∧ CleanL), (AtR ∧ CleanR ∧ ¬CleanL)} – Logical sentences that capture the set of possible worlds in the belief state (OWA): AtR ∧ CleanR – Knowledge propositions describing the agent’s knowledge (CWA): K(AtR) ∧ K(CleanR)

Belief states Choices 2 and 3 are equivalent (let’s continue with 3) A symbol can appear in three ways: positive, negative or unknown: 3^n possible belief states for n proposition symbols. – YET, the set of belief states is the power set of the physical states, which is much larger than 3^n – Hence representation 3 is restricted (it cannot express every belief state) Any scheme capable of representing every possible belief state will require O(2^n) bits to represent each one in the worst case. The current scheme only requires O(n)
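A tiny illustration (invented for these notes) of representation 3: each symbol is mapped to True, False or None (unknown), which is where the 3^n count comes from.

    belief = {'AtR': True, 'CleanR': True, 'CleanL': None}   # None = unknown

    def known(belief, symbol):
        """K(symbol): the agent knows the symbol is true."""
        return belief.get(symbol) is True

    def possible_worlds(belief):
        """Expand the unknowns into the set of full states covered by this belief state."""
        worlds = [{}]
        for sym, val in belief.items():
            vals = [True, False] if val is None else [val]
            worlds = [{**w, sym: v} for w in worlds for v in vals]
        return worlds

    print(known(belief, 'AtR'))            # True
    print(len(possible_worlds(belief)))    # 2 full states (CleanL true or false)
    print(3 ** len(belief))                # 27 belief states over these 3 symbols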

Sensing in conditional planning How does it work? – Automatic sensing: at every time step the agent gets all available percepts – Active sensing: percepts are obtained through the execution of specific sensory actions, e.g. CheckDirt and CheckLocation Given the representation and the sensing, action descriptions can now be formulated.

Monitoring and replanning Execution monitoring: check whether everything is going as planned. – Unbounded indeterminacy: some unanticipated circumstances will arise. – A necessity in realistic environments. Kinds of monitoring: – Action monitoring: verify whether the next action will work. – Plan monitoring: verify the entire remaining plan.

Monitoring and replanning When something unexpected happens: replan – To avoid spending too much time on planning, try to repair the old plan. Can be applied in both fully and partially observable environments, and to a variety of planning representations.

Replanning agent
function REPLANNING-AGENT(percept) returns an action
  static: KB, a knowledge base (includes action descriptions)
          plan, a plan, initially []
          whole_plan, a plan, initially []
          goal, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current ← STATE-DESCRIPTION(KB, t)
  if plan = [] then
    whole_plan ← plan ← PLANNER(current, goal, KB)
  if PRECONDITIONS(FIRST(plan)) not currently true in KB then
    candidates ← SORT(whole_plan, ordered by distance to current)
    find state s in candidates such that failure ≠ repair ← PLANNER(current, s, KB)
    continuation ← the tail of whole_plan starting at s
    whole_plan ← plan ← APPEND(repair, continuation)
  return POP(plan)

Repair example

Repair example: painting Init(Color(Chair, Blue) ∧ Color(Table, Green) ∧ ContainsColor(BC, Blue) ∧ PaintCan(BC) ∧ ContainsColor(RC, Red) ∧ PaintCan(RC)) Goal(Color(Chair, x) ∧ Color(Table, x)) Action(Paint(object, color) PRECOND: HavePaint(color) EFFECT: Color(object, color)) Action(Open(can) PRECOND: PaintCan(can) ∧ ContainsColor(can, color) EFFECT: HavePaint(color)) Plan: [Start; Open(BC); Paint(Table, Blue); Finish]

Repair example: painting Suppose that the agent now perceives that the colors of table and chair are different – Figure out a point in whole_plan to aim for The current state is identical to the precondition before Paint – Repair the action sequence to get there: repair = [] and plan = [Paint; Finish] – Continue performing this new plan This will loop until table and chair are perceived as the same. Action monitoring can lead to less intelligent behavior – Assume the red paint is selected and there is not enough paint to apply to both chair and table. – Improved by doing plan monitoring

Plan monitoring Check the preconditions for success of the entire remaining plan. – Except those which are achieved by another step in the plan. – Execution of a doomed plan is cut off earlier. Limitation of the replanning agent: – It cannot formulate new goals or accept new goals in addition to the current one

Continuous planning The agent persists indefinitely in an environment – Phases of goal formulation, planning and acting Execution monitoring + planner as one continuous process Example: blocks world – Assume a fully observable environment – Assume a partially ordered plan

Blocks world example Initial state (a) Action(Move(x, y), PRECOND: Clear(x) ∧ Clear(y) ∧ On(x, z) EFFECT: On(x, y) ∧ Clear(z) ∧ ¬On(x, z) ∧ ¬Clear(y)) The agent first needs to formulate a goal: On(C, D) ∧ On(D, B) The plan is created incrementally; return NoOp and check the percepts

Blocks world example Assume that the percepts don't change and this plan is constructed Ordering constraint between Move(D, B) and Move(C, D) Start is the label of the current state during planning. Before the agent can execute the plan, nature intervenes: D is moved onto B

Blocks world example Start now contains On(D, B) The agent perceives: Clear(B) and On(D, G) are no longer true – Update the model of the current state (Start) The causal links from Start to Move(D, B) (for Clear(B) and On(D, G)) are no longer valid. Remove these causal links; the two PRECONDs of Move(D, B) become open Replace the action and its causal link to Finish by connecting Start directly to Finish.

Blocks world example Extending a causal link Extending: whenever a causal link can be supplied by an earlier step instead All redundant steps (Move(D, B) and its causal links) are removed from the plan Execute the new plan, perform action Move(C, D) – This removes the step from the plan

Blocks world example Execute the new plan, perform action Move(C, D) – Assume the agent is clumsy and drops C on A The plan is empty but there is still an open PRECOND Determine a new plan for the open condition: again Move(C, D)

Blocks world example Similar to POP: on each iteration, find a plan flaw and fix it Possible flaws: missing goal, open precondition, causal conflict, unsupported link, redundant action, unexecuted action, unnecessary historical goal

Multi-agent planning So far we only discussed single-agent environments. Other agents could simply be added to the model of the world: – Poor performance, since agents are not indifferent to other agents’ intentions In general, two types of multi-agent environments: – Cooperative – Competitive

Cooperation: Joint goals and plans Multi-agent planning problem: assume the doubles tennis example where the agents want to return the ball. Agents(A, B) Init(At(A, [Left, Baseline]) ∧ At(B, [Right, Net]) ∧ Approaching(Ball, [Right, Baseline]) ∧ Partner(A, B) ∧ Partner(B, A)) Goal(Returned(Ball) ∧ At(agent, [x, Net])) Action(Hit(agent, Ball) PRECOND: Approaching(Ball, [x, y]) ∧ At(agent, [x, y]) ∧ Partner(agent, partner) ∧ At(partner, [x, y]) EFFECT: Returned(Ball)) Action(Go(agent, [x, y]) PRECOND: At(agent, [a, b]) EFFECT: At(agent, [x, y]) ∧ ¬At(agent, [a, b]))

Cooperation: Joint goals and plans A solution is a joint plan consisting of actions for both agents. Example: A: [Go(A, [Right, Baseline]), Hit(A, Ball)] B: [NoOp(B), NoOp(B)] or A: [Go(A, [Left, Net]), NoOp(A)] B: [Go(B, [Right, Baseline]), Hit(B, Ball)] Coordination is required so that both agents settle on the same joint plan

Multi-body planning The planning problem faced by a single centralized agent that can dictate actions to each of several physical entities. Hence not truly multi-agent Important: synchronization of actions – Assume for simplicity that every action takes one time step and that at each point in the joint plan the actions are performed simultaneously [<Go(A, [Left, Net]), Go(B, [Right, Baseline])>; <NoOp(A), Hit(B, Ball)>] – Planning can be performed using POP applied to the set of all possible joint actions. – Size of this set?

Multi-body planning An alternative to the set of all joint actions: add extra concurrency lines to the action description – Concurrent action Action(Hit(A, Ball) CONCURRENT: Hit(B, Ball) PRECOND: Approaching(Ball, [x, y]) ∧ At(A, [x, y]) EFFECT: Returned(Ball)) – Required concurrent actions (an object carried by two agents) Action(Carry(A, cooler, here, there) CONCURRENT: Carry(B, cooler, here, there) PRECOND: …) The planner is similar to POP, with some small changes in the possible ordering relations

Coordination mechanisms To ensure agreement on a joint plan: use a convention. – Convention = a constraint on the selection of joint plans (beyond the constraint that the joint plan must work if the agents adopt it), e.g. stick to your court, or one player stays at the net. Conventions which are widely adopted = social laws, e.g. language. Can be domain-specific or domain-independent. Could arise through an evolutionary process (flocking behavior).

Flocking example Three rules: – Separation: steer away from neighbors when you get too close – Cohesion: steer toward the average position of neighbors – Alignment: steer toward the average orientation (heading) of neighbors The flock exhibits the emergent behavior of flying as a pseudo-rigid body.
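A minimal, invented sketch of one update step for these three rules, with 2-D positions and velocities as plain tuples and arbitrary weighting constants chosen for illustration only.

    def flock_step(boids, sep_dist=1.0, w_sep=0.05, w_coh=0.01, w_ali=0.05):
        """boids: list of (position, velocity), each an (x, y) tuple. Returns the updated list."""
        new = []
        for (px, py), (vx, vy) in boids:
            others = [b for b in boids if b[0] != (px, py)]
            # Cohesion: steer toward the average position of the neighbors.
            cx = sum(o[0][0] for o in others) / len(others) - px
            cy = sum(o[0][1] for o in others) / len(others) - py
            # Alignment: steer toward the average heading (velocity) of the neighbors.
            ax = sum(o[1][0] for o in others) / len(others) - vx
            ay = sum(o[1][1] for o in others) / len(others) - vy
            # Separation: steer away from neighbors that are too close.
            sx = sum(px - o[0][0] for o in others
                     if (px - o[0][0]) ** 2 + (py - o[0][1]) ** 2 < sep_dist ** 2)
            sy = sum(py - o[0][1] for o in others
                     if (px - o[0][0]) ** 2 + (py - o[0][1]) ** 2 < sep_dist ** 2)
            vx += w_coh * cx + w_ali * ax + w_sep * sx
            vy += w_coh * cy + w_ali * ay + w_sep * sy
            new.append(((px + vx, py + vy), (vx, vy)))
        return new

    boids = [((0.0, 0.0), (1.0, 0.0)), ((0.5, 0.5), (0.0, 1.0)), ((5.0, 5.0), (1.0, 1.0))]
    print(flock_step(boids))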

Coordination mechanisms In the absence of conventions: communication, e.g. “Mine!” or “Yours!” in the tennis example The burden of arriving at a successful joint plan can be placed on – the agent designer (agents are reactive, no explicit models of other agents) – the agent (agents are deliberative, a model of other agents is required)

Competitive environments Agents can have conflicting utilities, e.g. zero-sum games like chess The agent must: – recognize that there are other agents – compute some of the other agents’ plans – compute how the other agents’ plans interact with its own plan – decide on the best action in view of these interactions. A model of the other agent is required, YET there is no commitment to a joint action plan.