Hierarchical POMDP Planning and Execution
Joelle Pineau
Machine Learning Lunch, November 20, 2000

Partially Observable MDP
- POMDPs are characterized by:
  - States: s ∈ S
  - Actions: a ∈ A
  - Observations: o ∈ O
  - Transition probabilities: T(s, a, s') = Pr(s' | s, a)
  - Observation probabilities: O(o, a, s') = Pr(o | s', a)
  - Rewards: R(s, a)
  - Beliefs: b(s_t) = Pr(s_t | o_t, a_t, ..., o_0, a_0)
[Figure: three states S1, S2, S3.]

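The belief defined above is usually maintained recursively with a Bayes filter rather than recomputed from the whole history. The sketch below is a minimal illustration, not code from the talk. It assumes the common convention that the observation depends on the state reached after the action, and the two-state model, the action name `stay`, and the observation names are hypothetical.

```python
def belief_update(b, a, o, T, O):
    """One Bayes-filter step: b'(s') is proportional to
    O[a][s'][o] * sum over s of T[s][a][s'] * b[s]."""
    states = list(b)
    unnormalized = {}
    for s_next in states:
        # Prediction: probability of reaching s_next under action a.
        predicted = sum(T[s][a][s_next] * b[s] for s in states)
        # Correction: weight by the likelihood of the observation.
        unnormalized[s_next] = O[a][s_next][o] * predicted
    z = sum(unnormalized.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / z for s, p in unnormalized.items()}


# Hypothetical toy model: the state never changes, observations are noisy.
T = {
    "s1": {"stay": {"s1": 1.0, "s2": 0.0}},
    "s2": {"stay": {"s1": 0.0, "s2": 1.0}},
}
O = {
    "stay": {
        "s1": {"o1": 0.8, "o2": 0.2},
        "s2": {"o1": 0.2, "o2": 0.8},
    }
}
prior = {"s1": 0.5, "s2": 0.5}
posterior = belief_update(prior, "stay", "o1", T, O)
print(posterior)
```

Starting from a uniform prior, observing `o1` shifts most of the probability mass to `s1`, which is the state that most often emits it.
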
The problem
- How can we find good policies for complex POMDPs?
- Is there a principled way to provide near-optimal policies?

Proposed Approach
- Exploit structure in the problem domain.
- What type of structure?
  - Action set partitioning
[Figure: action hierarchy, listed by level from the root: Act; InvestigateHealth, Move; CheckPulse, CheckMeds, Navigate, AskWhere; Left, Right, Up, Down.]

Hierarchical POMDP Planning
- What do we start with?
  - A full POMDP model: {S_o, A_o, O_o, M_o}.
  - An action set partitioning graph.
- Key idea: Break the problem into many "related" POMDPs.
  - Each smaller POMDP has only a subset of A_o.
  - ⇒ This imposes a policy constraint.
- But why?
  - POMDP: exponential run-time per value iteration, O(|A| |Γ_{n-1}|^|O|), where Γ_{n-1} is the set of value-function vectors from the previous iteration.

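To see why a smaller action set matters, the arithmetic below evaluates the standard worst-case bound for exact value iteration, |A| times the previous vector count raised to the power |O|, over a few iterations. It is a rough sketch: the function names are hypothetical, the counts are upper bounds before any pruning, and the choice of 4 versus 3 actions with 4 observations simply mirrors the sizes of the example problem and one of its sub-controllers.

```python
def max_vectors_after_backup(num_actions, num_observations, num_prev_vectors):
    """Worst-case vector count after one exact backup:
    |A| * |Gamma_prev| ** |O| (no pruning)."""
    return num_actions * num_prev_vectors ** num_observations


def growth(num_actions, num_observations, iterations, initial_vectors=1):
    """Worst-case vector counts over successive value-iteration steps."""
    counts = []
    n = initial_vectors
    for _ in range(iterations):
        n = max_vectors_after_backup(num_actions, num_observations, n)
        counts.append(n)
    return counts


# Full action set (4 actions) versus a restricted subset (3 actions),
# both with 4 observations.
full = growth(num_actions=4, num_observations=4, iterations=3)
restricted = growth(num_actions=3, num_observations=4, iterations=3)
print(full)        # [4, 1024, 4398046511104]
print(restricted)  # [3, 243, 10460353203]
```

Dropping a single action cuts the third-iteration bound by a factor of roughly 400, because the action count feeds back into the exponentiated term at every step.
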
Example
- S_o = {Meds, Kitchen, Bedroom}
- A_o = {ClarifyTask, CheckMeds, GoToKitchen, GoToBedroom}
- O_o = {Noise, Meds, Kitchen, Bedroom}
[Figure: POMDP state diagram over MedsState, KitchenState, and BedroomState, labeled with the actions ClarifyTask, CheckMeds, GoToKitchen, and GoToBedroom and with probabilities 0.8 and 0.1; shown alongside the value function.]

Hierarchical POMDP
- Action partitioning:
[Figure: action-partitioning tree rooted at Act, containing the abstract action Move and the primitive actions ClarifyTask, CheckMeds, GoToKitchen, and GoToBedroom.]

Local Value Function and Policy - Move Controller
[Figure: value function over beliefs spanning KitchenState, MedsState, and BedroomState, with policy regions labeled GoToKitchen, GoToBedroom, and ClarifyTask.]

Modeling Abstract Actions
- Problem: Need parameters for the abstract action Move.
- Solution: Use the local policy of the corresponding low-level controller.
- General form: Pr(s_j | s_i, a_k^abstract) = Pr(s_j | s_i, Policy(a_k^abstract, s_i))
- Example: Pr(s_j | MedsState, Move) = Pr(s_j | MedsState, ClarifyTask)
[Figure: Policy(Move, s_i) over KitchenState, MedsState, and BedroomState, with regions labeled GoToKitchen, GoToBedroom, and ClarifyTask.]

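The general form above amounts to a table lookup: for each state, copy the transition row of whichever primitive action the local policy selects there. The sketch below illustrates this under stated assumptions. Only the entry Policy(Move, MedsState) = ClarifyTask comes from the slide; the other two policy entries and every probability in `T` are hypothetical placeholders, and the function name is made up for illustration.

```python
def abstract_transition_model(T, local_policy):
    """Build Pr(s_j | s_i, abstract) = Pr(s_j | s_i, local_policy[s_i])."""
    return {s_i: dict(T[s_i][local_policy[s_i]]) for s_i in local_policy}


# Hypothetical primitive-action transition model: T[s_i][action][s_j].
T = {
    "MedsState": {
        "ClarifyTask": {"MedsState": 0.8, "KitchenState": 0.1, "BedroomState": 0.1},
        "GoToKitchen": {"MedsState": 0.1, "KitchenState": 0.8, "BedroomState": 0.1},
    },
    "KitchenState": {
        "ClarifyTask": {"MedsState": 0.1, "KitchenState": 0.8, "BedroomState": 0.1},
        "GoToBedroom": {"MedsState": 0.1, "KitchenState": 0.1, "BedroomState": 0.8},
    },
    "BedroomState": {
        "ClarifyTask": {"MedsState": 0.1, "KitchenState": 0.1, "BedroomState": 0.8},
        "GoToKitchen": {"MedsState": 0.1, "KitchenState": 0.8, "BedroomState": 0.1},
    },
}

# Local policy of the Move controller, one primitive action per state.
move_policy = {
    "MedsState": "ClarifyTask",     # from the slide's example
    "KitchenState": "GoToBedroom",  # hypothetical
    "BedroomState": "GoToKitchen",  # hypothetical
}

move_model = abstract_transition_model(T, move_policy)
print(move_model["MedsState"])
```

The resulting `move_model` can then stand in for Move when the parent controller is solved, so the parent never needs to reason about the primitive actions below it.
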
Local Value Function and Policy - Act Controller
[Figure: value function over beliefs spanning KitchenState, MedsState, and BedroomState, with policy regions labeled CheckMeds and Move.]

Comparing Policies
[Figure: the hierarchical policy and the optimal policy shown side by side; legend: ClarifyTask, CheckMeds, GoToKitchen, GoToBedroom.]

Bounding the value of the approximation
- The value function of the top-level controller is an upper bound on the value of the approximation.
  - Why? We were optimistic when modeling the abstract action.
- Similarly, we can find a lower bound.
  - How? We can assume a "worst-case" view when modeling the abstract action.
- ⇒ If we partition the action set differently, we will get different bounds.

A real dialogue management example
[Figure: action hierarchy rooted at Act.]
- Act
  - SayTime
  - CheckHealth
    - AskHealth
    - OfferHelp
    - DoMeds
      - StartMeds
      - NextMeds
      - ForceMeds
      - QuitMeds
  - Greet
    - GreetGeneral
    - GreetMorning
    - GreetNight
    - RespondThanks
  - Move
    - AskGoWhere
    - GoToRoom
    - GoToKitchen
    - GoToFollow
    - VerifyRoom
    - VerifyKitchen
    - VerifyFollow
  - CheckWeather
    - AskWeatherTime
    - SayCurrent
    - SayToday
    - SayTomorrow
  - Phone
    - AskCallWho
    - CallHelp
    - CallNurse
    - CallRelative
    - VerifyHelp
    - VerifyNurse
    - VerifyRelative

Results: [figure not preserved]

Final words
- We presented:
  - a general framework to exploit structure in POMDPs.
- Future work:
  - automatic generation of good action partitioning;
  - conditions for additional observation abstraction;
  - bigger problems!