Kriegspiel
Stuart Russell and Jason Wolfe
UC Berkeley

From "Efficient belief-state AND–OR search, with application to Kriegspiel," IJCAI 2005 (in press). UCB, 5/18/2005.

The Real World…
- Contains multiple agents with disparate goals
  - Requires adversarial decision making
- Commonly studied through fully observable games, such as chess or backgammon
- Is partially observable
  - Agents must make decisions despite having incomplete knowledge about the state of their environment
  - Agents should consider both their own information state (to gather information) and their opponent's information state (to hide information)
- Optimal strategies are randomized
- Can thus be modeled by partially observable games (e.g., poker)

Kriegspiel ("war-game")
- A partially observable variant of chess: opponent pieces invisible, moves secret
  - Referee observes actions, provides percepts
  - Players attempt possibly-legal actions until one is legal
- Symmetric player percepts
  - Move illegal; repeated identical illegal moves disallowed; attempts (except pawn captures) must be legal in the absence of opponent pieces
  - Capture occurred on <square>
  - Check occurred in <directions>, where <directions> is one or more of Rank, File, Long Diagonal, Short Diagonal, or Knight (from the king's perspective)
  - Checkmate and stalemate
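The percepts listed above can be pictured as a small record type. The following is a minimal sketch only: the names `CheckDirection`, `Percept`, and `describe` are hypothetical and are not taken from the slides or from the authors' agent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class CheckDirection(Enum):
    """Directions the referee may announce for a check (from the king's perspective)."""
    RANK = "rank"
    FILE = "file"
    LONG_DIAGONAL = "long diagonal"
    SHORT_DIAGONAL = "short diagonal"
    KNIGHT = "knight"


@dataclass(frozen=True)
class Percept:
    """One referee announcement (hypothetical encoding, for illustration only)."""
    move_legal: bool
    capture_square: Optional[str] = None  # e.g. "e5"; None means no capture
    check_directions: FrozenSet[CheckDirection] = frozenset()
    checkmate: bool = False
    stalemate: bool = False

    def describe(self) -> str:
        """Render the announcement as text."""
        if not self.move_legal:
            return "Move illegal"
        parts = []
        if self.capture_square:
            parts.append(f"Capture on {self.capture_square}")
        if self.check_directions:
            names = sorted(d.value for d in self.check_directions)
            parts.append("Check: " + ", ".join(names))
        if self.checkmate:
            parts.append("Checkmate")
        if self.stalemate:
            parts.append("Stalemate")
        return "; ".join(parts) or "Move accepted"
```

A frozen dataclass is used so that percepts are hashable and can serve as dictionary keys when successor states are grouped by announcement.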

Task/Metric
- Task: given a Kriegspiel move and observation history for White, determine whether White can guarantee a win within 3 ply (actual moves)
  - Can assume that Black has full observation; consider only deterministic strategies
  - Two sub-problems
    - State estimation: determine the belief state, i.e., the set of possible board positions consistent with the move and observation history
    - Move selection: given the belief state, determine a move plan for White that guarantees a win within 3 ply
- Metric: performance on a Kriegspiel checkmate problem database
  - By playing two different Kriegspiel agents, generated a database of 500 "mate instances" with guaranteed wins for White and 500 "near-miss instances" that "almost" have guaranteed wins for White
  - Measure accuracy in classifying database problem instances as mates/near-misses within some fixed time limit
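The state-estimation sub-problem can be sketched as repeated filtering: after each action, keep only the successor positions whose percept matches what the referee announced. The example below is a toy under stated assumptions, not the authors' implementation. The "board" is a row of numbered squares hiding one stationary Black piece, and `update_belief`, `estimate`, and `toy_outcomes` are hypothetical names.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List, Tuple

State = Hashable
Outcome = Tuple[str, State]  # (percept announced, resulting state)


def update_belief(
    belief: FrozenSet[State],
    action: Hashable,
    percept: str,
    outcomes: Callable[[State, Hashable], Iterable[Outcome]],
) -> FrozenSet[State]:
    """Keep the successors of each possible state that agree with the percept."""
    return frozenset(
        nxt
        for state in belief
        for seen, nxt in outcomes(state, action)
        if seen == percept
    )


def estimate(
    initial: Iterable[State],
    history: List[Tuple[Hashable, str]],
    outcomes: Callable[[State, Hashable], Iterable[Outcome]],
) -> FrozenSet[State]:
    """Fold a move-and-observation history into a final belief state."""
    belief = frozenset(initial)
    for action, percept in history:
        belief = update_belief(belief, action, percept, outcomes)
    return belief


def toy_outcomes(state: State, action: Hashable) -> List[Outcome]:
    """Toy model (assumption): moving onto the hidden piece's square captures it."""
    if state == action:
        return [("capture", "captured")]
    return [("silent", state)]


# The piece may be on squares 0..3; White moves to 0 and then 3, hearing nothing.
belief = estimate({0, 1, 2, 3}, [(0, "silent"), (3, "silent")], toy_outcomes)
print(sorted(belief))  # the squares still consistent with the history
```

In real Kriegspiel the `outcomes` function would also have to enumerate Black's hidden replies, which is where the belief state grows rather than shrinks.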

Belief-State AND–OR Search
- Test for guaranteed checkmate by searching a tree whose nodes correspond to White's belief states
- Common algorithms are depth-first search (DFS) and proof-number search (PNS)
  - Both treat a belief state as a "black box"
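A depth-first version of this search can be sketched as follows: White's choice of action is an OR node, and the percepts the referee might return form an AND node whose children are the resulting belief states. This is an illustrative toy, not the authors' code. It assumes a row of six squares hiding one stationary Black piece, and a "win" means capturing it. The names `outcomes`, `is_win`, and `guaranteed_win` are hypothetical, and `depth` counts White moves only.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple

State = Hashable
SQUARES = range(6)     # toy board (assumption): six squares
CAPTURED = "captured"  # terminal state after a successful capture


def outcomes(state: State, action: int) -> List[Tuple[str, State]]:
    """Toy referee: announce a capture if White moves onto the hidden piece."""
    if state == action:
        return [("capture", CAPTURED)]
    return [("silent", state)]


def is_win(state: State) -> bool:
    return state == CAPTURED


def guaranteed_win(belief: FrozenSet[State], depth: int) -> bool:
    """Depth-first AND-OR search that treats each belief state as a black box."""
    if all(is_win(s) for s in belief):
        return True
    if depth == 0:
        return False
    for action in SQUARES:  # OR node: one good action suffices
        # Group successor states by percept; each group is a child belief state.
        children: Dict[str, Set[State]] = {}
        for state in belief:
            for percept, nxt in outcomes(state, action):
                children.setdefault(percept, set()).add(nxt)
        # AND node: the plan must succeed whatever the referee announces.
        if all(guaranteed_win(frozenset(b), depth - 1) for b in children.values()):
            return True
    return False


print(guaranteed_win(frozenset({2, 5}), 1))  # one move cannot cover two squares
print(guaranteed_win(frozenset({2, 5}), 2))  # two moves can
```

Every node expansion here iterates over the whole belief state, which is the "black box" cost that grows with the number of possible positions.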

Algorithmic Contribution
- Developed a new family of "incremental" belief-state AND–OR search algorithms that "look inside" the belief state
  - Treat uncertainty as a new search dimension in addition to the familiar depth and breadth
  - G-DBU and IPNS (black and blue at right) are two incremental algorithms
- Both can be orders of magnitude faster than previous algorithms

[Figures: Mate Instance Solve Time; Near-Miss Instance Solve Time]
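One way to picture "looking inside" the belief state: any plan that wins for the whole belief state must also win in each single position it contains, so a single unwinnable position refutes the whole instance cheaply. The sketch below illustrates only that observation. It is not G-DBU or IPNS, whose details the slides do not give. The toy domain (one stationary hidden piece, with White able to reach only squares 0 to 5) and all function names are assumptions.

```python
from typing import Dict, FrozenSet, Hashable, Set

State = Hashable
CAPTURED = "captured"
REACHABLE = range(6)  # toy assumption: White can only move onto squares 0..5


def successors(belief: FrozenSet[State], action: int) -> Dict[str, Set[State]]:
    """Group successor states by the percept the toy referee would announce."""
    children: Dict[str, Set[State]] = {}
    for state in belief:
        if state == action:
            children.setdefault("capture", set()).add(CAPTURED)
        else:
            children.setdefault("silent", set()).add(state)
    return children


def black_box_search(belief: FrozenSet[State], depth: int) -> bool:
    """Plain AND-OR search over whole belief states."""
    if all(s == CAPTURED for s in belief):
        return True
    if depth == 0:
        return False
    return any(
        all(black_box_search(frozenset(b), depth - 1)
            for b in successors(belief, action).values())
        for action in REACHABLE
    )


def incremental_search(belief: FrozenSet[State], depth: int) -> bool:
    """Check single positions first; one failure refutes the whole belief state."""
    for state in belief:
        if not black_box_search(frozenset({state}), depth):
            return False  # no plan can win here, so none wins for the full set
    return black_box_search(belief, depth)


# Square 7 is out of reach, so this instance is refuted from one position alone.
print(incremental_search(frozenset({1, 2, 7}), 3))
print(incremental_search(frozenset({1, 2}), 2))
```

The single-position check is only a necessary condition, so the full search still runs when every position passes. The early exit helps mainly on instances with no guaranteed win.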

New algorithms can solve 3-ply database instances with 98% accuracy in 10 s.

[Plot annotations: 100% @ 2.66 s; 98% @ 10 s]

Demonstrations
- Demo 1: state estimation for a 3-ply mate instance
- Demo 2: move selection for this same problem instance
- Demo 3: play against our Kriegspiel agent