Game Playing: Mini-Max search, Alpha-Beta pruning, general concerns on games
Why study board games?
- One of the oldest subfields of AI (Shannon and Turing, 1950)
- An abstract and pure form of competition that seems to require intelligence
- Easy to represent the states and actions
- Very little world knowledge required!
- Game playing is a special case of a search problem, with some new requirements.
Types of games

                          Deterministic                    Chance
  Perfect information     Chess, checkers, go, othello     Backgammon, monopoly
  Imperfect information   Sea battle                       Bridge, poker, scrabble, nuclear war
Why new techniques for games?
- The "contingency" problem: we don't know the opponent's move!
- The size of the search space:
  - Chess: ~15 moves possible per state, 80 ply: 15^80 nodes in the tree
  - Go: ~200 moves per state, 300 ply: 200^300 nodes in the tree
- Game playing algorithms:
  - Search the tree only up to some depth bound
  - Use an evaluation function at the depth bound
  - Propagate the evaluation upwards in the tree
MINI-MAX
- Restrictions:
  - 2 players: MAX (the computer) and MIN (the opponent)
  - deterministic, perfect information
- Select a depth bound (say: 2) and an evaluation function.
- Construct the tree up to the depth bound.
- Compute the evaluation function for the leaves.
- Propagate the evaluation upwards in the tree:
  - taking minima at MIN levels
  - taking maxima at MAX levels
[Figure: an example tree with alternating MAX and MIN layers and leaf evaluations; "Select this move" marks the maximizing choice at the root.]
The MINI-MAX algorithm:

Initialise depthbound;

Minimax(board, depth) =
  IF depth = depthbound THEN
    return static_evaluation(board);
  ELSE IF maximizing_level(depth) THEN
    FOR EACH child of board: compute Minimax(child, depth+1);
    return the maximum over all children;
  ELSE IF minimizing_level(depth) THEN
    FOR EACH child of board: compute Minimax(child, depth+1);
    return the minimum over all children;

Call: Minimax(current_board, 0)
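The pseudocode above translates directly into Python. A minimal runnable sketch, under assumed names not in the slides (`tree`, `children`, `static_evaluation` are illustrative): the game tree is a dict, leaves serve as their own static evaluations, and even depths are the maximizing levels.

```python
DEPTH_BOUND = 2

def minimax(board, depth, children, static_evaluation):
    """Minimax value of `board`; MAX moves at even depths."""
    if depth == DEPTH_BOUND:
        return static_evaluation(board)
    values = [minimax(child, depth + 1, children, static_evaluation)
              for child in children(board)]
    # Even depths are maximizing levels, odd depths are minimizing levels.
    return max(values) if depth % 2 == 0 else min(values)

# Toy tree: MAX at the root, two MIN nodes, leaf evaluations at depth 2.
tree = {"root": ["a", "b"], "a": [3, 2], "b": [1, 5]}
value = minimax("root", 0,
                children=lambda n: tree[n],
                static_evaluation=lambda leaf: leaf)
print(value)  # MIN reduces a -> 2 and b -> 1; MAX then picks 2
```

Passing `children` and `static_evaluation` as callbacks keeps the search generic: plugging in a real game only requires a move generator and a board evaluator.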
Alpha-Beta cut-off
- A generally applied optimization of Mini-Max.
- Instead of:
  - first creating the entire tree (up to the depth bound)
  - then doing all the propagation,
- interleave the generation of the tree with the propagation of values.
- The point: some of the values obtained in the tree provide the information that other (not yet generated) parts are redundant and do not need to be generated.
Alpha-Beta idea:
- Principles:
  - generate the tree depth-first, left-to-right
  - propagate final values of nodes as initial estimates for their parent node.
[Figure: the left MIN node settles on value 2, which becomes the root's initial MAX estimate; the right MIN node's first leaf is 1.]
- The MIN value (1) is already smaller than the MAX value of the parent (2).
- The MIN value can only decrease further,
- the MAX value is only allowed to increase,
- so there is no point in computing further below this node.
Terminology:
- The (temporary) values at MAX nodes are ALPHA values.
- The (temporary) values at MIN nodes are BETA values.
[Figure: the same tree, with the root's alpha value (2) and the right MIN node's beta value (1) labeled.]
The Alpha-Beta principles (1):
- If an ALPHA value is larger than or equal to the BETA value of a descendant node: stop the generation of the children of that descendant.
[Figure: root alpha value 2; the right MIN node's beta value drops to 1, triggering the cut-off.]
The Alpha-Beta principles (2):
- If a BETA value is smaller than or equal to the ALPHA value of a descendant node: stop the generation of the children of that descendant.
[Figure: a MIN node with beta value 2; a descendant MAX node's alpha value rises to 3, triggering the cut-off.]
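The two principles combine into the standard alpha-beta algorithm. A sketch in Python (the toy `tree` and the parameter names are illustrative assumptions): alpha carries the best MAX value found so far, beta the best MIN value, and child generation stops as soon as alpha >= beta.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing,
              children, evaluate, depth_bound=2):
    """Depth-bounded minimax with alpha-beta cut-offs."""
    if depth == depth_bound:
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in children(node):
            value = max(value, alphabeta(child, depth + 1, alpha, beta,
                                         False, children, evaluate, depth_bound))
            alpha = max(alpha, value)
            if alpha >= beta:        # principle (1): cut off remaining children
                break
        return value
    else:
        value = math.inf
        for child in children(node):
            value = min(value, alphabeta(child, depth + 1, alpha, beta,
                                         True, children, evaluate, depth_bound))
            beta = min(beta, value)
            if beta <= alpha:        # principle (2): cut off remaining children
                break
        return value

# As in the slides: once the left MIN node settles on 2, the right MIN
# node is cut off after its first leaf (1); its second leaf is never seen.
tree = {"root": ["a", "b"], "a": [2, 5], "b": [1, 7]}
print(alphabeta("root", 0, -math.inf, math.inf, True,
                lambda n: tree[n], lambda leaf: leaf))  # -> 2
```

The result is identical to plain Mini-Max; only the amount of work differs.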
Mini-Max with Alpha-Beta at work:
[Figure: a full worked example on a depth-3 tree (MAX / MIN / MAX layers); the numbered steps show alpha and beta values being updated and several subtrees being cut off.]
11 static evaluations saved!!
"DEEP" cut-offs
- For game trees with at least 4 MIN/MAX layers, the Alpha-Beta rules also apply to deeper levels: a value may cut off a node several layers below it, not only a direct descendant.
[Figure: an alpha value of 4 established near the root cuts off a node two layers deeper whose value has dropped to 2.]
The gain. Best case:
- If at every layer the best node is the left-most one.
[Figure: a MAX / MIN / MAX tree in which only the thick-lined branches are explored.]
Example of a perfectly ordered tree
[Figure: a depth-3 MAX / MIN / MAX tree with leaf values arranged so that the best child of every node is generated first; the root value is 21.]
How much gain?
- Alpha-Beta, best case: the number of static evaluations is
    2 b^(d/2) - 1                   (if d is even)
    b^((d+1)/2) + b^((d-1)/2) - 1   (if d is odd)
- The proof is by induction.
- In the running example, d = 3, b = 3: 3^2 + 3^1 - 1 = 11 !
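The formulas are easy to check numerically. A small sketch (the function name is ours, not from the slides) comparing the best-case count with the b^d evaluations of plain Mini-Max:

```python
def best_case_evals(b, d):
    """Static evaluations used by alpha-beta on a perfectly ordered tree
    of branching factor b and depth d."""
    if d % 2 == 0:
        return 2 * b**(d // 2) - 1
    return b**((d + 1) // 2) + b**((d - 1) // 2) - 1

# The running example: b = 3, d = 3.
print(best_case_evals(3, 3))   # 3^2 + 3^1 - 1 = 11
print(3 ** 3)                  # 27 evaluations without pruning
```

Roughly, perfect ordering squares the reachable depth: b^(d/2) work instead of b^d, i.e. alpha-beta can look about twice as deep in the same time.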
Best-case gain pictured:
[Figure: number of static evaluations versus search depth for b = 10, comparing no pruning with the Alpha-Beta best case.]
- Note: a logarithmic scale.
- Conclusion: still exponential growth!!
- Worst case? For some trees Alpha-Beta does nothing; for some trees no reordering can produce cut-offs.
The horizon effect.
[Figure: a search tree whose depth bound (the "horizon") hides the loss of the queen: within the horizon the program only sees a pawn lost, while just beyond it the queen is lost anyway.]
- Because of the depth bound we prefer to delay disasters, although we don't prevent them!!
- Solution: heuristic continuations (continue the search beyond the depth bound along unstable lines).
Time bounds:
- How to play within reasonable time bounds?
- Even with a fixed depth bound, times can vary strongly!
- Solution: Iterative Deepening!!!
- Remember: the overhead of all the previous searches is only about 1/b of the total work.
- A good investment, to be sure to have a move ready.
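The iterative-deepening loop can be sketched as follows (the `search` callback and the time budget are assumptions for illustration; a real engine would also abort a search that overruns the deadline):

```python
import time

def iterative_deepening(root, time_budget, search):
    """Repeat the depth-bounded search with an increasing depth bound,
    always keeping the best move of the last *completed* search."""
    best_move = None
    depth = 1
    deadline = time.monotonic() + time_budget
    while time.monotonic() < deadline:
        best_move = search(root, depth)   # e.g. alpha-beta to bound `depth`
        depth += 1
    return best_move

# Toy "search" that just reports the depth bound it was run with.
move = iterative_deepening("board", 0.01, lambda board, d: ("some-move", d))
print(move[0])  # a move is ready as soon as the depth-1 search completes
```

This is why the 1/b overhead is a good investment: whenever time runs out, the move from the deepest completed search is already available.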
Games of chance
- Example: Backgammon.
[Figure: the form of the game tree, with chance nodes for the dice rolls between the MAX and MIN layers.]
State of the art
Drawn from an article by Matthew Ginsberg, Scientific American, Winter 1998, Special Issue on Exploring Intelligence.
Win of Deep Blue predicted:
[Figure: computer chess ratings studied around the 1990s: chess rating (1500-3500) versus search depth in ply (2-14); the extrapolated trend crosses Kasparov's rating.]
- Further increase of depth was likely to win!