Hoe schaakt een computer Arnold Meijster Why study

  • Slides: 43
Download presentation
Hoe schaakt een computer? Arnold Meijster

Hoe schaakt een computer? Arnold Meijster

Why study games? n n n n Fun Historically major subject in AI Interesting

Why study games? n n n n Fun Historically major subject in AI Interesting subject of study because they are hard Games are of interest for AI researchers n n Solution is a strategy (strategy specifies move for every possible opponent reply). Time limits force an approximate solution Evaluation function: evaluate “goodness” of game position Examples: chess, checkers, othello/reversi,

Game setup n Two players: MAX and MIN n MAX moves first and they

Game setup n Two players: MAX and MIN n MAX moves first and they take turns until the game is over. n Another view: games are a search problem n n n Initial state: e. g. board configuration of chess Successor function: list of (move, state) pairs specifying legal moves. Goal/Terminal test: Is the game finished? Utility function: Gives numerical value of terminal states. E. g. win (+1), loose (-1) and draw (0) MAX uses a search tree to determine next move.

(Partial) Game Tree for Tic-Tac. Toe

(Partial) Game Tree for Tic-Tac. Toe

Optimal strategies n Assumption: Both players play optimally !! n Find the strategy for

Optimal strategies n Assumption: Both players play optimally !! n Find the strategy for MAX assuming an infallible MIN opponent. n Given a game tree, the optimal strategy can be determined by computing the minimax value of each node of the tree: MINIMAX-VALUE(n)= UTILITY(n) If n is a terminal maxs successors(n) MINIMAX-VALUE(s) If n is a max node mins successors(n) MINIMAX-VALUE(s) If n is a min node

Min. Max – First Example Max 5 n Max’s turn n Would like the

Min. Max – First Example Max 5 n Max’s turn n Would like the “ 9” points (the maximum) n But if Max chooses the left branch, Min will choose the move to get 3 n 3 Min 5 9 left branch has a value of 3 If Max chooses right, Min can choose any one of 5, 6 or 7 (will choose 5, the minimum) n 3 right branch has a value of 5 Right branch is largest (the maximum) so choose that move 5 6 7 Max

Min. Max – Second Example

Min. Max – Second Example

Tic-Tac-Toe: Three-Ply Game Tree

Tic-Tac-Toe: Three-Ply Game Tree

Min. Max – Pseudo Code int Max() { int best = -INFINITY; /* first

Min. Max – Pseudo Code int Max() { int best = -INFINITY; /* first move is best */ if (is. Terminal. State()) return Evaluate(); Generate. Legal. Moves(); while (Moves. Left()) { Make. Next. Move(); val = Min(); /* Min’s turn next */ Un. Make. Move(); if (val > best) best = val; } return best; }

Min. Max – Pseudo Code int Min() { int best = INFINITY; /* differs

Min. Max – Pseudo Code int Min() { int best = INFINITY; /* differs from MAX */ if (is. Terminal. State()) return Evaluate(); Generate. Legal. Moves(); while (Moves. Left()) { Make. Next. Move(); val = Max(); /* Max’s turn next */ Un. Make. Move(); if (val < best) // different than MAX best = val; } return best; }

Problem of minimax search n Number of game states explodes when the number of

Problem of minimax search n Number of game states explodes when the number of moves increases. n n Solution: Do not examine every node Idea: stop evaluating moves when you find a worse result than the previously examined moves. Does not benefit the player to play that move, it need not be evaluated any further. Save processing time without affecting

The α-β pruning algorithm n α is the value of the best (i. e.

The α-β pruning algorithm n α is the value of the best (i. e. , highest-value) choice found so far at any choice point along the path for max n If v is worse than α, max will avoid it n prune that branch Define β similarly for min

Min. Max – Alpha. Beta Pruning Example n From Max’s point of view, 1

Min. Max – Alpha. Beta Pruning Example n From Max’s point of view, 1 is already lower than 4 or 5, so no need to evaluate 2 and 3 (bottom right) Prune

Minimax: 2 -ply deep

Minimax: 2 -ply deep

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

α-β pruning example

Properties of α-β n Pruning does not affect final result (same as minimax) n

Properties of α-β n Pruning does not affect final result (same as minimax) n Good move ordering improves effectiveness of pruning n With "perfect ordering, " time complexity = O(bm/2) n Branching factor of sqrt(b) !! Alpha-beta pruning can look twice as far as minimax in the same amount of time n Chess: 4 -ply lookahead is a hopeless chess player! n n n 4 -ply ≈ human novice 8 -ply ≈ typical PC, human master 12 -ply ≈ Deep Blue, Kasparov Optimization: repeated states are possible. n Store them in memory = transposition table

Mini. Max and Chess n With a complete tree, we can determine the best

Mini. Max and Chess n With a complete tree, we can determine the best possible move n However, a complete tree is impossible for chess! n At a given time, chess has ~ 35 legal moves. n 35 at one ply, 352 = 1225 at two plies … 356 = 2 billion and 3510 = 2 quadrillion n n Games last 40 moves (or more), so 3540 For large games (like Chess) we can’t see the end of the game.

Games of imperfect information SHANNON (1950): n Cut off search earlier (replace TERMINAL-TEST by

Games of imperfect information SHANNON (1950): n Cut off search earlier (replace TERMINAL-TEST by CUTOFF-TEST) n Apply heuristic evaluation function EVAL (replacing utility function of alpha-beta)

Heuristic EVAL n Idea: produce an estimate of the expected utility of the game

Heuristic EVAL n Idea: produce an estimate of the expected utility of the game from a given position. n Performance depends on quality of EVAL. n Requirements: n n n Computation should not take too long. For non-terminal states the EVAL should be strongly correlated with the actual chance of winning. Only useful for quiescent (no wild swings in value in near future) states n Requires quiescence search

Evaluation functions n Typically a linear weighted sum of features Eval(s) = w 1

Evaluation functions n Typically a linear weighted sum of features Eval(s) = w 1 f 1(s) + w 2 f 2(s) + … + wn fn(s) n e. g. , w 1 = 10 with f 1(s) = (number of white queens) – (number of black queens), etc. Rule of thumb weight values for chess: Pawn=1 Bishop, Knight=3 Rook=5 Queen=10 King=9999

Heuristic difficulties Heuristic counts pieces won! Consider two cases: 1) Black to play 2)

Heuristic difficulties Heuristic counts pieces won! Consider two cases: 1) Black to play 2) White to play It really makes a difference when to apply the evaluation function!!!

Horizon effect A program with a fixed depth search less than 14 will think

Horizon effect A program with a fixed depth search less than 14 will think it can avoid the queening move

Horizon effect

Horizon effect

Mate in 19!

Mate in 19!

Mate in 40! Mate in 40 with Ke 7. Worse are Kc 5, Kc

Mate in 40! Mate in 40 with Ke 7. Worse are Kc 5, Kc 6, Kd 5, Kd 7, Ke 5 and Ke 6 which throw away the win. Source: http: //www. gilith. com/chess/endgames/kr_kn. html

Mate in 39! White mates in 39 after Ng 7. Worse are: Kg 4:

Mate in 39! White mates in 39 after Ng 7. Worse are: Kg 4: white mates in 16 Kg 5, Kg 6, Kh 4: white mates in 15 Kh 6: white mates in 13 Nc 7, Nd 6: white mates in 12 Nf 6: white mates in 10.

Mate (in 0)

Mate (in 0)

Mate (in 1)

Mate (in 1)

Mate (in 0)

Mate (in 0)

Mate (in 1)

Mate (in 1)

Previous states (black to move)

Previous states (black to move)

Nalimov dbase: backward search for all mate-in-n-positions do for all reverse moves m by

Nalimov dbase: backward search for all mate-in-n-positions do for all reverse moves m by black do if move m leads (forced) to mate in n then determine all mate in n+1 positions

Endgame databases…

Endgame databases…

How end game databases changed chess n All 5 piece endgames solved (can have

How end game databases changed chess n All 5 piece endgames solved (can have > 10^8 states) & many 6 piece n n Rule changes n n KRBKNN (~10^11 states): longest path-to-mate 223 Max number of moves from capture/pawn move to completion Chess knowledge n KRKN game was thought to be a draw, but n n White wins in 51% of WTM White wins in 87% of BTM

Summary n Games are fun n They illustrate several important points about AI n

Summary n Games are fun n They illustrate several important points about AI n Perfection is unattainable -> approximation n Uncertainty constrains the assignment of values to states n A computer’s strength at chess comes from: n How deep can it search n How well can it evaluate a board position n In some sense, like humans – a chess grandmaster can evaluate positions better and can look further ahead