Game Trees ctd. 15-211 Fundamental Data Structures and Algorithms Ananda Guna April 27, 2006

In this lecture... § Computer chess § Backtracking § Game Trees Ø Minimax and NegaMax Ø Alpha-beta pruning Ø Iterative Deepening § Nim and Amazons

Computer Chess

ChipTest, Deep Thought Timeline (story continues @ IBM) § A VLSI design class project evolves into F. H. Hsu's move-generator chip § A productive CS-TG results in ChipTest, about 6 weeks before the ACM computer chess championship § ChipTest participates and plays interesting (illegal) moves § ChipTest-M wins ACM CCC § Redesign becomes DT (Deep Thought) § DT participates in human chess championships (in addition to CCC) § DT wins second Fredkin Prize ($100K) § DT is wiped out by Kasparov

Opinions: Is Computer Chess AI? From Hans Moravec’s Book “Robot”

How does Kasparov win? § Even the best chess grandmasters say they only look 4 or 5 moves ahead each turn. Deep Junior looks about 18-25 moves ahead. How does it lose!? § Kasparov has an unbelievable evaluation function. He is able to assess strategic advantages much better than programs can (although this is getting less true). § The moral: the evaluation function plays a large role in how well your program can play.

Programming Chess § There are about 10^120 possible chess games (Shannon's estimate of the size of the game tree, 1949) § In the early 70s Atkin and Slate wrote Chess 4.5 Ø Read "Chess Skill in Man and Machine", which is in the library § They developed the basic techniques that are the foundation for most chess engines Ø Hash Tables Ø Move ordering Ø Iterative Deepening Ø etc.

Hash Tables § Remember the hash value of positions, as different sequences of moves can lead to the same position § What to keep in the table Ø May be obvious from context which positions are more useful Ø Keep the positions in which more search time has been invested § Combining alpha-beta with hashing is a little tricky: a stored value may be only a bound

Move Ordering § The alpha-beta algorithm evaluates a position to the same value as mini-max § The performance of Alpha-Beta, however, depends on other factors Ø The order in which the moves are tried can have an enormous impact on the efficiency of the search § Estimates of the quality of a move may sometimes be available § A heuristic estimate of the cost of searching one move versus another may be available

Iterative Deepening § Depth can be a variable (based on available time) § Starting from current position, evaluate to successively deeper and deeper depths § Use the results of shallower evaluations to control the move ordering Ø Benefit of having a good move ordering makes up for the expense of doing all the shallower searches Ø Another advantage of iterative deepening is that if the search at a depth D is taking too much time, then the program can stop and fall back on the shallower evaluation

Tricks § Many tricks and heuristics have been added to chess programs, including this tiny subset: Ø Opening Books Ø Avoiding mistakes from earlier games Ø Endgame databases (Ken Thompson) Ø Singular Extensions Ø Think ahead Ø Contempt factor Ø Strategic time control

Backtracking

Backtracking § An algorithm-design technique § “Organized brute force” Ø Enumerate the space of possible answers Ø Do this in an organized way § Useful… Ø … when a problem is too hard to be solved directly § Maze traversal § 8-queens problem Ø … when we have limited time and can accept a good but potentially not optimal solution § Game playing § Planning

Basic backtracking § Develop answers by identifying a set of decisions. Ø Knight's tour § Can the knight make a tour around the chess board, visiting every square exactly once? § On a 6 x 6 board there are 9,862 different closed tours § Tried 4,056,367,434 moves

Basic backtracking § Develop answers by identifying a set of decisions. § Pure backtracking Ø Decisions can be ok or impossible. § Heuristic backtracking Ø Decisions can have goodness. They can also be impossible. § Bad to sacrifice the queen to take a pawn § Bad not to block in tic-tac-toe § Good to pack a large object into the box

8-Queen Problem § Arrange 8 queens on an 8 x 8 chess board such that no queen can take another § 4-Queen problem (figure)

Backtracking technique § When you reach a dead-end Ø Withdraw the most recent choice Ø Undo its consequences Ø Is there a new choice? • If so, try that • If not, you are at another dead-end
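The withdraw-and-retry loop can be sketched on the queens problem. This is a minimal illustration, not the lecture's own code: the function name `solve_queens` and the row-by-row formulation (one decision per row, columns tried left to right) are assumptions made for the example.

```python
def solve_queens(n):
    """Return one solution as a list cols (cols[r] = column of the queen in row r), or None."""
    cols = []  # decisions taken so far, one per row

    def safe(row, col):
        # A new queen conflicts with an earlier one on the same column or diagonal.
        for r, c in enumerate(cols):
            if c == col or abs(c - col) == row - r:
                return False
        return True

    def place(row):
        if row == n:
            return True               # every row has a queen: solution found
        for col in range(n):
            if safe(row, col):
                cols.append(col)      # make a choice
                if place(row + 1):
                    return True
                cols.pop()            # dead end below: withdraw the most recent choice
        return False                  # no choice left in this row: another dead end

    return cols if place(0) else None


print(solve_queens(4))  # [1, 3, 0, 2]
```

The search stops at the first solution it reaches, which matches the "no optimality guarantees" caveat: it finds a solution, not a best one.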

No optimality guarantees § If there are multiple solutions, simple backtracking will find one of them, but not necessarily the optimal. § No guarantees that we’ll reach a solution quickly.

Backtracking Summary § Organized brute force § Formulate problem so that answer amounts to taking a set of successive decisions § When stuck, go back, try the remaining alternatives § Does not give optimal solutions

Game Trees

Decision trees: Planning § Nodes of the tree Ø Denote states in the problem space § The path from leaf to root Ø Denotes a solution Ø A sequence of states § Edges of the tree Ø Denote legal “moves” that lead to new states

Mini-Max Algorithm § The Mini-Max algorithm is applied in two player games Ø tic-tac-toe, checkers, chess, Amazons, Nim etc. § Properties of these games Ø Can be described by a set of rules Ø Finite state space Ø Can look ahead to see what moves are possible Ø Full information games § Each player knows all the possible moves of the opponent

Min-Max Algorithm § Each terminal position has some value Ø A terminal position is a position where the game is over or a place where a value can be assigned to a board (base case) § In Tic-Tac-Toe a game may be over with a tie, win or loss § The value of a non-terminal position P is given by Ø $v(P) = \max_{P' \in S(P)} \bigl(-v(P')\bigr)$ § where S(P) is the set of all successor positions of P § The minus sign is there because the other player is the one to move in position P’, so v(P’) is measured from the opponent's point of view

Nega-Max - pseudo code
Nega-max(P) {
    if P is terminal, return v(P)
    m = -∞
    for each P' in S(P)
        v = -(Nega-max(P'))
        if m < v then m = v
    return m
}
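A runnable version of the same recursion, as a small sketch. The tree encoding is an assumption made for illustration: an internal position is a Python list of successor positions, and a terminal position is an integer giving its value for the player to move there.

```python
def negamax(node):
    """Value of a position for the player to move, following v(P) = max over P' of -v(P')."""
    if isinstance(node, int):      # terminal position: value is given directly
        return node
    return max(-negamax(child) for child in node)


# Two plies: we pick a child, then the opponent picks a leaf.
tree = [[3, 5], [2, 9], [0, 1]]
print(negamax(tree))  # 3
```

In this example the opponent answers each of our moves with the leaf that is worst for us (3, 2, and 0), so the best we can secure is 3.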

How fast is mini-max? § Minimax is pretty slow even for a modest depth. § It is basically a brute force search. § What is the running time? Ø Each position has, on average, b possible moves, and we search d levels. So the running time is _____

Quiz Time § It is your turn. You have n possible moves you can make. Each subsequent stage the number of moves is reduced by 50%. You want to analyze the decision tree up to d levels. How many moves must be analyzed before you decide to make a move with mini-max?

A Game § Consider the following two player game: Ø The game starts with a pile of n pebbles. Ø Players alternate taking pebbles from the pile Ø Each turn, you are allowed to take either 1, 2, 5, or sqrt(x) pebbles from the pile where x is the current number of pebbles in the pile. Ø The game continues until there are no more pebbles. Ø The person who takes the last pebble loses. Ø Let F(x) be a function that returns 1 if the current player will win with a pile of x pebbles Ø F(x)=-1 if the current player loses.

Problem ctd… § Case 1: Ø Suppose you are allowed to take only one pebble at a time. Write a recursive definition of F(x) that maximizes your advantage and minimizes your opponents § Case 2: Ø Suppose you are now allowed to take 1, 3, 5 or sqrt(x) from the pile. Write a recursive definition that maximizes your advantage and minimizes your opponents
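One way to explore both cases is to write F(x) in negamax form and tabulate it. The sketch below rests on assumptions that the slides do not state: sqrt(x) is rounded down to an integer, a move may not take more pebbles than remain, and F(0) = 1 because the opponent has just taken the last pebble and therefore lost. The helper name `make_F` is made up for the example.

```python
from functools import lru_cache
from math import isqrt


def make_F(fixed_moves, allow_sqrt):
    """Build F for a given rule set; F(x) is 1 if the player to move wins, -1 otherwise."""

    @lru_cache(maxsize=None)
    def F(x):
        if x == 0:
            return 1  # the opponent took the last pebble, so the player to move has won
        moves = {k for k in fixed_moves if k <= x}
        if allow_sqrt:
            moves.add(isqrt(x))  # assumption: sqrt(x) rounded down
        # Our value is the best, over our moves, of the negated value for the opponent.
        return max(-F(x - k) for k in moves)

    return F


case1 = make_F((1,), allow_sqrt=False)
print([case1(x) for x in range(1, 5)])    # [-1, 1, -1, 1]

game = make_F((1, 2, 5), allow_sqrt=True)
print([game(x) for x in range(1, 8)])     # [-1, 1, 1, -1, 1, 1, -1]
```

For Case 1 the recursion collapses to F(x) = -F(x - 1), so the player to move wins exactly when x is even.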

Alpha-Beta Pruning Reducing search complexity

Alpha-Beta Pruning

Alpha Beta speedup § Alpha Beta is always a win: Ø Returns the same result as Minimax Ø Is a minor modification to Minimax § Claim: The optimal Alpha Beta search tree has $O(b^{d/2})$ nodes, or the square root of the number of nodes in the regular Minimax tree. Ø Can enable twice the depth § In chess the branching factor is about 38. In practice Alpha Beta reduces it to about 6 and enables 10-ply searches on a PC. § Question: How many nodes need to be analyzed in chess Ø If mini-max is used w/o Alpha-Beta? Ø If mini-max is used w/ Alpha-Beta?
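The question can be worked through numerically with the figures quoted on the slide (branching factor 38, effective branching factor 6, depth 10). This is a rough sketch that counts only leaf positions, b^d, and ignores interior nodes.

```python
b, d = 38, 10

minimax_leaves = b ** d               # plain minimax: about 6.3e15 positions
best_case_leaves = b ** (d // 2)      # optimal alpha-beta, b^(d/2): 79,235,168
practical_leaves = 6 ** d             # effective branching factor 6: 60,466,176

print(f"{minimax_leaves:.2e}")        # 6.28e+15
print(best_case_leaves, practical_leaves)
```

The two alpha-beta figures land in the same range because 6 is close to the square root of 38 (about 6.2).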

Alpha Beta Pruning Theorem: Let v(P) be the value of a position P. Let X be the value of a call to AB($\alpha$, $\beta$). Then one of the following holds:
(1) $v(P) \le \alpha$ and $X \le \alpha$
(2) $\alpha < v(P) < \beta$ and $X = v(P)$
(3) $\beta \le v(P)$ and $X \ge \beta$
Suppose we take a chance and reduce the size of the infinite window. What might happen?
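A minimal sketch of AB in negamax form, to make the window concrete. The tree encoding (lists for internal positions, integers for terminal values seen by the player to move) and the `stats` counter are assumptions added for illustration.

```python
import math


def alphabeta(node, alpha, beta, stats):
    """Negamax with an (alpha, beta) window; stats["leaves"] counts evaluated leaves."""
    if isinstance(node, int):
        stats["leaves"] += 1
        return node
    best = -math.inf
    for child in node:
        # The opponent's window is our window negated and swapped.
        best = max(best, -alphabeta(child, -beta, -alpha, stats))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff: the remaining children cannot change the result
    return best


tree = [[3, 5], [2, 9], [0, 1]]
stats = {"leaves": 0}
print(alphabeta(tree, -math.inf, math.inf, stats), stats["leaves"])  # 3 4
```

With the infinite window the result is the exact minimax value, 3, yet only 4 of the 6 leaves are evaluated: after the first subtree secures 3, a single leaf in each remaining subtree is enough to refute it.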

Aspiration Window: § Suppose you had information that the value of a position was probably close to 2 (say from the result of a shallower search) § Instead of starting with an infinite window, start with an “aspiration window” around 2 (e.g., (1.5, 2.5)) to get more pruning. Ø If the result is in that range you are done. Ø If outside the range you don’t know the exact value, only a bound. Repeat with a different range.
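The aspiration idea can be sketched as follows. The re-search policy is an assumption: on a miss this version simply repeats the search with the full infinite window, which is the simplest correct choice, whereas real programs usually widen the window gradually. The function names and the list/integer tree encoding are illustrative.

```python
import math


def alphabeta(node, alpha, beta):
    """Negamax alpha-beta; leaves are ints valued for the player to move there."""
    if isinstance(node, int):
        return node
    best = -math.inf
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break
    return best


def aspiration_search(tree, guess, delta):
    """Return (value, re_searched) using the window (guess - delta, guess + delta)."""
    lo, hi = guess - delta, guess + delta
    value = alphabeta(tree, lo, hi)
    if lo < value < hi:
        return value, False  # strictly inside the window: the value is exact
    # Outside the window we only know a bound, so search again with no window.
    return alphabeta(tree, -math.inf, math.inf), True


tree = [[3, 5], [2, 9], [0, 1]]
print(aspiration_search(tree, guess=3, delta=1))    # (3, False)
print(aspiration_search(tree, guess=10, delta=1))   # (3, True)
```

A good guess pays off with extra pruning; a bad guess costs a second search, so the window width is a trade-off.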

Heuristics

Heuristic search techniques Heuristic = aid to problem-solving § Alpha-beta is one way to prune the game tree…

Move Ordering § Explore decisions in order of likely success to get early alpha beta pruning Ø Guide search with estimator functions that correlate with likely search outcomes or Ø track which moves tend to cause beta cutoff § Heuristic estimate of the cost (time) of searching one move versus another Ø Search the easiest move first

Timed moves § Not uncommon to have a limited time to make a move. May need to produce a move in say 2 minutes. § How do we ensure that we have a good move before the timer goes off?

Iterative Deepening § Evaluate moves to successively deeper and deeper depths: Ø Start with 1-ply search and get best move(s). Fast Ø Next do 2-ply search using the previous best moves to order the search. Ø Continue to increase depth of search. § If some depth takes too long, fall back to previous results (timed moves).

Iterative Deepening § Save best moves of shallower searches to control move ordering. § Need to search the same part of the tree multiple times but improved move ordering more than makes up for this redundancy § Difficulty: Ø Time control: each iteration needs about 6 x more time Ø What to do when time runs out?
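A compact sketch of the loop described on the two slides above, including one possible answer to "what to do when time runs out": abandon the unfinished iteration and keep the previous best move. Everything specific here is an assumption for illustration: the toy game (take 1 to 3 pebbles, whoever takes the last pebble wins), the placeholder evaluation of 0 at the horizon, and the names `search`, `iterative_deepening`, and `OutOfTime`.

```python
import time

MOVES = (1, 2, 3)  # hypothetical game: take 1-3 pebbles, taking the last pebble wins


class OutOfTime(Exception):
    """Raised inside the search when the deadline has passed."""


def search(pile, depth, alpha, beta, deadline):
    """Depth-limited negamax alpha-beta; the value is for the player to move."""
    if time.monotonic() > deadline:
        raise OutOfTime
    if pile == 0:
        return -1          # the opponent took the last pebble: player to move has lost
    if depth == 0:
        return 0           # horizon reached: placeholder evaluation
    best = -2              # below every real value (-1, 0, 1)
    for m in MOVES:
        if m <= pile:
            best = max(best, -search(pile - m, depth - 1, -beta, -alpha, deadline))
            alpha = max(alpha, best)
            if alpha >= beta:
                break
    return best


def iterative_deepening(pile, max_depth, time_limit):
    deadline = time.monotonic() + time_limit
    legal = [m for m in MOVES if m <= pile]
    best_move = legal[0]
    for depth in range(1, max_depth + 1):
        # Move ordering: try the best move of the previous, shallower search first.
        ordered = [best_move] + [m for m in legal if m != best_move]
        try:
            alpha, current = -2, ordered[0]
            for m in ordered:
                value = -search(pile - m, depth - 1, -2, -alpha, deadline)
                if value > alpha:
                    alpha, current = value, m
        except OutOfTime:
            break          # fall back to the result of the last completed depth
        best_move = current
    return best_move


print(iterative_deepening(10, max_depth=10, time_limit=5.0))  # 2
```

From a pile of 10 the only winning move is to take 2, leaving the opponent a multiple of 4, and the deepest iteration finds it.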

Transposition Tables § Minimax and Alpha Beta (implicitly) build a tree. But what is the underlying structure of a game? § Different sequences of moves can lead to the same position. § Several game positions may be functionally equivalent (e.g. symmetric positions). § Use a Hash Table to save the known results

Transposition Tables § Memoize: hash table of board positions to get Ø Value for the node § Upper Bound, Lower Bound, or Exact Value § Be extremely careful with alpha beta, as you may only know a bound at that position. Ø Best move at the position § Useful for move ordering for greater pruning! § Which positions to save? Ø Sometimes obvious from context Ø Ones in which more search time has been invested Ø Collisions: Simply overwrite
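The bound bookkeeping can be sketched on Nim (last coin wins). This is an illustrative arrangement rather than the lecture's code: positions are sorted tuples of non-empty piles so that equivalent positions share one table entry, a Python dict stands in for the hash table, and the search runs to the end of the game so no depth is stored.

```python
import math

EXACT, LOWER, UPPER = "exact", "lower", "upper"


def successors(pos):
    """Nim positions one move away, as sorted tuples of non-empty piles."""
    result = set()
    for i, pile in enumerate(pos):
        for left in range(pile):  # leave 0 .. pile-1 coins in pile i
            rest = pos[:i] + pos[i + 1:] + ((left,) if left else ())
            result.add(tuple(sorted(rest)))
    return sorted(result)


def alphabeta_tt(pos, alpha, beta, table):
    """Negamax alpha-beta with a transposition table of (flag, value) entries."""
    alpha_orig = alpha
    entry = table.get(pos)
    if entry is not None:
        flag, value = entry
        if flag == EXACT:
            return value
        if flag == LOWER:
            alpha = max(alpha, value)   # true value is at least `value`
        else:
            beta = min(beta, value)     # true value is at most `value`
        if alpha >= beta:
            return value
    if not pos:
        return -1  # no coins left: the previous player took the last coin and won
    best = -math.inf
    for child in successors(pos):
        best = max(best, -alphabeta_tt(child, -beta, -alpha, table))
        alpha = max(alpha, best)
        if alpha >= beta:
            break
    # Record whether `best` is exact or only a bound for this window.
    if best <= alpha_orig:
        flag = UPPER
    elif best >= beta:
        flag = LOWER
    else:
        flag = EXACT
    table[pos] = (flag, best)
    return best


table = {}
print(alphabeta_tt((1, 2, 3), -math.inf, math.inf, table))  # -1
```

The value -1 says the player to move from (1, 2, 3) loses against best play. Reusing `table` for a later search keeps every stored bound valid, because each entry records which kind of bound it is.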

Limited Depth Search Problems § Horizon effect: the bad news lies just beyond the search depth Ø White queen takes knight, and the evaluation function will report that white is up a knight Ø One level down, black has the reply pawn takes queen § Quiescence search: from a node at the depth limit, keep looking at the opponent's tactical moves before trusting the evaluation

Games Nim and Amazons

Game of Nim § Arbitrary number of piles. Players alternate. A player can take any number of coins (at least one) from a single pile. The player who takes the last coin (or pile) wins § Let’s start with a simple configuration of Nim and use Minimax to select a move. § Our initial configuration consists of three piles, with 1, 2, and 3 pennies respectively. § We can represent this configuration compactly by writing it as (1, 2, 3). Each entry in this list represents the number of pennies in that stack. Order does not matter (I can just rearrange the stacks).

Drawing the Game Tree The first thing we need to take care of is drawing the game tree. [Figure: one level of the tree, root (1, 2, 3) with children including (1, 1, 3), (1, 2, 2), (1, 3), (1, 1, 2), (1, 2)] One level of the tree. Whose move is it now?

Nim Game Tree [Figure: game tree rooted at (1, 2, 3), expanded through positions such as (2, 2), (1, 1, 3), (1, 2, 2), (1, 1, 2), (1, 1, 1), (1, 2) and (1, 1)]

Nim Game Graph [Figure: the same game drawn as a graph with shared positions, rooted at (1, 2, 3); levels alternate between Us and Them, and final positions are marked win or loss]
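The difference between the tree and the graph can be measured with a short sketch. The representation is an assumption for illustration: positions are sorted tuples of non-empty piles, moves that lead to the same position are merged, and the helper names `value`, `tree_size`, and `best_move` are made up for the example.

```python
def successors(pos):
    """Nim positions one move away, as sorted tuples of non-empty piles."""
    result = set()
    for i, pile in enumerate(pos):
        for left in range(pile):
            rest = pos[:i] + pos[i + 1:] + ((left,) if left else ())
            result.add(tuple(sorted(rest)))
    return sorted(result)


def value(pos, cache):
    """Minimax value for the player to move: 1 = win, -1 = loss (last coin wins)."""
    if pos not in cache:
        # With no coins left there are no moves and the player to move has lost.
        cache[pos] = max((-value(s, cache) for s in successors(pos)), default=-1)
    return cache[pos]


def tree_size(pos):
    """Number of nodes when every path gets its own copy of each position."""
    return 1 + sum(tree_size(s) for s in successors(pos))


def best_move(pos, cache):
    """Successor position that is worst for the opponent."""
    return min(successors(pos), key=lambda s: value(s, cache))


cache = {}
print(value((1, 2, 3), cache))            # -1: the player to move loses
print(len(cache), tree_size((1, 2, 3)))   # distinct positions vs. tree nodes
print(best_move((1, 2, 2), cache))        # (2, 2)
```

After the first call `cache` holds one entry per distinct reachable position, which is the size of the game graph; `tree_size` counts the much larger unshared tree.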

Game of Amazons

Game of Amazons § Invented in 1988 by Argentinian Walter Zamkauskas. § Several programming competitions and yearly championships. § Distantly related to Go. § Active area in combinatorial game theory.

Amazons Board § Chess board 10 x 10 § 4 White and 4 Black chess Queens (Amazons) and Arrows § Starting configuration § White moves first

Amazons Rules § Each move consists of two steps: 1. Move an amazon of own color. 2. This amazon has to throw an arrow to an empty square where it stays. § Amazons and arrows move as a chess Queen as long as no obstacle blocks the way (amazon or arrow) § Players alternate moves. § Player who makes last move wins.

Amazons challenges § Absence of opening theory § Branching factor of more than 1000 § Often 20 reasonable moves § Need for deep variations § Opening book >30,000 moves

AMAZONG § World’s best computer player by Jens Lieberum § Distinguishes three phases of the game: Ø Opening at the beginning Ø Middle game Ø Filling phase at the end See http://jenslieberum.de/amazong.html

Amazons Opening § Main goals in the opening ØObtain a good distribution of amazons ØBuild large regions of potential territory ØTrap opponent’s amazons