Artificial Intelligence: AO* and Minimax. L. Manevitz. All rights reserved.

Goals • AO* • Game Playing Algorithm: – Mini-max with α-β. – References:


Or Connector, AND Connector [figure]

And/Or Graph [figure]

Solution Subgraph [figures on two slides]


Heuristic Values: estimated cost to solution set [figure: And/Or graph with node values 0, 2, 4, 1, 2, 0]


Tracing the Algorithm [four figures; node cost labels per step: 3 2 1 1; 4 5 1 4 4 1; 5 5 1 4 4 2 2 0; 5 5 1 4 2 0 0]


Example [figure: tic-tac-toe positions with X and O marks]


Example cont. [figures on six slides: five "Min move" boards with backed-up values such as 1 and -inf, then a "Max move" board with value 1]


α-β Mini-max • α-values at Max nodes never decrease. • β-values at Min

Cut-Off • Below any MIN node having β-value smaller than α-value of

Computing α-β • α of Max node: current largest backed-up value of


Example: β Cut-Off [figure: game tree with levels Max, Min, Max; annotations α >= 2, β <= 1, = 2; leaf values 2, 7]


[Figures: step-by-step α-β example on a tree of Max and Min nodes, annotated with bounds such as α >= 0, β <= 0, α >= 2, β = 0, β = -3, and with cut-offs marked]



A/O* algorithm
• Data structure
  – Graph
  – Marked connectors (pointing down, unlike A*)
  – Costs q() maintained on nodes
  – SOLVED markings
  – Note: we discuss acyclic graphs only.

Solution Subgraph G' of G from n to terminals T
• If n is in T, G' is just the singleton n.
• Otherwise n has one connector to a set of nodes n_1, n_2, …, n_k.
  – There is a solution graph from each n_i to T.
  – G' consists of n, that connector, the nodes n_1, …, n_k,
  – plus the solution graphs from each of the n_i.

Monotone Restriction
• h(n) <= c + h(n_1) + h(n_2) + … + h(n_k), where c is the cost of the connector between n and the set n_1, …, n_k.
• Together with h = 0 at terminal nodes, this guarantees that h(n) <= h*(n).

Cost q(n) Values
• If n has no successors, q(n) = h(n).
• Otherwise, working from the bottom, compute for each connector out of n:
  – q(n) = connector cost + sum of q(successors)
• Pick the smallest of these values and mark that connector.
• If all successors along the marked connector are SOLVED, then n is marked SOLVED.
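The cost rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming a small hand-made acyclic AND/OR graph; the names `connectors`, `h`, `TERM`, `q_value` and `solved` are hypothetical and not taken from the lecture.

```python
# Hypothetical acyclic AND/OR graph: node -> list of (connector cost, successors).
# A connector with several successors is an AND connector.
connectors = {
    "s": [(1, ["a"]), (2, ["b", "c"])],
    "a": [(1, ["t1"])],
    "b": [],
    "c": [],
    "t1": [],
}
h = {"s": 0, "a": 4, "b": 1, "c": 1, "t1": 0}  # assumed heuristic estimates
TERM = {"t1"}                                   # assumed terminal nodes


def q_value(n):
    """Return (q(n), index of the marked connector, or None for a leaf)."""
    if not connectors[n]:
        return h[n], None                       # no successors: q(n) = h(n)
    costs = [c + sum(q_value(m)[0] for m in succ) for c, succ in connectors[n]]
    best = min(range(len(costs)), key=costs.__getitem__)
    return costs[best], best                    # smallest cost, marked direction


def solved(n):
    """n is SOLVED if it is terminal or all successors on its marked connector are."""
    if n in TERM:
        return True
    _, mark = q_value(n)
    if mark is None:
        return False
    return all(solved(m) for m in connectors[n][mark][1])


print(q_value("s"), solved("s"))
```

Here the connector from s to a costs 1 + q(a) = 2, while the AND connector to b and c costs 2 + 1 + 1 = 4, so the first connector is marked.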

Basic Idea of A/O*
• First, top-down graph growing picks out the best available partial solution subgraph from the explicit graph.
• One leaf node of this subgraph is expanded.
• Second, bottom-up cost revising, connector marking, and SOLVED labeling.

AO* Algorithm
1. Create G = <s>; q(s) = h(s). If s ∈ TERM, mark s SOLVED.
2. Until s is labeled SOLVED do:
   1. Compute G', the partial solution subgraph of G, by tracing down the marked connectors in G from s.
   2. Select n in G', where n is a leaf and n is not in TERM.
   3. Expand n and place its successors in G. For each successor not already in G, let q(successor) = h(successor). Label SOLVED all successors in TERM. (If n has no successors, reset q(n) := infinity.)

AO* Algorithm cont.
   4. Let S := {n}.
   5. Until S = ∅ do:
      1. Remove from S a node m that has no descendant in G also in S (a minimal node).
      2. Revise the cost for m by checking each connector from m: q(m) = min [c + q(n_1) + … + q(n_k)]. Mark the chosen connector.
      3. If all successors through that connector are SOLVED, then mark m SOLVED.
      4. If m is SOLVED or q(m) changed, then add to S all "preferred" parents of m (parents whose marked connector leads to m).
      5. End.
   6. End.
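The two AO* slides can be sketched as one Python function. This is a minimal sketch, not the lecture's implementation: it assumes an acyclic graph given as a dictionary, a monotone heuristic, and illustrative names (`ao_star`, `graph`, `h`, `term`, `pick_leaf`) that do not come from the slides.

```python
import math


def ao_star(s, graph, h, term):
    """Sketch of AO* for an acyclic AND/OR graph.

    graph[n] is a list of connectors (cost, [successors]); nodes missing
    from graph have no successors. Returns (q, mark, solved).
    """
    q = {s: h[s]}                       # step 1: G = <s>, q(s) = h(s)
    solved = {s} if s in term else set()
    conns, mark, parents = {}, {}, {}   # explicit graph, marked connectors, parents

    def pick_leaf():
        # Steps 2.1-2.2: follow marked connectors from s to an unexpanded leaf.
        stack = [s]
        while stack:
            n = stack.pop()
            if n in solved:
                continue
            if n not in conns:
                return n
            stack.extend(conns[n][mark[n]][1])

    def descendants(m):
        found, stack = set(), [m]
        while stack:
            for _, kids in conns.get(stack.pop(), []):
                for k in kids:
                    if k not in found:
                        found.add(k)
                        stack.append(k)
        return found

    while s not in solved and q[s] < math.inf:
        n = pick_leaf()
        conns[n] = graph.get(n, [])     # step 2.3: expand n
        for _, kids in conns[n]:
            for k in kids:
                parents.setdefault(k, set()).add(n)
                if k not in q:
                    q[k] = h[k]
                    if k in term:
                        solved.add(k)
        S = {n}                         # step 4
        while S:                        # step 5
            m = next(x for x in S if not (descendants(x) & S))
            S.remove(m)
            old_q = q[m]
            if conns[m]:
                costs = [c + sum(q[k] for k in kids) for c, kids in conns[m]]
                mark[m] = min(range(len(costs)), key=costs.__getitem__)
                q[m] = costs[mark[m]]
                if all(k in solved for k in conns[m][mark[m]][1]):
                    solved.add(m)
            else:
                q[m] = math.inf         # no successors: m cannot be solved
            if m in solved or q[m] != old_q:
                for p in parents.get(m, ()):
                    if m in conns[p][mark[p]][1]:   # only "preferred" parents
                        S.add(p)
    return q, mark, solved


# Hypothetical example: the OR branch through a looks cheap at first but costs 7;
# the AND connector to b and c costs 3 in total.
graph = {
    "s": [(1, ["a"]), (1, ["b", "c"])],
    "a": [(1, ["d"])],
    "b": [(1, ["t1"])],
    "c": [(1, ["t2"])],
    "d": [(5, ["t3"])],
}
h = {"s": 0, "a": 1, "b": 1, "c": 1, "d": 1, "t1": 0, "t2": 0, "t3": 0}
term = {"t1", "t2", "t3"}
q, mark, solved = ao_star("s", graph, h, term)
print(q["s"], mark["s"])
```

In this example the search first follows the connector to a, revises q(a) upward as d is expanded, and then switches the mark at s to the AND connector.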

Optimality of A/O*?
• If h(n) <= c + h(n_1) + … + h(n_k) for all nodes n and connectors, THEN the solution found is OPTIMAL.
• If h(term) = 0, this implies h(n) <= h*(n) (optimistic).
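One way to see why the monotone restriction plus h(term) = 0 gives an optimistic h is a short induction over an optimal solution graph. The sketch below assumes the cost of a solution graph is defined as on the cost slide, so that for the connector (cost c, successors n_1, …, n_k) used by an optimal solution from n we have h*(n) = c + h*(n_1) + … + h*(n_k). For a terminal t, h(t) = 0 = h*(t); for any other n:

```latex
\begin{align*}
h(n) &\le c + \sum_{i=1}^{k} h(n_i)   && \text{(monotone restriction)}\\
     &\le c + \sum_{i=1}^{k} h^*(n_i) && \text{(induction hypothesis on each } n_i\text{)}\\
     &=   h^*(n)                      && \text{(connector chosen by an optimal solution)}
\end{align*}
```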

Note
• Acyclicity means that S is eventually empty.
• Which non-terminal node in the partial solution graph should be expanded?
  – Best is the one with the HIGHEST h* (it must be expanded anyhow, and the search changes its mind sooner).

Game Playing Algorithm
• What are games?
• Notion of winning strategy
  – Von Neumann-Morgenstern theorem
• Use of heuristics to pick the best move.
• Look ahead: use backed-up values to choose the best move.
• Mini-max algorithm.

Maximilian vs. Minerva: Winning Strategy
[Figure, shown on two slides: game tree over the positions below, each labeled with the player to move.]
• (7, Min)
• (6, 1, Max) (5, 2, Max) (4, 3, Max)
• (5, 1, 1, Min) (4, 2, 1, Min) (3, 2, 2, Min) (3, 3, 1, Min)
• (4, 1, 1, 1, Max) (3, 2, 1, 1, Max) (2, 2, 2, 1, Max)
• (3, 1, 1, 1, 1, Min) (2, 2, 1, 1, 1, Min)
• (2, 1, 1, 1, 1, 1, Max)
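The slide does not state the rules, but the positions match Grundy's game: a move splits one pile into two unequal piles, and the player who cannot move loses. Under that assumption, the following sketch checks by exhaustive mini-max who has a winning strategy; `moves` and `max_wins` are illustrative names.

```python
from functools import lru_cache


def moves(piles):
    """All positions reachable by splitting one pile into two unequal piles."""
    result = set()
    for i, p in enumerate(piles):
        rest = piles[:i] + piles[i + 1:]
        for a in range(1, (p + 1) // 2):        # a < p - a, so the parts differ
            result.add(tuple(sorted(rest + (a, p - a), reverse=True)))
    return result


@lru_cache(maxsize=None)
def max_wins(piles, player):
    """True if Max can force a win from `piles` with `player` to move."""
    nxt = moves(piles)
    if not nxt:
        return player == "Min"                  # the player to move is stuck and loses
    other = "Min" if player == "Max" else "Max"
    results = [max_wins(p, other) for p in nxt]
    # Max needs one winning reply; Min must be beaten on every reply.
    return any(results) if player == "Max" else all(results)


print(sorted(moves((7,))))        # the three Max-to-move positions on the slide
print(max_wins((7,), "Min"))      # does Max have a winning strategy?
```

With Min moving first from a single pile of 7, the search reports that Max can always win, which is consistent with the slide's title.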

Mini-max Search
• See if the limit of search has been reached.
• Yes: compute the static evaluation of the current position and return the value.
• Otherwise:
  – If at a maximizing level, apply Mini-max to each child and return the maximum of the results.
  – If at a minimizing level, apply Mini-max to each child and return the minimum of the results.

Example cont.
• e(p) = number of directions open for Max - number of directions open for Min
• e(p) = inf if p is a win for Max
• e(p) = -inf if p is a win for Min
• For the position shown [figure: board with one X and one O]: e(p) = 6 - 4 = 2
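A minimal sketch of this evaluation function for tic-tac-toe, assuming Max plays X, Min plays O, and a "direction" is any of the eight rows, columns and diagonals. The board encoding (a 9-character string) and the example position (X in the center, O in the middle of the top row) are assumptions chosen to reproduce the 6 - 4 = 2 on the slide.

```python
import math

# The eight lines of a 3x3 board, as index triples into a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def e(board):
    """Static value of `board`, a string of 'X' (Max), 'O' (Min) and '.' (empty)."""
    for line in LINES:
        cells = {board[i] for i in line}
        if cells == {"X"}:
            return math.inf                     # win for Max
        if cells == {"O"}:
            return -math.inf                    # win for Min
    open_max = sum(all(board[i] != "O" for i in line) for line in LINES)
    open_min = sum(all(board[i] != "X" for i in line) for line in LINES)
    return open_max - open_min


print(e(".O..X...."))   # X in the center, O on the top edge
```

The O blocks two of the eight lines for Max (6 remain open) and the X blocks four for Min (4 remain open).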

Mini-max Algorithm
Mini-max (Position, Depth)
1. If DeepEnough (Position, Depth) then Return <Static (Position, Depth), ∅>
2. Successors := MoveGen (Position, Depth)

Mini-max Algorithm cont.
3. If Depth is odd (Max node) then
   1. Best := -inf
   2. Loop until Successors = ∅
      1. Remove m from Successors
      2. <Value, Path> := Mini-max (m, Depth+1)
      3. If Value > Best then:
         1. Best := Value
         2. BestPath := m^Path
   3. End Loop
   4. Return <Best, BestPath>

Mini-max Algorithm cont.
4. If Depth is even (Min node) then
   1. Best := inf
   2. Loop until Successors = ∅
      1. Remove m from Successors
      2. <Value, Path> := Mini-max (m, Depth+1)
      3. If Value < Best then:
         1. Best := Value
         2. BestPath := m^Path
   3. End Loop
   4. Return <Best, BestPath>
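The three Mini-max slides can be sketched as a single Python function that returns both the backed-up value and the best path. The helpers `deep_enough`, `static` and `move_gen` are passed in as parameters; the example tree and leaf values are hypothetical, and the guard for a position without successors is an addition not found on the slides.

```python
import math


def minimax(position, depth, deep_enough, static, move_gen):
    """Return (value, path); odd depths are Max nodes, even depths are Min nodes."""
    if deep_enough(position, depth):
        return static(position, depth), []
    successors = move_gen(position, depth)
    if not successors:                          # added guard: nothing to search
        return static(position, depth), []
    maximizing = depth % 2 == 1
    best = -math.inf if maximizing else math.inf
    best_path = []
    for m in successors:
        value, path = minimax(m, depth + 1, deep_enough, static, move_gen)
        if (value > best) if maximizing else (value < best):
            best, best_path = value, [m] + path
    return best, best_path


# Hypothetical two-ply tree: root A (Max) with Min children B and C.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
leaf_value = {"D": 3, "E": 5, "F": 2, "G": 9}

result = minimax(
    "A", 1,
    deep_enough=lambda pos, depth: pos not in tree,
    static=lambda pos, depth: leaf_value[pos],
    move_gen=lambda pos, depth: tree[pos],
)
print(result)
```

Here B backs up min(3, 5) = 3 and C backs up min(2, 9) = 2, so Max chooses B and the path continues to D.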

α-β Mini-max
• Keep track of the α and β values and send them on to the children:
  1) Search is discontinued below any Min node with β <= the α of one of its ancestors. Set the final value of the node to be this β value.
  2) Search is discontinued below any Max node with α >= the β of one of its ancestors. Set the final value of the node to be this α value.

α-β Mini-max Algorithm
1. Determine whether:
   1. the level is the top level;
   2. the limit of search has been reached;
   3. the level is a minimizing level;
   4. the level is a maximizing level.

α-β Mini-max Algorithm cont.
2. If the level is the top level:
   1. Let alpha be -inf and let beta be inf.
3. If the limit of search has been reached: compute the static value of the current position relative to the appropriate player and return the result.

α-β Mini-max Algorithm cont.
4. If the level is a minimizing level:
   1. Until all children are examined with Minimax or alpha is greater than or equal to beta:
      1. Use Minimax on the next child of the current position, handing this new application of Minimax the current alpha and beta.
      2. Set beta to the smaller of the given beta value and the smallest value so far reported by Minimax working on the children.
   2. Report beta.

α-β Mini-max Algorithm cont.
5. If the level is a maximizing level:
   1. Until all children are examined with Minimax or alpha is greater than or equal to beta:
      1. Use Minimax on the next child of the current position, handing this new application of Minimax the current alpha and beta.
      2. Set alpha to the larger of the given alpha value and the biggest value so far reported by Minimax working on the children.
   2. Report alpha.
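The α-β procedure above can be sketched compactly in Python. This is an illustration under simplifying assumptions: the game tree is a nested list whose leaves are already static values, so the depth limit is implicit, and the `visited` list exists only to show which leaves were actually evaluated. The function name and the example tree are hypothetical.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Return the backed-up value of `node`; leaves are numbers, inner nodes are lists."""
    if not isinstance(node, list):              # limit of search: static value
        if visited is not None:
            visited.append(node)
        return node
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            alpha = max(alpha, value)           # alpha never decreases at a Max node
        else:
            beta = min(beta, value)             # beta never increases at a Min node
        if alpha >= beta:                       # cut-off: remaining children are skipped
            break
    return alpha if maximizing else beta        # report alpha (Max) or beta (Min)


# Hypothetical tree: Max root over two Min nodes.
tree = [[2, 7], [1, 8]]
seen = []
print(alphabeta(tree, visited=seen), seen)
```

After the first Min node reports 2, the root has alpha = 2. The second Min node sees the leaf 1, so its beta drops to 1 <= alpha and the leaf 8 is never evaluated.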

[Figure: complete α-β example tree with alternating Max and Min levels, α and β bounds written on the nodes (for example α >= 3, β <= 1, β = -3), and cut-offs marked]

Other Adjustments
• Progressive deepening (analyze depth 1, depth 2, …).
• Heuristic pruning (order moves plausibly).
• Futility cut-off.
• Heuristic continuation (waiting for quiescence).
• Horizon effect.
• Secondary search.
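Of these adjustments, progressive deepening is the easiest to illustrate. The sketch below runs a depth-limited mini-max for depth 1, 2, … and records the best move found at each depth, so that a move is always available if the search has to stop early. The function names, the fixed `max_depth` (standing in for a time limit), and the example tree with its static scores are all assumptions.

```python
def depth_limited(node, depth, maximizing, children, static):
    """Plain mini-max that stops after `depth` further plies."""
    kids = children(node)
    if depth == 0 or not kids:
        return static(node)
    values = [depth_limited(k, depth - 1, not maximizing, children, static)
              for k in kids]
    return max(values) if maximizing else min(values)


def progressive_deepening(root, max_depth, children, static):
    """Return [(depth, best_move, value), ...], one entry per completed depth."""
    history = []
    for depth in range(1, max_depth + 1):
        scored = [(depth_limited(k, depth - 1, False, children, static), k)
                  for k in children(root)]
        value, move = max(scored, key=lambda t: t[0])
        history.append((depth, move, value))
    return history


# Hypothetical tree with static scores for inner nodes as well as leaves.
tree = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
score = {"a": 5, "b": 3, "a1": 1, "a2": 6, "b1": 4, "b2": 7}

history = progressive_deepening("r", 2, lambda n: tree.get(n, []), score.__getitem__)
print(history)
```

At depth 1 move a looks better (5 against 3), but at depth 2 Min's replies back up 1 for a and 4 for b, so the preferred move changes to b.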

Homework Number 1
• Implement BOTH the Mini-max and the Alpha-Beta algorithm for a beloved game.
• Suggestions: Checkers, Chinese Checkers, Othello (for losers), Chess, etc. Regular Othello is not allowed.
• Due date
• (Penalty for lateness).

Homework Number 1
• Implement BOTH the Mini-max AND the Alpha-Beta algorithm for a beloved game.
• Suggestions: Checkers (Damka), Othello (losers), Chess, etc. Regular Othello is not allowed.
• Due date: Dec 20 (before class)
• Penalty for lateness: -4 up to one week, then -4 for each class (before class).