Computer Chess
Pasi Fränti
5.5.2016

History
• Mechanical Turk [von Kempelen, 1770]
• Computer chess problem and minmax search [Shannon, 1950]
• Turing's chess test
• Deep Blue beats Garry Kasparov [1997]
• Best computers today (~3200); Magnus Carlsen (2853)
• Machine learning for optimizing weights of material and positional factors

Pieces and board

Move notations

Minmax search

Chess game as search tree

Minmax search
Fixed depth
[Figure: search tree with leaf values 3, 17, 2, 12, 15, 25, 0, 2, 5, 3, 2, 14]
https://www.youtube.com/watch?v=fJ4uQpkn9V0
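A minimal sketch of minmax on a small tree may help here. The tree below is a hypothetical example (nested lists, integers as leaf evaluations), not the tree drawn on the slide, and the names `minimax` and `TREE` are illustrative only.

```python
from typing import List, Union

# A node is either a leaf evaluation (int) or a list of child nodes.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool) -> int:
    """Return the backed-up value of `node` with alternating Max/Min levels."""
    if isinstance(node, int):
        # Leaf: the fixed search depth is reached, use the static evaluation.
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Hypothetical two-ply tree: Max at the root, Min below it.
TREE: Tree = [[3, 17], [2, 12], [15, 0]]

if __name__ == "__main__":
    # Min picks 3, 2 and 0 in the three branches; Max then picks 3.
    print(minimax(TREE, maximizing=True))
```

Every leaf is visited here, which is what alpha-beta pruning later avoids.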

Material: typical values
Queen 9, Rook 5, Bishop 3, Knight 3, Pawn 1

Alpha-beta pruning
• Upper bound and lower bound
• Terminate a branch when the upper bound < the lower bound
[Figure: Max/Min search tree; first cut-off, with bounds =3 and =2 marked]

Alpha-beta pruning
[Figure: Max/Min search tree; search continues in the next branch, with bounds =3 and =15 marked]

Alpha-beta pruning
[Figure: Max/Min search tree; further cut-off, with bounds =3 and =2 marked]

Alpha-beta pruning
Final result: 3
[Figure: Max/Min search tree with the final backed-up values]
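The pruning rule can be sketched as follows. This is an illustrative version on a hypothetical nested-list tree (not the slide's tree); it cuts off when `beta <= alpha`, a common variant of the termination test, and records the visited leaves so the effect of pruning is visible.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, alpha: float, beta: float,
              maximizing: bool, visited: List[int]) -> float:
    """Minmax value of `node`, skipping branches that cannot change the result.

    alpha = best value Max can already guarantee (lower bound)
    beta  = best value Min can already guarantee (upper bound)
    """
    if isinstance(node, int):
        visited.append(node)  # record which leaves were actually evaluated
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if beta <= alpha:  # Min above will never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if beta <= alpha:  # Max above already has something better
            break
    return best


# Hypothetical two-ply tree: Max at the root, Min below it.
TREE: Tree = [[3, 17], [2, 12], [15, 0]]

if __name__ == "__main__":
    seen: List[int] = []
    print(alphabeta(TREE, -math.inf, math.inf, True, seen))  # same value as minmax: 3
    print(seen)  # leaf 12 is never evaluated
```

In the second branch the first leaf (2) is already worse for Max than the 3 found earlier, so the remaining leaf is skipped.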

Order of moves
• Very important!
• More accurate value ⇒ more can be pruned
• If the branch of the best move (oracle!) is always searched first, the branching factor drops from $n$ to $\sqrt{n}$
• Assume a fixed amount of time allowed:
  d = search depth of minmax
  e = search depth of alpha-beta
  n = number of moves
  $n^d = (\sqrt{n})^e \Rightarrow e = 2d$
• Twice as deep!
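A small numeric check of the "twice as deep" claim, under the usual best-case estimate that perfectly ordered alpha-beta has an effective branching factor of about the square root of the number of moves. The function `reachable_depth` and the values n = 36, d = 4 are illustrative assumptions.

```python
import math


def reachable_depth(branching: int, node_budget: int) -> int:
    """Largest depth whose full tree (branching ** depth leaves) fits the budget.

    Assumes branching >= 2; a uniform tree is a simplification.
    """
    depth = 0
    while branching ** (depth + 1) <= node_budget:
        depth += 1
    return depth


if __name__ == "__main__":
    n, d = 36, 4                 # hypothetical: 36 moves per position, minmax depth 4
    budget = n ** d              # leaves visited by plain minmax
    e = reachable_depth(math.isqrt(n), budget)  # alpha-beta with perfect ordering
    print(d, e)                  # 4 8
```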

Heuristics for ordering
• If the same position is reached again, search the previous best move first.
• If a move was found good in a sibling node, assume it is also good in this node.
• Capturing a higher-value piece = GOOD
• Capturing a defended lower-value piece = BAD
• Queen promotions
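These heuristics can be turned into a sort key, as in the rough sketch below. The `Move` record, its fields, and the score weights (1000, 100, 10) are all hypothetical choices made for illustration; real engines derive this information from the board.

```python
from dataclasses import dataclass
from typing import List, Optional

PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


@dataclass
class Move:
    name: str
    attacker: str                    # piece letter of the moving piece
    victim: Optional[str] = None     # piece letter of the captured piece, if any
    victim_defended: bool = False
    queen_promotion: bool = False
    previous_best: bool = False      # best move found earlier in this position


def ordering_score(move: Move) -> int:
    """Higher score = searched earlier. Weights are arbitrary illustration values."""
    score = 0
    if move.previous_best:
        score += 1000
    if move.queen_promotion:
        score += 100
    if move.victim is not None:
        gain = PIECE_VALUE[move.victim]
        if move.victim_defended:
            # Assume the capturing piece is lost to the recapture.
            gain -= PIECE_VALUE[move.attacker]
        score += 10 * gain
    return score


def order_moves(moves: List[Move]) -> List[Move]:
    return sorted(moves, key=ordering_score, reverse=True)


if __name__ == "__main__":
    moves = [
        Move("Nf3", "N"),
        Move("Qxe5", "Q", victim="P", victim_defended=True),
        Move("Bxd8", "B", victim="Q", victim_defended=True),
        Move("e8=Q", "P", queen_promotion=True),
        Move("Nc3", "N", previous_best=True),
    ]
    print([m.name for m in order_moves(moves)])
```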

Examples of captures
1. Nxe5 Nxe5: balance -2
1. Bxc6 dxc6 2. Nxe5: balance +1
1. Nxd5 (+3) cxd5 (+3) 2. Bxd5 (+1) Bxd5 (+3): balance -2
3. Qxd5 (+3) Qxd5 (+9) 4. Rxd5 (+9): balance +1
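The balances in these capture sequences can be checked with a few lines of arithmetic. The sketch assumes the bracketed numbers are the values of the captured pieces and that White makes the first capture; `exchange_balance` is an illustrative helper name.

```python
from typing import List


def exchange_balance(captured_values: List[int]) -> List[int]:
    """Running material balance from White's point of view.

    captured_values[i] is the value of the piece taken by the i-th capture;
    White makes captures 0, 2, 4, ... and Black makes captures 1, 3, 5, ...
    """
    balance = 0
    history = []
    for i, value in enumerate(captured_values):
        balance += value if i % 2 == 0 else -value
        history.append(balance)
    return history


if __name__ == "__main__":
    print(exchange_balance([1, 3]))                 # Nxe5 Nxe5
    print(exchange_balance([3, 3, 1]))              # Bxc6 dxc6 Nxe5
    print(exchange_balance([3, 3, 1, 3, 3, 9, 9]))  # the exchange on d5
```

For the exchange on d5 the balance is -2 after four captures and +1 after all seven, so whether the sequence looks good depends on where the search stops.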

Horizon effect
Black to move
Two-ply search: 1. … Bxh2 2. Kxh2 (-2)
Three-ply search: 1. … Nbc5 2. cxd7 Nxc7 (-2)

Horizon effect
Black to move
Nine-ply search: 1. … b2 2. Bxb2 c3 3. Bxc3 d4 4. Bxd4 e5 5. Bxe5 f6 (-4)

Horizon effect
Pushing the disaster beyond the horizon
What can black do?

Quiescent position
Search deeper until a quiescent position is reached.
Non-quiescent situations:
– Winning captures (or any / major capture)
– Checks + evasions of checks
– Pawn promotions
– Pawn move to 7th rank
Singular extensions were also used earlier.
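A toy sketch of how a quiescence search extends a fixed-depth search is given below. It uses a hand-built game tree in negamax form; the `Node` class, the `is_noisy` flag on each move, and the example position (a queen capturing a defended pawn) are hypothetical, and the sketch ignores the special handling that checks need.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    static_eval: int  # material evaluation from the side to move's point of view
    # Each move is (is_noisy, resulting position); noisy = capture, check, promotion...
    moves: List[Tuple[bool, "Node"]] = field(default_factory=list)


def quiescence(node: Node, alpha: float, beta: float) -> float:
    """Follow only noisy moves until the position is quiet."""
    stand_pat = node.static_eval  # the side to move may decline all captures
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for is_noisy, child in node.moves:
        if not is_noisy:
            continue
        score = -quiescence(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


def search(node: Node, depth: int, alpha: float, beta: float,
           use_quiescence: bool = True) -> float:
    """Fixed-depth negamax alpha-beta; at the horizon optionally go on to quiescence."""
    if depth == 0:
        return quiescence(node, alpha, beta) if use_quiescence else node.static_eval
    if not node.moves:
        return node.static_eval
    for _, child in node.moves:
        score = -search(child, depth - 1, -beta, -alpha, use_quiescence)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


# Hypothetical position: White can take a defended pawn with the queen.
after_recapture = Node(static_eval=-8)   # White to move: queen lost for a pawn
after_capture = Node(static_eval=-1, moves=[(True, after_recapture)])  # Black to move
after_quiet_move = Node(static_eval=0)   # Black to move, nothing happened
root = Node(static_eval=0, moves=[(True, after_capture), (False, after_quiet_move)])

if __name__ == "__main__":
    print(search(root, 1, -math.inf, math.inf, use_quiescence=False))  # 1: pawn looks free
    print(search(root, 1, -math.inf, math.inf, use_quiescence=True))   # 0: recapture is seen
```

Without the extension the one-ply search believes the pawn is won; with it, the recapture is found and the quiet move is preferred.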

Check is non-quiescent
• Material is even
• Check is forcing
• Black will lose either the queen or another piece

Pawn promotion
White to play and win
1. e7 Qa3+ 2. Rb4 Qa7+ 3. Kxc4! Qxe7 4. Nxg6+ fxg6 5. Bf6+ Qxf6 6. Kd5+ Kg5 7. h4+ Kf5 8. g4+ hxg4 9. Rf4+ Bxf4 10. e4#

Pawn promotion
White wins after 10 moves
All moves since e7 were non-quiescent:
• Check
• Forced move
• Capture
• Threat to promote

Limits of non-quiescent
Looks quiet…

Limits of non-quiescent
1. Nf6+ gxf6 2. Qg4+ Kh8 3. Kg2
Quiet move but…

Limits of non-quiescent
Unstoppable: 4. Rh1 X

Evaluation

Evaluation criteria
• Material
• Piece-square tables
• Pawn structure
• Piece-specific evaluation
• Mobility
• King safety
• Threat
• Space
• Draw-ish-ness

Material
Basic values: Queen 9, Rook 5, Bishop 3, Knight 3, Pawn 1
Double bishops: 6+
Double knights: ??
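A material count with a bishop-pair bonus could look like the sketch below. The piece-list strings and the bonus of 0.5 are assumptions for illustration; the slide only says that two bishops are worth more than 6 and leaves the knight pair open, so no knight-pair term is included.

```python
PIECE_VALUE = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}
BISHOP_PAIR_BONUS = 0.5  # hypothetical size of the "6+" bonus


def material(pieces: str) -> float:
    """Material of one side, given its pieces as letters, e.g. 'QRRBBNNPPPPPPPP'.

    Letters without a value (such as 'K') count as zero.
    """
    score = float(sum(PIECE_VALUE.get(p, 0) for p in pieces))
    if pieces.count("B") >= 2:
        score += BISHOP_PAIR_BONUS
    return score


def material_balance(white: str, black: str) -> float:
    """Positive values favour White."""
    return material(white) - material(black)


if __name__ == "__main__":
    print(material("BB"))                   # 6.5
    print(material_balance("QRP", "RBN"))   # 15 - 11 = 4.0
```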

Pawn values opening Progress Center important

Pawn values end game

Piece-square table: knight
-50, -40, -30, -40, -50, -40, -20, 0, 0, -20, -40, -30, 0, 15, 10, 0, -30, 5, 15, 20, 15, 5, -30, 0, 15, 20, 15, 0, -30, 5, 10, 15, 10, 5, -30, -40, -20, 5, 0, -20, -40, -50, -40, -20, -30, -20, -40, -50

Piece-square table: king during mid-game
-30, -40, -40, -50, -50, -40, -40, -30, -20, -30, -40, -30, -20, -10, -20, -20, -10, 20, 0, 0, 20, 20, 30, 10, 30, 20

Piece-square table: end game
-50, -40, -30, -20, -30, -40, -50, -30, -20, -10, -20, -30, -10, 20, 30, 20, -10, -30, -10, 30, 40, 40, 30, -10, -30, -10, 20, 30, 20, -10, -30, 0, 0, -30, -50, -30, -30, -50

Pawn structure
[Figure labels: Backward, Opposed, Lever, Passed, Unsupported, Passed, Isolated]

Can white promote?

Mobility
[Figure: board annotated with mobility counts 4, 2, 0, 1, 2, 3]

King safety

Transposition table
Different move orders reach the same position:
1. e4 e5 2. Nf3 Nc6
1. Nf3 Nc6 2. e4 e5
1. e4 Nc6 2. Nf3 e5
The stored position must also include:
• Castling rights
• En passant moves
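One common way to key a transposition table is Zobrist hashing, sketched below; the slide does not say which scheme is meant, so this is an assumption. The sketch hashes piece placement only: side to move, castling rights and en passant squares would each need their own keys in a full implementation, and the move tuples are written by hand rather than generated from a board.

```python
import random
from typing import Dict, List, Tuple

rng = random.Random(2016)  # fixed seed so the keys are reproducible
PIECES = [colour + piece for colour in "wb" for piece in "PNBRQK"]
SQUARES = [file + rank for file in "abcdefgh" for rank in "12345678"]
# One random 64-bit key per (piece, square) combination.
ZOBRIST: Dict[Tuple[str, str], int] = {
    (p, s): rng.getrandbits(64) for p in PIECES for s in SQUARES
}

SimpleMove = Tuple[str, str, str]  # (piece, from-square, to-square), no captures


def apply_move(position_hash: int, move: SimpleMove) -> int:
    """XOR the piece out of its old square and into its new square."""
    piece, src, dst = move
    return position_hash ^ ZOBRIST[(piece, src)] ^ ZOBRIST[(piece, dst)]


def play(start_hash: int, moves: List[SimpleMove]) -> int:
    h = start_hash
    for move in moves:
        h = apply_move(h, move)
    return h


E4, E5 = ("wP", "e2", "e4"), ("bP", "e7", "e5")
NF3, NC6 = ("wN", "g1", "f3"), ("bN", "b8", "c6")
LINE_A = [E4, E5, NF3, NC6]   # 1. e4 e5 2. Nf3 Nc6
LINE_B = [NF3, NC6, E4, E5]   # 1. Nf3 Nc6 2. e4 e5
LINE_C = [E4, NC6, NF3, E5]   # 1. e4 Nc6 2. Nf3 e5

if __name__ == "__main__":
    table: Dict[int, Tuple[int, int]] = {}  # hash -> (search depth, score)
    key = play(0, LINE_A)
    table[key] = (6, 0)                     # hypothetical stored result
    # The other move orders find the stored entry instead of searching again.
    print(play(0, LINE_B) in table, play(0, LINE_C) in table)
```

Because XOR is commutative, the three move orders give the same key, which is what lets the table recognise the transposition.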

Collected examples

Opening + end game databases

Opening book
• Known games from grandmasters.
• Include moves played often.
• Systematic self-play by computer up to a pre-defined depth.
• Exclude moves leading to a bad position.

Deep Blue – Kasparov 1997
Decisive game 6
Kasparov played h6… Ne4 and Black has a better position.
The knight sacrifice wins, but only if White knows how to play the position.
Typical computer chess programs don't know how to do it.

Deep Blue game 6
Continuation of the game: 8. Nxe6 Qe7 9. 0-0 fxe6 10. Bg6+
Nine moves later, after 19. c4, Black resigns (1-0).

End game database
• Known games from grandmasters.
• Include moves played often.
• Systematic self-play by computer up to a pre-defined depth.
• Exclude moves leading to a bad position.

Machine learning

Optimizing by Genetic Algorithm
Student project by P. Aksenov (Joensuu, Finland)
• Population of computer players.
• Random initial parameters.
• Tournament playing to find fitness.
• Crossover for new parameter combinations.
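The four steps can be sketched as a small genetic algorithm. This is not the method of the cited thesis: the "game" is a toy stand-in in which the parameter vector closer to a hidden `REFERENCE` wins, the population size, value range and selection rule (keep the top half) are arbitrary assumptions, and no mutation is used because the slide lists only crossover.

```python
import random
from typing import List

# Hypothetical "true" piece values, used only by the toy game below.
REFERENCE = [9.0, 5.0, 3.0, 3.0, 1.0]
Player = List[float]  # one evaluation weight per piece type


def error(player: Player) -> float:
    return sum((x - r) ** 2 for x, r in zip(player, REFERENCE))


def play_game(a: Player, b: Player) -> int:
    """Toy stand-in for a real game: +1 if a wins, -1 if b wins, 0 for a draw."""
    ea, eb = error(a), error(b)
    return 1 if ea < eb else (-1 if eb < ea else 0)


def tournament_fitness(population: List[Player]) -> List[int]:
    """Round-robin tournament; fitness = number of wins."""
    wins = [0] * len(population)
    for i in range(len(population)):
        for j in range(i + 1, len(population)):
            result = play_game(population[i], population[j])
            if result > 0:
                wins[i] += 1
            elif result < 0:
                wins[j] += 1
    return wins


def crossover(a: Player, b: Player, rng: random.Random) -> Player:
    """Uniform crossover: each parameter comes from one of the two parents."""
    return [rng.choice(pair) for pair in zip(a, b)]


def initial_population(size: int, rng: random.Random) -> List[Player]:
    return [[rng.uniform(0.0, 10.0) for _ in REFERENCE] for _ in range(size)]


def evolve(generations: int = 30, size: int = 20, seed: int = 1) -> Player:
    rng = random.Random(seed)
    population = initial_population(size, rng)
    for _ in range(generations):
        fitness = tournament_fitness(population)
        ranked = [p for _, p in sorted(zip(fitness, population),
                                       key=lambda t: t[0], reverse=True)]
        parents = ranked[: size // 2]  # the best half survives unchanged
        children = [crossover(rng.choice(parents), rng.choice(parents), rng)
                    for _ in range(size - len(parents))]
        population = parents + children
    fitness = tournament_fitness(population)
    return max(zip(fitness, population), key=lambda t: t[0])[1]


if __name__ == "__main__":
    print([round(w, 2) for w in evolve()])
```

In a real setting `play_game` would be replaced by actual games between engines using the two parameter sets, which is by far the most expensive step.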

Initial value range

Result of optimization bishop & knight

Giraffe
Student project by M. Lai (London, UK)
• Temporal-difference learning: play fixed depth (12 plies).
• Fitness based on stability.
• Non-linear combination of 363 features.
• Optimizes parameters and selection of the next branch in the search.

Giraffe playing strength

Representation

Deep learning (neural network)
[-1, +1]
Initialization by bootstrapping with only material score

Probability-limited search
• Traditional: fixed depth, with the exception of quiescence search
• Alternative: search deeper when there are fewer alternatives.
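The idea of searching deeper when there are fewer alternatives can be illustrated with a simple calculation. The sketch assumes every legal move is equally likely (a uniform split, which is a simplification and not necessarily how Giraffe assigns probabilities) and stops a line once its probability falls below a threshold; the function name and the numbers are illustrative.

```python
from fractions import Fraction
from typing import List


def reached_depth(moves_per_ply: List[int], threshold: Fraction) -> int:
    """Depth reached along one line before its probability drops below threshold.

    moves_per_ply[i] is the number of legal moves at ply i of that line; each
    move is assumed equally likely, so the line's probability is divided by it.
    """
    probability = Fraction(1)
    depth = 0
    for n in moves_per_ply:
        probability /= n
        if probability < threshold:
            break
        depth += 1
    return depth


if __name__ == "__main__":
    limit = Fraction(1, 1000)
    print(reached_depth([30] * 10, limit))              # wide position: 2 plies
    print(reached_depth([30, 1, 30, 1, 1, 30], limit))  # forced replies: 5 plies
```

Forced replies (a single legal move) cost nothing in probability, so forcing lines are followed further than a fixed depth limit would allow.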

Current computer strengths 2015
• Grandmaster: >2500
• International master: >2400
• FIDE master: >2300
• Candidate master: >2200

References
1. Monty Newborn (1996): Kasparov versus Deep Blue: Computer Chess Comes of Age, Springer.
2. Feng-hsiung Hsu (2002): Behind Deep Blue: Building the Computer that Defeated the World Chess Champion, Princeton University Press.
3. Petr Aksenov (2004): Genetic algorithms for optimising chess position scoring, MSc thesis, Computer Science, Univ. of Joensuu, Finland.
4. Matthew Lai (2015): Giraffe: Using Deep Reinforcement Learning to Play Chess, Manuscript (submitted). arXiv:1509.01549v2

Giraffe: working space