Evolving Hyper-Heuristics using Genetic Programming
Achiya Elyasaf
Supervisor: Moshe Sipper
Overview
• Introduction: searching game state-graphs
  • Uninformed search
  • Heuristics
  • Informed search
• Evolving heuristics
• Previous work
  • Rush Hour
  • FreeCell
Representing Games as State-Graphs
Every puzzle/game can be represented as a state-graph:
• In puzzles, board games, etc., every piece move yields a different state
• In computer war games, etc., the positions of the player and the enemy, together with all the parameters (health, shield, …), define a state
Rush Hour as a State-Graph
Searching Game State-Graphs: Uninformed Search
• BFS – exponential in the search depth
• DFS – linear in the length of the current search path, BUT:
  • we might "never" track down the right path
  • games usually contain cycles
• Iterative deepening – a combination of BFS and DFS:
  • each iteration performs a DFS with a depth limit
  • the limit grows from one iteration to the next
  • worst case: traverse the entire graph
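The iterative-deepening scheme above can be sketched as follows; a minimal sketch, assuming hypothetical `successors` and `is_goal` helpers that define the game's state-graph:

```python
# Iterative deepening: repeated depth-limited DFS with a growing limit.
def depth_limited_dfs(state, successors, is_goal, limit, path=None):
    """DFS that gives up below `limit`; tracks `path` to avoid cycles."""
    path = path or [state]
    if is_goal(state):
        return path
    if limit == 0:
        return None
    for nxt in successors(state):
        if nxt in path:          # games usually contain cycles
            continue
        found = depth_limited_dfs(nxt, successors, is_goal,
                                  limit - 1, path + [nxt])
        if found:
            return found
    return None

def iterative_deepening(start, successors, is_goal, max_depth=50):
    """Each iteration runs DFS with a depth limit that grows by one."""
    for limit in range(max_depth + 1):
        solution = depth_limited_dfs(start, successors, is_goal, limit)
        if solution:
            return solution
    return None
```

On a toy chain graph (`successors(s) = [s + 1]`), `iterative_deepening(0, ...)` with goal 3 returns the path `[0, 1, 2, 3]`, illustrating the BFS-like optimality with DFS-like memory use.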
Searching Game State-Graphs: Uninformed Search (cont.)
Most game domains are PSPACE-complete! In the worst case we traverse the entire graph.
We need an informed search!
Searching Game State-Graphs: Heuristics
h: States → ℝ
• For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution
• If h is perfect, an informed search that expands the state with the best h-score first strolls straight to a solution
• For hard problems, finding a good h is hard
• A bad heuristic means the search might never track down the solution
We need a good heuristic function to guide the informed search
Searching Game State-Graphs: Informed Search
Best-first search: like DFS, but expands the node with the best heuristic value first
• not necessarily optimal
• might enter cycles (local extrema)
A*:
• maintains a closed list and an open list sorted by f = g + h
• the best open node is expanded next
• maintaining the open and closed lists is costly in both time and memory
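A minimal greedy best-first sketch of the idea above, treating a lower h-value as better since h estimates the remaining distance; `successors`, `is_goal`, and `h` are hypothetical helpers, and the closed set stands in for the closed list that guards against cycles:

```python
import heapq

def best_first(start, successors, is_goal, h):
    """Expand the open node with the best (lowest) h-value first."""
    open_heap = [(h(start), start, [start])]   # (priority, state, path)
    closed = set()                             # already-expanded states
    while open_heap:
        _, state, path = heapq.heappop(open_heap)
        if is_goal(state):
            return path
        if state in closed:                    # skip re-expansions (cycles)
            continue
        closed.add(state)
        for nxt in successors(state):
            if nxt not in closed:
                heapq.heappush(open_heap, (h(nxt), nxt, path + [nxt]))
    return None
```

With a perfect h (e.g., exact distance on a line of integer states), the search walks straight to the goal, exactly as the slide claims.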
Searching Game State-Graphs: Informed Search (cont.)
IDA*: iterative deepening with A*
• expanded nodes are pushed onto the DFS stack in descending order of heuristic value
• let g(s) be the minimal depth of state s: only nodes with f(s) = g(s) + h(s) ≤ depth-limit are visited
• near-optimal solutions (depends on the depth-limit)
• the heuristic needs to be admissible
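The IDA* scheme can be sketched as a depth-first search bounded by f = g + h, with the bound raised each round to the smallest f that exceeded it; again `successors`, `is_goal`, and `h` are hypothetical helpers, and h is assumed admissible:

```python
def ida_star(start, successors, is_goal, h):
    """Iterative deepening over the f = g + h bound."""
    bound = h(start)
    while True:
        result = _bounded_dfs([start], 0, bound, successors, is_goal, h)
        if isinstance(result, list):
            return result              # a solution path was found
        if result == float("inf"):
            return None                # graph exhausted, no solution
        bound = result                 # raise the bound to the next f

def _bounded_dfs(path, g, bound, successors, is_goal, h):
    state = path[-1]
    f = g + h(state)
    if f > bound:
        return f                       # report the exceeding f-value
    if is_goal(state):
        return path
    minimum = float("inf")
    for nxt in successors(state):
        if nxt in path:                # prune cycles along the current path
            continue
        t = _bounded_dfs(path + [nxt], g + 1, bound, successors, is_goal, h)
        if isinstance(t, list):
            return t
        minimum = min(minimum, t)
    return minimum
```

With an admissible h, the first solution found costs no more than the current bound, which is why the result is near-optimal (exactly optimal with unit edge costs, as assumed here).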
Evolving Heuristics
Given building blocks H1, …, Hn (not necessarily admissible or in the same range), how should we combine them into the fittest heuristic?
• Minimum? Maximum? A linear combination?
GA/GP may be used for:
• building new heuristics from the existing building blocks
• finding a weight for each heuristic (for a linear combination)
• finding conditions for applying each heuristic
  • H should fit the current stage of the search – e.g., "goal" heuristics when we assume we are close
Evolving Heuristics: GA
[Example individual – a weight vector: W1 = 0.3, W2 = 0.01, W3 = 0.2, …, Wn = 0.1]
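The GA representation above can be sketched as follows: an individual is a weight vector and the evolved heuristic is the weighted sum of the building blocks. The building blocks and the mutation scheme are illustrative assumptions, not the paper's exact operators:

```python
import random

def combined_heuristic(weights, building_blocks, state):
    """H(s) = w1*H1(s) + ... + wn*Hn(s) -- the evolved linear combination."""
    return sum(w * h(state) for w, h in zip(weights, building_blocks))

def random_individual(n):
    """A fresh individual: one weight per building block."""
    return [random.random() for _ in range(n)]

def mutate(weights, rate=0.1):
    """Perturb a few weights (one common GA mutation choice, assumed here)."""
    return [max(0.0, w + random.gauss(0, 0.05)) if random.random() < rate else w
            for w in weights]
```

For instance, with blocks `H1(s) = s` and `H2(s) = 1` and weights `[0.5, 2.0]`, the combined heuristic evaluates state 4 to 0.5·4 + 2.0·1 = 4.0.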
Evolving Heuristics: GP
[Example individual – a GP tree: an If node whose condition combines And, ≤, and ≥ over building blocks H1, H2, H5 and the constant 0.4; its True/False branches are arithmetic expressions over H1 with +, *, / and the constants 0.7 and 0.1]
Evolving Heuristics: Policies
An ordered set of condition → result rules:
• Condition 1 → Heuristic Weights 1
• Condition 2 → Heuristic Weights 2
• …
• Condition n → Heuristic Weights n
• Default → Default Heuristic Weights
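A policy individual of this shape can be sketched as an ordered list of (condition, weight-vector) rules plus a default: the first rule whose condition holds supplies the weights used to blend the building blocks. All names here are illustrative, not the paper's API:

```python
def apply_policy(rules, default_weights, building_blocks, state):
    """Pick weights by the first matching rule, then blend the blocks."""
    for condition, weights in rules:
        if condition(state):
            break                      # first matching rule wins
    else:
        weights = default_weights      # no condition held
    return sum(w * h(state) for w, h in zip(weights, building_blocks))
```

For example, with blocks `H1(s) = s` and `H2(s) = 2s`, rules `[(s > 10, [1, 0]), (s > 5, [0, 1])]`, and default `[0.5, 0.5]`, state 20 uses only H1, state 7 only H2, and state 3 the default blend, matching the stage-of-search idea from the previous slide.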
Evolving Heuristics: Fitness Function
Rush Hour
GP-Rush [Hauptman et al., 2009] – Bronze HUMIE award
Domain-Specific Heuristics
Hand-crafted heuristics/guides:
• Blockers estimation – a lower bound (admissible)
• Goal distance – Manhattan distance
• Hybrid blockers distance – combines the two above
• IsMoveToSecluded – did the car enter a secluded area?
• IsReleasingMove
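A hedged sketch of a blockers-style lower bound: the red car must make at least one more move, and every distinct car sitting between it and the exit must move at least once. The board encoding (a dict from grid cell to car id, red car exiting rightward along `exit_row`) is an assumption of this sketch, not the paper's actual representation:

```python
def blockers_lower_bound(board, red_right_col, exit_row, width=6):
    """Admissible lower bound: 1 (red car's move) + number of distinct
    blocking cars on the exit row, to the right of the red car."""
    blockers = set()
    for col in range(red_right_col + 1, width):
        car = board.get((exit_row, col))   # cell -> car id, or None if empty
        if car is not None:
            blockers.add(car)              # count each blocking car once
    return 1 + len(blockers)
```

For a board where car A occupies cells (2, 4) and (2, 5) on exit row 2 and the red car ends at column 1, the bound is 2: one move for A and one for the red car.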
Policy "Ingredients"
Functions & terminals:
• Terminals: IsMoveToSecluded, IsReleasingMove, g, PhaseByDistance, PhaseByBlockers, NumberOfSiblings, DifficultyLevel, BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, …, 0.9, 1
• Condition functions: If, AND, OR, ≤, ≥
• Result functions: +, *
Coevolving (Hard) 8x8 Boards
[Example coevolved 8x8 board with cars F, G, H, I, K, M, P, S and the red car]
Results
Average reduction of nodes required to solve test problems, relative to the number of nodes scanned by a blind search:

Problem set   Blind   H1    H2    H3    Hc    Policy
6x6           100%    28%    6%   -2%   30%    60%
8x8           100%    31%   25%   30%   50%    90%
Results (cont'd)
Time (in seconds) required to solve problems JAM01…JAM40:
FreeCell
• FreeCell remained relatively obscure until it was included in Windows 95
• All 32,000 deals of the Microsoft 32K are solvable, except for game #11982, which has been proven unsolvable
• "Evolving hyper-heuristic-based solvers for Rush Hour and FreeCell" [Hauptman et al., SOCS 2010]
• "GA-FreeCell: Evolving Solvers for the Game of FreeCell" [Elyasaf et al., GECCO 2011]
FreeCell (cont'd)
As opposed to Rush Hour, blind search failed miserably here. The best published solver to date solves 96% of the Microsoft 32K. Reasons:
• high branching factor
• hard to generate a good heuristic
Learning Methods: Random Deals
Which deals should we use for training? The first method tested – random deals:
• this is what we did for Rush Hour
• here it yielded poor results – FreeCell is a very hard domain
Learning Methods: Gradual Difficulty
The second method tested – gradual difficulty:
• sort the problems by difficulty
• each generation, test the solvers against 5 deals from the current difficulty level + 1 random deal
Learning Methods: Hillis-Style Coevolution
The third method tested – Hillis-style coevolution with a "hall of fame":
• the deal population comprises 40 deal individuals + 10 hall-of-fame deals
• each hyper-heuristic is tested against 4 deal individuals and 2 hall-of-fame deals
• the evolved hyper-heuristics failed to solve almost all of the Microsoft 32K! Why?
Learning Methods: Rosin-Style Coevolution
The fourth method tested – Rosin-style coevolution:
• each deal individual consists of 6 deals
• mutation and crossover operate on the deal lists
[Illustration: two parent deal individuals (p1 containing deals 11897, …; p2 containing deals 28371, …) exchanging deal numbers under crossover]
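The deal-individual operators can be sketched as follows: one-point crossover swaps the tails of two parents' deal lists, and mutation replaces one deal with a fresh random deal number from the Microsoft 32K range. The exact operators in the paper may differ; this is an illustrative sketch:

```python
import random

def crossover(parent1, parent2, rng=random):
    """One-point crossover over two fixed-size deal lists."""
    cut = rng.randrange(1, len(parent1))       # cut strictly inside the list
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

def mutate(individual, max_deal=32000, rng=random):
    """Replace one deal with a random deal number in 1..max_deal."""
    i = rng.randrange(len(individual))
    mutated = list(individual)
    mutated[i] = rng.randrange(1, max_deal + 1)
    return mutated
```

Crossover preserves the individual's size and keeps every deal number from the two parents in one of the two children, so the population's pool of hard deals is not lost.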
Results

Learning method            Run      Node red.  Time red.  Length red.  Solved
–                          HSD        100%        –           –          96%
Gradual difficulty         GA-1        23%        31%         1%         71%
Gradual difficulty         GA-2        27%        30%        -3%         70%
Gradual difficulty         GP           –          –          –           –
Gradual difficulty         Policy      28%        36%         6%         36%
Rosin-style coevolution    GA          87%        93%        41%         98%
Rosin-style coevolution    Policy      89%        90%        40%         99%
Thank you for listening – any questions?