CAP 6938: Neuroevolution and Developmental Encoding
Competitive Coevolution
Dr. Kenneth Stanley
October 16, 2006

Example: I Want to Evolve a Go Player
• Go is one of the hardest games for computers
• I am terrible at it
• There are no good Go programs either (hypothetically)
• I have no idea how to measure the fitness of a Go player
• How can I make evolution solve this problem?

Generally: Fitness May Be Difficult to Formalize
• Optimal policy in competitive domains is unknown
• Only winner and loser can be easily determined
• What can be done?

Competitive Coevolution
• Coevolution: No absolute fitness function
• Fitness depends on direct comparisons with other evolving agents
• Hope to discover solutions beyond what a hand-written fitness function could describe
• Competition should lead to an escalating arms race
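
The idea of fitness as "direct comparisons with other evolving agents" can be sketched in a few lines. This is only an illustration: the toy game (`larger_wins`, where the larger number wins) and the function name `relative_fitness` are assumptions made for the example, not part of the lecture.

```python
from typing import Callable, List, Sequence


def relative_fitness(
    candidates: Sequence[float],
    opponents: Sequence[float],
    beats: Callable[[float, float], bool],
) -> List[int]:
    """Score each candidate by how many of the given opponents it defeats."""
    return [sum(1 for opp in opponents if beats(cand, opp)) for cand in candidates]


def larger_wins(a: float, b: float) -> bool:
    """Hypothetical toy game: the larger number wins."""
    return a > b


population_a = [0.1, 0.5, 0.9]
population_b = [0.2, 0.6]
scores = relative_fitness(population_a, population_b, larger_wins)

# The same candidate gets a different score against stronger opponents,
# which is what "no absolute fitness function" means in practice.
score_vs_stronger = relative_fitness([0.5], [0.7, 0.8], larger_wins)
```

Here the only domain knowledge required is who won each game; no formula for how good a player is ever appears.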

The Arms Race

The Arms Race is an AI Dream
• Computer plays itself and becomes champion
• No need for human knowledge whatsoever
• In practice, progress eventually stagnates (Darwen 1996; Floreano and Nolfi 1997; Rosin and Belew 1997)

So Who Plays Against Whom?
• If evaluation is expensive, not everyone can play everyone
• Even if they could, a lot of candidates might be very poor
• If not everyone, who is chosen as competition for each candidate?
• Need some kind of intelligent sampling

Challenges with Choosing the Right Opponents
• Red Queen Effect: Running in Circles
  – B dominates A
  – C dominates B
  – A dominates C
• Overspecialization
  – Optimizing a single skill to the neglect of all others
  – Likely to happen without diverse opponents in sample
• Several other failure dynamics
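
The "running in circles" dynamic can be reproduced with a deliberately tiny example. The sketch below assumes a rock-paper-scissors game as a stand-in for any domain with an intransitive cycle; the names `BEATS`, `COUNTER`, and `arms_race` are invented for illustration.

```python
# winner -> the strategy it defeats
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}
# strategy -> the strategy that defeats it
COUNTER = {loser: winner for winner, loser in BEATS.items()}


def arms_race(start: str, steps: int) -> list:
    """Repeatedly replace the current strategy with whatever beats it."""
    history = [start]
    for _ in range(steps):
        history.append(COUNTER[history[-1]])
    return history


history = arms_race("rock", 6)
# Every step "wins" against the previous one, yet after three steps
# the population is back where it started.
```

Each generation looks like progress when judged only against its immediate predecessor, which is why a sample of opponents that includes older strategies matters.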

Heuristic in NEAT: Utilize Species Champions
• Each individual plays all the species champions and keeps a score
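
A minimal sketch of this heuristic follows. It assumes that the champions are taken from the opposing population's species and that a champion is the member with the best score from the previous generation; both are assumptions for illustration, as are the helper names and the toy game in which the larger number wins.

```python
from typing import Callable, Dict, List


def species_champions(
    species: Dict[int, List[float]],
    previous_score: Callable[[float], float],
) -> List[float]:
    """Pick the best-scoring member of each non-empty species."""
    return [max(members, key=previous_score) for members in species.values() if members]


def score_against_champions(
    candidates: List[float],
    champions: List[float],
    beats: Callable[[float, float], bool],
) -> Dict[float, int]:
    """Each candidate plays every champion and keeps a count of its wins."""
    return {c: sum(beats(c, champ) for champ in champions) for c in candidates}


# Hypothetical data: individuals are plain numbers and the larger number wins.
opponent_species = {0: [0.1, 0.3], 1: [0.6, 0.5]}
champs = species_champions(opponent_species, previous_score=lambda x: x)
scores = score_against_champions([0.2, 0.4, 0.7], champs, lambda a, b: a > b)
```

Because each species tends to hold a different kind of strategy, one champion per species gives a small but varied opponent sample, which addresses both the cost of evaluation and the risk of overspecialization.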

Hall of Fame (HOF) (Rosin and Belew 1997)
• Keep around a list of past champions
• Add them to the mix of opponents
• If HOF gets too big, sample from it
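
The three bullets above map onto a small data structure. The class below is a sketch, not the implementation from Rosin and Belew: the class name, the `sample_size` parameter, and the use of uniform random sampling are assumptions.

```python
import random
from typing import List, Optional


class HallOfFame:
    """Stores past champions and supplies them as extra opponents."""

    def __init__(self, sample_size: int, seed: Optional[int] = None) -> None:
        self.members: List[object] = []
        self.sample_size = sample_size      # assumed cap on HOF opponents per evaluation
        self._rng = random.Random(seed)

    def add(self, champion: object) -> None:
        """Record a generation champion."""
        self.members.append(champion)

    def opponents(self, current: List[object]) -> List[object]:
        """Current opponents plus all of the HOF, or a sample if it is too big."""
        if len(self.members) <= self.sample_size:
            past = list(self.members)
        else:
            past = self._rng.sample(self.members, self.sample_size)
        return list(current) + past


hof = HallOfFame(sample_size=3, seed=0)
for generation_champion in range(10):
    hof.add(generation_champion)
mix = hof.opponents(["current-champ"])   # 1 current opponent + 3 sampled past champions
```

Playing against past champions penalizes a candidate that has forgotten how to beat older strategies, which is one way to resist the cycling described earlier.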

More Recently: Pareto Coevolution
• Separate learners and tests
• The tests are rewarded for distinguishing learners from each other
• The learners are ranked in Pareto layers
  – Each test is an objective
  – If X wins against a superset of tests that Y wins against, then X Pareto-dominates Y
  – The first layer is a nondominated front
  – Think of tests as objectives in a multiobjective optimization problem
• Potentially costly: All learners play all tests
De Jong, E. D. and J. B. Pollack (2004). Ideal Evaluation from Coevolution. Evolutionary Computation, Vol. 12, Issue 2, pp. 159-192. MIT Press.
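
The layering rule can be sketched directly from the superset definition above. This sketch makes two assumptions: the win sets are already known (that is, every learner has played every test), and "superset" is read as a strict superset so that two learners with identical win sets do not dominate each other. The function name and data are invented.

```python
from typing import Dict, List, Set


def pareto_layers(wins: Dict[str, Set[str]]) -> List[List[str]]:
    """Peel off successive nondominated fronts.

    wins maps each learner to the set of tests it defeats. Learner y
    dominates learner x when y's set is a strict superset of x's set.
    """
    remaining = dict(wins)
    layers: List[List[str]] = []
    while remaining:
        front = sorted(
            x for x in remaining
            if not any(remaining[y] > remaining[x] for y in remaining)
        )
        layers.append(front)
        for x in front:
            del remaining[x]
    return layers


# Hypothetical results table.
wins = {
    "A": {"t1", "t2", "t3"},
    "B": {"t1", "t2"},
    "C": {"t3", "t4"},
    "D": {"t1"},
}
layers = pareto_layers(wins)
```

Note that C stays in the first layer even though it wins fewer tests than A: it beats `t4`, which A does not, so neither dominates the other. A simple win count would hide that distinction.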

Choosing Opponents Isn’t Everything
• How can new solutions be continually created that maintain existing capabilities?
• Mutations that lead to innovations could simultaneously lead to losses
• What kind of process ensures elaboration over alteration?

Alteration vs. Elaboration

Answer: Complexification
• Fixed-length genomes limit progress
• Dominant strategies that utilize the entire genome must alter and thereby sacrifice prior functionality
• If new genes can be added, dominant strategies can be elaborated, maintaining existing capabilities
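
The contrast between altering and elaborating can be shown on a toy genome. This is a deliberate simplification: the genome is a flat list of numbers rather than a NEAT network, and `alter` and `elaborate` are hypothetical names, so the sketch shows only the structural point that adding a gene leaves existing genes untouched.

```python
import random
from typing import List


def alter(genome: List[float], rng: random.Random) -> List[float]:
    """Fixed-length mutation: overwrite one existing gene."""
    child = list(genome)
    index = rng.randrange(len(child))
    child[index] = rng.uniform(-1.0, 1.0)
    return child


def elaborate(genome: List[float], rng: random.Random) -> List[float]:
    """Complexifying mutation: append a new gene and keep the old ones."""
    return list(genome) + [rng.uniform(-1.0, 1.0)]


rng = random.Random(0)
parent = [0.1, 0.2, 0.3]
altered = alter(parent, rng)        # same length, one gene replaced
elaborated = elaborate(parent, rng) # one gene longer, prefix unchanged
```

If every gene in `parent` already contributes to a dominant strategy, `alter` can only innovate by giving something up, whereas `elaborate` has new material to work with.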

Test Domain: Robot Duel
• Robot with higher energy wins by colliding with opponent
• Moving costs energy
• Collecting food replenishes energy
• Complex task: When to forage/save energy, avoid/pursue?

Robot Neural Networks

Experimental Setup
• 13 complexifying runs, 15 fixed-topology runs
• 500 generations per run
• 2-population coevolution with hall of fame (Rosin & Belew 1997)

Performance is Difficult to Evaluate in Coevolution
• How can you tell if things are improving when everything is relative?
  – Number of wins is relative to each generation
• No absolute measure is available
• No benchmark is comprehensive

Expensive Method: Master Tournament (Cliff and Miller 1995; Floreano and Nolfi 1997)
• Compare all generation champions to each other
• Requires n^2 evaluations
  – An accurate evaluation may involve e.g. 288 games
• Defeating more champions does not establish superiority
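
To make the cost concrete, the sketch below runs a master tournament on toy champions and counts the games. It assumes each unordered pair of champions is compared once, which gives n(n-1)/2 comparisons and so grows on the order of n^2; the function names and the toy game (larger number wins) are invented for illustration.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def master_tournament(
    champions: Sequence[float],
    beats: Callable[[float, float], bool],
) -> List[int]:
    """Compare every pair of generation champions and count wins for each."""
    wins = [0] * len(champions)
    for i, j in combinations(range(len(champions)), 2):
        if beats(champions[i], champions[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    return wins


def total_games(n_champions: int, games_per_comparison: int) -> int:
    """Games needed when each pair is compared once."""
    pairs = n_champions * (n_champions - 1) // 2
    return pairs * games_per_comparison


wins = master_tournament([0.2, 0.9, 0.5], lambda a, b: a > b)
# One champion per generation for 500 generations, 288 games per comparison:
cost = total_games(500, 288)
```

With the figures from these slides (500 generations, 288 games per comparison), a single run already needs tens of millions of games, and the resulting win counts still say only how many champions were beaten, not which ones.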

Strict and Efficient Performance Measure: Dominance Tournament (Stanley & Miikkulainen 2002)
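
The slide gives only the title, so the following is a sketch of the measure as it is commonly described, not a transcription of the lecture: a generation champion joins the list of dominant strategies only if it defeats every strategy already on that list, and the first champion is dominant by default. The function name and toy game are assumptions.

```python
from typing import Callable, List, Sequence


def dominance_tournament(
    generation_champions: Sequence[float],
    beats: Callable[[float, float], bool],
) -> List[float]:
    """Return the sequence of dominant strategies, in order of appearance."""
    dominant: List[float] = []
    for champion in generation_champions:
        # all() over an empty list is True, so the first champion is admitted.
        if all(beats(champion, previous) for previous in dominant):
            dominant.append(champion)
    return dominant


# Hypothetical champions from five generations; the larger number wins.
dominant = dominance_tournament([0.2, 0.1, 0.5, 0.4, 0.9], lambda a, b: a > b)
```

Each new champion plays only the current dominant strategies rather than every past champion, and a strategy on the list has beaten all earlier entries, which is a stricter claim than a high win count.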

Result: Evolution of Complexity
• As dominance increases so does complexity on average
• Networks with strictly superior strategies are more complex

Comparing Performance

Summary of Performance Comparisons

The Superchamp

Cooperative Coevolution
• Groups attempt to work with each other instead of against each other
• But sometimes it’s not clear what’s cooperation and what’s competition
• Maybe competitive/cooperative is not the best distinction?
  – Newer idea: Compositional vs. test-based

Summary
• Picking best opponents
• Maintaining and elaborating on strategies
• Measuring performance
• Different types of coevolution
Advanced papers on coevolution:
  – Ideal Evaluation from Coevolution by De Jong, E. D. and J. B. Pollack (2004)
  – Monotonic Solution Concepts in Coevolution by Ficici, Sevan G. (2005)

Next Topic: Real-time NEAT (rtNEAT)
• Simultaneous and asynchronous evaluation
• Non-generational
• Useful in video games and simulations
• NERO: Video game with rtNEAT
  – Shorter symposium paper: Evolving Neural Network Agents in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
  – Optional journal (longer, more detailed) paper: Real-time Neuroevolution in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
  – http://Nerogame.org
  – Extra coevolution papers