Intro to AI Genetic Algorithm Ruth Bergman Fall 2002
Imitating Nature
Aspects of the evolution of organisms:
• Organisms that are ill-suited to their environment have little chance to reproduce (natural selection)
• Conversely, the best-fitting organisms have a better chance to survive and reproduce
Imitating Nature
Reproduction:
• Offspring are similar to their parents
• Random mutations occur, and they can lead to better (or worse) fitting individuals
“On the Origin of Species by Means of Natural Selection”, C. Darwin (1859)
Encoding:
• An organism is fully represented by its DNA string, that is, a string over a finite alphabet (4 symbols)
• Each element of this string is called a gene
Genetic Algorithm (GA)
• Developed by John Holland in the early 1970s
• Optimization and machine learning techniques inspired by the process of natural evolution and evolutionary genetics
  – Solutions are encoded as chromosomes
  – Search proceeds by maintaining a population of solutions
  – Reproduction favors “better” chromosomes
  – New chromosomes are generated during reproduction through mutation, crossover, etc.
GA Framework
[Figure: GA framework diagram with the labels search space, population (bit-string individuals A, B, C, D), selection, crossover, mutation, fitness evaluation, reproduction]
GA Procedure
• Start with a population of N individuals
1. Apply the fitness function to all the individuals
2. Select pairs of individuals for reproduction (repetition allowed)
3. Each pair generates two children (reproduction with crossover)
4. Apply a random mutation to the children. The children become the next generation
5. Repeat steps 1 to 4 until a termination criterion is met
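The procedure above can be sketched as a generic loop. This is a minimal sketch, not the lecture's own implementation: the name `run_ga`, its parameters, and the choice of a fixed generation count as the termination criterion are all assumptions, and the four operators are passed in by the caller.

```python
def run_ga(population, fitness, select, crossover, mutate, max_generations=50):
    """Generic GA loop (hypothetical helper); operators are supplied by the caller.

    fitness(ind) -> number, select(population, scores) -> individual,
    crossover(p1, p2) -> (child1, child2), mutate(ind) -> individual.
    """
    n = len(population)
    for _ in range(max_generations):            # step 5: assumed stopping rule
        scores = [fitness(ind) for ind in population]   # step 1
        next_generation = []
        while len(next_generation) < n:
            parent1 = select(population, scores)        # step 2
            parent2 = select(population, scores)
            child1, child2 = crossover(parent1, parent2)  # step 3
            next_generation.append(mutate(child1))      # step 4
            next_generation.append(mutate(child2))
        population = next_generation[:n]        # children become the next generation
    return population
```

Passing the operators as arguments keeps the loop independent of the encoding, so the same skeleton works for bit strings or any other representation.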
Encoding Scheme
• An individual (an organism) is intended to be a possible solution to the problem you want to solve
• An individual is represented by a binary string, which is intended to be the complete description of the individual
• Example: suppose you have to find a number between 0 and 255 whose binary representation contains the same number of 1s and 0s. An individual is a string of 8 bits, e.g. h = 0 1 1 1 1 1 1 0 = 126
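The 8-bit encoding in the example can be illustrated with a pair of small helpers. The names `encode` and `decode` are hypothetical, and representing an individual as a Python list of 0/1 integers is an assumption made for illustration.

```python
def encode(value, n_bits=8):
    """Return value as a list of n_bits bits, most significant bit first."""
    return [(value >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def decode(bits):
    """Return the integer represented by a list of bits (most significant first)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


h = encode(126)
print(h)          # [0, 1, 1, 1, 1, 1, 1, 0]
print(decode(h))  # 126
```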
Fitness Function
• A fitness function says how good a solution is, i.e. how well an individual fits the environment
• Example: note that the fitness function takes its minimum value (i.e. 0) when the string is all 0s or all 1s, and its maximum value (i.e. 8) when the numbers of 1s and 0s are equal
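For the 8-bit example, one fitness function consistent with the stated minimum of 0 and maximum of 8 is the string length minus the absolute difference between the counts of 1s and 0s. This particular formula is an assumption chosen to match those two values, not necessarily the one used in the lecture.

```python
def fitness(bits):
    """Assumed fitness: len(bits) - |#ones - #zeros|.

    Gives 0 for an all-0 or all-1 string and len(bits) for a balanced one.
    """
    ones = sum(bits)
    zeros = len(bits) - ones
    return len(bits) - abs(ones - zeros)


print(fitness([0, 0, 0, 0, 0, 0, 0, 0]))  # 0
print(fitness([0, 1, 1, 1, 1, 1, 1, 0]))  # 4
print(fitness([0, 1, 0, 1, 0, 1, 0, 1]))  # 8
```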
The Initial Population
[Figure: the initial population of bit-string individuals]
Optimization
• Avoiding local optima (cf. hill climbing)
[Figure: Hill-climbing Method compared with GA Search Method]
Selection
• Roulette wheel selection
  – Compute each individual’s contribution to the global fitness as $P(h_i) = f(h_i) / \sum_{j=1}^{N} f(h_j)$, where $f$ is the fitness function
  – The pairs for reproduction are chosen by randomly drawing individuals (with replacement) with the distribution given by P
[Table: individuals A, B, C, D with columns encoding, fitness, and P; legible entries include fitness 4 and 2 and P = .33 and .17. Figure: roulette wheel]
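Roulette wheel selection can be sketched with the standard library's weighted sampling. The function names and the example scores below are hypothetical; `random.choices` draws with replacement, which matches the "repetition allowed" rule.

```python
import random


def selection_probabilities(scores):
    """Each individual's share of the total fitness: f_i / sum(f)."""
    total = sum(scores)
    return [score / total for score in scores]


def roulette_select(population, scores, rng=random):
    """Draw one individual with probability proportional to its fitness.

    Note: random.choices rejects weights that sum to zero, so at least
    one score must be positive.
    """
    return rng.choices(population, weights=scores, k=1)[0]


# Hypothetical fitness values for four individuals A, B, C, D.
scores = [4, 2, 4, 2]
print([round(p, 2) for p in selection_probabilities(scores)])  # [0.33, 0.17, 0.33, 0.17]
```

Calling `roulette_select` twice yields one pair of parents; the same individual may be drawn both times.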
Crossover
  – Randomly choose a crossover point “c”, i.e. a number between 1 and n, where n is the length of the string
  – Return two children: one composed of the first c bits of the first parent and the last n-c bits of the second parent, the other composed of the first c bits of the second parent and the last n-c bits of the first parent
[Figure: two parent bit strings cut at crossover point c, and the two resulting children]
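Single-point crossover as described above might look like the following sketch. The function name and the optional `c` argument (useful for reproducing a specific cut) are assumptions; individuals are taken to be equal-length lists of bits.

```python
import random


def crossover(parent1, parent2, c=None, rng=random):
    """Single-point crossover on two equal-length bit lists."""
    n = len(parent1)
    if c is None:
        # randint is inclusive on both ends; c == n simply copies the parents.
        c = rng.randint(1, n)
    child1 = parent1[:c] + parent2[c:]   # first c bits of parent1, last n-c of parent2
    child2 = parent2[:c] + parent1[c:]   # first c bits of parent2, last n-c of parent1
    return child1, child2


print(crossover([0, 0, 0, 0], [1, 1, 1, 1], c=1))  # ([0, 1, 1, 1], [1, 0, 0, 0])
```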
Mutation
• Mutation on individuals: some of the children’s bits are changed (each with a small, independent probability)
[Figure: example bit strings with mutated bits; label: maximum found]
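Bit-flip mutation with an independent per-bit probability can be sketched as follows. The name `mutate` and the default rate of 0.01 are assumptions for illustration.

```python
import random


def mutate(bits, p_mutation=0.01, rng=random):
    """Flip each bit independently with probability p_mutation; returns a new list."""
    return [1 - bit if rng.random() < p_mutation else bit for bit in bits]


print(mutate([0, 1, 1, 0], p_mutation=1.0))  # [1, 0, 0, 1]
```

Returning a new list rather than modifying the argument keeps parents intact if they are selected again.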
Stopping Criteria
• Convergence:
  – A population is said to have converged when all the genes have converged, i.e. when the value of every bit is the same in at least 95% of the individuals in the population
• Since convergence is not guaranteed, we must consider other stopping criteria:
  – Number of generations
  – Almost constant value of the best-fitting individual
  – Almost constant value of the average fitness of the population
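The 95% convergence test can be checked gene by gene. This is a sketch under the assumption that individuals are equal-length lists of 0/1 values; the function names are hypothetical.

```python
def gene_converged(population, position, threshold=0.95):
    """True if one bit value at this position is shared by at least threshold of the population."""
    ones = sum(individual[position] for individual in population)
    majority = max(ones, len(population) - ones)
    return majority / len(population) >= threshold


def population_converged(population, threshold=0.95):
    """True if every gene position has converged."""
    n_genes = len(population[0])
    return all(gene_converged(population, i, threshold) for i in range(n_genes))
```

In practice this check would be combined with a cap on the number of generations, since convergence is not guaranteed.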
Parameter Settings
• Population size
  – How many chromosomes are in the population
    • Too few chromosomes: only a small part of the search space is explored
    • Too many chromosomes: the GA slows down
  – Recommendation: 20-30, 50-100
• Probability of crossover
  – How often crossover is performed
  – Recommendation: 80%-95%
• Probability of mutation
  – How often parts of a chromosome are mutated
  – Recommendation: 0.5%-1%
Genetic Programming
• One of the central challenges of CS is to get a computer to do what needs to be done, without telling it how to do it
  – Automatic programming (or program synthesis)
• GP is a branch of genetic algorithms
• Main difference between GP and GA
  – Representation of the solution (computer program)
    • GA: a string of numbers
      – Fixed-length character strings
    • GP: computer program (Lisp or Scheme)
      – Represents hierarchical computer programs of dynamically varying sizes and shapes
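To make the contrast with fixed-length strings concrete, a GP individual can be pictured as an expression tree. The sketch below uses nested Python tuples in place of Lisp S-expressions; the tuple layout, the operator table `OPS`, and the helper names are all assumptions for illustration.

```python
import operator

# Assumed operator set; a real GP system would define its own function set.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a tree of the form (op, left, right), a variable name, or a constant."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]      # variable lookup
    return tree               # numeric constant


def size(tree):
    """Number of nodes in the tree; unlike a GA string, this varies between individuals."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1


program = ("+", ("*", "x", "x"), 1)   # represents x*x + 1
print(evaluate(program, {"x": 3}))    # 10
print(size(program))                  # 5
```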