Coevolutionary Automated Software Correction
Josh Wilkerson, Ph.D. Candidate in Computer Science, Missouri S&T

Technical Background
Evolutionary Algorithms (EAs)
– Subfield of evolutionary computation (in artificial intelligence)
– Based on biological evolution
– Uses mutation, reproduction, and selection
– Population composed of candidate solutions
– Needed:
  • Solution representation
  • Fitness function
– Applicable to a wide variety of fields
– Makes no assumptions about the problem space (ideally)

Technical Background
EA Operation
– Start with an initial population
– Each generation
  • Create new individuals and evaluate them
  • Population competition (survival of the fittest)
– Mutation and reproduction
  • Explore the problem space
  • Bring in new genetic material
– Selection
  • Applies pressure to individuals
  • More fit individuals are selected for mutation and reproduction more often
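
The EA operation outlined on this slide can be sketched as a short loop. This is a minimal illustration, not CASC's implementation: the list-of-floats representation, binary tournament selection, one-point crossover, and every name and default value below are assumptions chosen for brevity.

```python
import random

def evolve(fitness, genome_length=8, pop_size=20, generations=50, seed=0):
    """Minimal generational EA that maximizes `fitness` over lists of floats."""
    rng = random.Random(seed)
    # Start with an initial (random) population.
    population = [[rng.uniform(-5, 5) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # Selection: binary tournaments pick fitter parents more often.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # Reproduction: one-point crossover combines both parents.
            cut = rng.randrange(1, genome_length)
            child = p1[:cut] + p2[cut:]
            # Mutation: perturb one gene to bring in new genetic material.
            i = rng.randrange(genome_length)
            child[i] += rng.gauss(0, 0.5)
            offspring.append(child)
        # Competition: parents and offspring compete, the fittest survive.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)

def neg_sphere(genome):
    """Toy fitness function: best possible value is 0 at the origin."""
    return -sum(x * x for x in genome)
```

Calling `evolve(neg_sphere)` returns the best genome found. Because parents and offspring compete together for survival, the best fitness in the population never decreases from one generation to the next.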

Technical Background
Genetic Programming
– Type of EA
– Evolves tree representations
– E.g., computer program parse trees
Coevolution
– Extension of standard EA
– Fitness dependency between individuals
– Dependency can be either cooperative or competitive
CASC system uses competitive coevolution
Evolutionary arms race
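
The fitness dependency in competitive coevolution can be illustrated with two populations scored against each other. The scoring rule here is an assumption for illustration (programs are rewarded for test cases they pass, test cases for programs they expose); the slides do not state CASC's actual fitness functions, and `passes` is a hypothetical problem-specific check.

```python
def coevolutionary_fitness(programs, test_cases, passes):
    """Score two competing populations against each other.

    `passes(program, test_case)` is a hypothetical oracle that returns True
    when the program handles the test case correctly.
    """
    # A program's fitness: fraction of the test cases it passes.
    program_scores = [
        sum(passes(p, t) for t in test_cases) / len(test_cases)
        for p in programs
    ]
    # A test case's fitness: fraction of the programs it makes fail.
    test_scores = [
        sum(not passes(p, t) for p in programs) / len(programs)
        for t in test_cases
    ]
    return program_scores, test_scores

# Toy example: candidate implementations of absolute value.
programs = [abs, lambda x: x, lambda x: -x]
test_cases = [3, -2, 0]
program_scores, test_scores = coevolutionary_fitness(
    programs, test_cases, lambda p, t: p(t) == abs(t))
```

In this toy run the input `0` exposes no faulty program, so it earns the lowest test-case score. That pressure toward harder test cases, and in turn toward more robust programs, is what drives the arms race.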

[Figure: High Level View of CASC]

[Figures: CASC Evolutionary Model, four slides]

Reproduction Phase: Programs
Randomly select a genetic operation to perform
– Probability of operation selection is configurable
Perform operation, generate new program(s)
Add new individuals to population
Repeat until specified number of individuals has been created
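
A minimal sketch of this reproduction loop follows, assuming each operator is a function that returns a list of new individuals. The operator names, the toy operators, and the choice to trim any surplus offspring are illustrative assumptions, not details taken from CASC.

```python
import random

def reproduce(population, operators, probabilities, num_offspring, rng):
    """Apply randomly chosen operators until enough offspring exist."""
    names = list(operators)
    weights = [probabilities[name] for name in names]
    offspring = []
    while len(offspring) < num_offspring:
        # Pick an operation according to its configured probability.
        name = rng.choices(names, weights=weights, k=1)[0]
        # Each operator returns one or more new individuals.
        offspring.extend(operators[name](population, rng))
    # An operator may overshoot the target, so trim (an assumption).
    return offspring[:num_offspring]

# Toy operators on strings, standing in for real program operators.
def copy_op(population, rng):
    return [rng.choice(population)]

def pair_op(population, rng):
    return [rng.choice(population), rng.choice(population)]

rng = random.Random(1)
new_individuals = reproduce(
    ["p1", "p2", "p3"],
    {"copy": copy_op, "pair": pair_op},
    {"copy": 0.3, "pair": 0.7},
    5,
    rng,
)
```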

Reproduction Phase: Programs
Genetic Operations
– Reset
– Copy
– Crossover
  • Two individuals are randomly selected based on fitness
  • Randomly select and exchange compatible sub-trees
  • Generates two new programs
– Mutation
  • Randomly select individual based on fitness
  • Randomly select and change mutable node
  • Generate a new sub-tree (if necessary)
– Architecture Altering Operations
Reselection is allowed for all operators
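
Sub-tree crossover can be sketched on a toy tree encoding. The nested-list representation (`[operator, child, child, ...]`) and all helper names are assumptions for illustration. For simplicity this sketch treats every pair of sub-trees as compatible, whereas the slide says CASC exchanges only compatible ones.

```python
import copy
import random

def node_paths(tree, path=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield path
    if isinstance(tree, list):
        # Index 0 holds the operator symbol; children start at index 1.
        for i, child in enumerate(tree[1:], start=1):
            yield from node_paths(child, path + (i,))

def get_node(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_node(tree, path, new_subtree):
    """Return the tree with the node at `path` replaced by `new_subtree`."""
    if not path:
        return new_subtree  # replacing the root
    parent = get_node(tree, path[:-1])
    parent[path[-1]] = new_subtree
    return tree

def subtree_crossover(parent_a, parent_b, rng):
    """Exchange one randomly chosen sub-tree between copies of two parents."""
    a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
    path_a = rng.choice(list(node_paths(a)))
    path_b = rng.choice(list(node_paths(b)))
    sub_a, sub_b = get_node(a, path_a), get_node(b, path_b)
    # Generates two new programs; the parents are left untouched.
    return set_node(a, path_a, sub_b), set_node(b, path_b, sub_a)

parent_a = ["+", ["*", "x", 2], 3]   # (x * 2) + 3
parent_b = ["-", "y", ["+", 1, "x"]]  # y - (1 + x)
child_a, child_b = subtree_crossover(parent_a, parent_b, random.Random(0))
```

Because the two sub-trees simply change places, the total number of nodes across both children equals the total across both parents.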

Reproduction Phase: Test Cases
Reproduction employs uniform crossover
Each offspring has a chance to mutate
Genes to mutate are selected at random
Mutated gene is randomly adjusted
– The amount adjusted is selected from a Gaussian distribution
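
The test-case operators described here can be sketched as follows, assuming a test case is a fixed-length list of numeric genes. The rates, the standard deviation, and the function names are hypothetical defaults, not values from CASC.

```python
import random

def uniform_crossover(parent_a, parent_b, rng):
    """Take each gene from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]

def gaussian_mutation(genes, rng, gene_rate=0.2, sigma=1.0):
    """Adjust randomly selected genes by an amount drawn from N(0, sigma)."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < gene_rate else g
            for g in genes]

def make_offspring(parent_a, parent_b, rng, mutation_chance=0.3):
    """Create one offspring; it mutates only with probability mutation_chance."""
    child = uniform_crossover(parent_a, parent_b, rng)
    if rng.random() < mutation_chance:
        child = gaussian_mutation(child, rng)
    return child
```

A Gaussian step keeps most adjustments small while still allowing an occasional large jump.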

[Figures: CASC Evolutionary Model, four slides]

CASC Implementation Details
Adaptive parameter control
– EAs typically have many control parameters
– Difficult to find optimal settings for these parameters
– In CASC, genetic operator probabilities are adaptive parameters
– Rewarded/punished based on performance
  • If one operator is generating improved individuals more often than the others, make it more likely to be used
– Allows the system to adapt to the different phases in the search
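
One possible reward/punish rule is sketched below. The slides do not give CASC's actual update rule, so the success-rate target, the learning rate, and the probability floor are all assumptions made for illustration.

```python
def adapt_probabilities(probabilities, improvements, uses,
                        learning_rate=0.1, floor=0.05):
    """Shift operator probabilities toward operators that improve offspring.

    improvements[op]: offspring from `op` that beat their parents.
    uses[op]: how many times `op` was applied this generation.
    """
    # Success rate of each operator (0 if it was never used).
    rates = {op: (improvements[op] / uses[op] if uses[op] else 0.0)
             for op in probabilities}
    total = sum(rates.values())
    updated = {}
    for op, p in probabilities.items():
        # Target share is proportional to success; keep p if nothing improved.
        target = rates[op] / total if total > 0 else p
        # Move part of the way toward the target, but never below the floor,
        # so no operator is switched off permanently.
        updated[op] = max(floor, (1 - learning_rate) * p + learning_rate * target)
    norm = sum(updated.values())
    return {op: p / norm for op, p in updated.items()}

new_probs = adapt_probabilities(
    {"crossover": 0.5, "mutation": 0.5},
    improvements={"crossover": 8, "mutation": 2},
    uses={"crossover": 10, "mutation": 10},
)
```

Here crossover produced improved individuals more often, so its probability rises above 0.5 and mutation's falls below it. The floor lets a currently weak operator recover in a later phase of the search.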

CASC Implementation Details
Parallel Computation
– Computational complexity is generally a problem for EAs
– CASC writes, compiles, and executes hundreds (or even thousands) of C++ programs in a given run
– To reduce run times this is done in parallel (on the NIC cluster here on campus)
– Main node: responsible for generating and writing programs
– Worker nodes: responsible for compiling and executing programs
– Dramatically speeds up execution
– Investigating new options for this (discussed later)
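
The main/worker split can be sketched on a single machine with a thread pool standing in for the cluster's worker nodes. This is only an analogy: `evaluate_program` is a hypothetical stand-in for compiling and running a generated C++ program, and the slides do not say how CASC distributes work across nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_program(source):
    """Stand-in for a worker's compile-and-execute step.

    A real worker would compile `source` and run it on test cases; here the
    returned "fitness" is just the source length so the sketch is runnable.
    """
    return len(source)

def evaluate_population(sources, max_workers=4):
    """Main-node role: hand programs to workers and collect their fitness."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input programs.
        return list(pool.map(evaluate_program, sources))

fitnesses = evaluate_population(["int main(){}", "int main(){return 0;}"])
```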

Current and Future Work
Fitness Function Design
– For each new problem CASC needs a new fitness function
– Fitness function design can often be difficult
– Developing a guide for fitness function design
– Starts from the program specifications
– Walks through the thought process for designing a fitness function for the problem
– Long term goal: automate fitness function creation

Current and Future Work
File system slowdown
– CASC is writing and compiling many programs each run
– I.e., many files in the file system each run
– File system access is bottlenecking the speed of the CASC system
– Currently reworking the system to store program files and executables in RAM
– Uses a virtually mounted hard disk that stores data in RAM
– Expecting a dramatic speed up (fingers crossed…)
– Other option: distributed computing (like BOINC, Folding@home, etc.)

Current and Future Work
Scalability
– As program size increases so does the problem space
  • Many more modifications possible
  • More genetic material
– Investigating options to allow CASC to scale with problem size
– Current idea: break the program up into pieces
  • Multiple program populations
  • Each population is based on a piece of the original program
  • Each population has its own objective
  • Cooperative coevolution

[Figure: Current and Future Work]

Questions?