
An Introduction to Evolutionary Multiobjective Optimization Algorithms Karthik Sindhya, Ph.D. Postdoctoral Researcher, Industrial Optimization Group, Department of Mathematical Information Technology Karthik.sindhya@jyu.fi http://users.jyu.fi/~kasindhy/

Overview • Nature Inspired Algorithms • Constraint handling • Applications

Nature Inspired Algorithms • Nature provides some of the most efficient ways to solve problems – Algorithms that imitate processes in nature or are inspired by nature are called nature inspired algorithms. • What type of problems? – Aircraft wing design

Nature Inspired Algorithms • Wind turbine design BBC Performance improvement by 40%. They reduce turbulence across the surface, increasing angle of attack and decreasing drag. (Source: Popular Mechanics) • Bionic car Hexagonal plates - resulting in door paneling one-third lighter than conventional paneling, but just as strong. (Source: Popular Mechanics)

Nature Inspired Algorithms • Bullet train NATGEO Train's nose is designed after the beak of a kingfisher, which dives smoothly into water. (Source: Popular Mechanics)

Nature Inspired Algorithms for Optimization • Optimization – An act, process, or methodology of making something (as a design, system, or decision) as fully perfect, functional, or effective as possible. (http://www.merriamwebster.com/dictionary) • Nature as an optimizer – Birds: Minimize drag. – Humpback whale: Maximize maneuverability (enhanced lift devices to control flow over the flipper and maintain lift at high angles of attack). – Boxfish: Minimize drag and maximize rigidity of exoskeleton. – Kingfisher: Minimize micro-pressure waves. • Consider an optimization problem of the form

Practical Optimization Problems – Characteristics! • Objective and constraint functions can be nondifferentiable. • Constraints can be nonlinear. • Discrete/discontinuous search space. • Mixed variables (integer, real, Boolean, etc.) • Large number of constraints and variables. • Objective functions can be multimodal. – Multimodal functions have more than one optimum, but can have either a single global optimum or several. • Computationally expensive objective functions and constraints.

Practical Optimization Problems – Characteristics! [Figure: loop between an optimization algorithm and a simulation model, exchanging a decision vector and an objective vector]

Traditional Optimization Techniques – Problems! • Different methods for different types of problems. • Constraint handling, e.g. using the penalty method, is sensitive to penalty parameters. • Often get stuck in local optima (lack global perspective). • Usually need knowledge of first/second order derivatives of objective functions and constraints.

Nature Inspired Algorithms for Optimization • Computational intelligence – Nature inspired algorithms – Fuzzy logic systems – Neural networks

Nature Inspired Algorithms for Optimization • Nature inspired algorithms – Evolutionary algorithms: genetic algorithm, differential evolution – Swarm optimization: particle swarm optimization, ant colony optimization – ... and many more.

Evolution Humans Macintosh Nokia

Evolutionary Algorithms • Charles Darwin: natural selection – a guided search procedure. • Offspring are created by reproduction, mutation, etc. • Individuals suited to the environment survive, reproduce and pass their genetic traits to offspring. • Populations adapt to their environment. • Variations accumulate over time to generate new species.

Evolutionary Algorithms • Terminologies 1. Individual - carrier of the genetic information (chromosome). It is characterized by its state in the search space, its fitness (objective function value). 2. Population - pool of individuals which allows the application of genetic operators. 3. Fitness function - The term “fitness function” is often used as a synonym for objective function. 4. Generation - (natural) time unit of the EA, an iteration step of an evolutionary algorithm.

Evolutionary Algorithms Population Individual Crossover Parents Mutation Offspring

Evolutionary Algorithms
Step 1: t := 0
Step 2: Initialize P(t)
Step 3: Evaluate P(t)
Step 4: While not terminate do
    P’(t) := variation [P(t)];
    evaluate [P’(t)];
    P(t+1) := select [P’(t) ∪ P(t)];
    t := t + 1;
od
Evolutionary algorithms = Selection + Crossover + Mutation
Reproduced from “Evolutionary Computation: Comments on the History and Current State” – Bäck et al.
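The loop above can be sketched in Python as follows. This is only a minimal illustration, not the scheme from the cited paper: the bit-string encoding, the one-point crossover, the bit-flip mutation rate of 1/n_bits, the truncation-style selection and the OneMax fitness (`sum`) are all assumptions chosen for brevity, and `evolve` is a hypothetical name.

```python
import random

def evolve(fitness, n_bits=16, pop_size=20, generations=40, seed=0):
    """Maximize `fitness` over bit strings: variation, evaluation, selection."""
    rng = random.Random(seed)
    # Step 2: initialize P(0) with random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # Step 4: fixed generation budget as the stop rule
        # P'(t) := variation[P(t)]: one-point crossover followed by bit-flip mutation.
        offspring = []
        for _ in range(pop_size):
            a, b = rng.sample(population, 2)
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < 1.0 / n_bits else bit
                     for bit in child]
            offspring.append(child)
        # P(t+1) := select[P'(t) U P(t)]: keep the best pop_size individuals.
        merged = population + offspring
        merged.sort(key=fitness, reverse=True)
        population = merged[:pop_size]
    return population[0]

# OneMax: the fitness of a bit string is the number of ones it contains.
best = evolve(sum)
print(best, sum(best))
```

Because selection acts on the union of parents and offspring, the best individual found so far is never lost between generations.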

Evolutionary Algorithms • Mean approaches optimum • Variance reduces

Evolutionary Algorithms • Robustness = Breadth + Efficiency (Goldberg, 1989) [Figure: efficiency versus problem type for a robust scheme, a random scheme and a specialized scheme]

Evolutionary Algorithms • Selection – Roulette wheel, tournament, steady state, etc. – Motivation is to preserve the best (make multiple copies) and eliminate the worst • Crossover – Simulated binary crossover, linear crossover, blend crossover, etc. – Create new solutions by considering more than one individual – Global search for new and hopefully better solutions • Mutation – Polynomial mutation, random mutation, etc. – Keep diversity in the population – 010110 → 010100 (bitwise mutation)

Evolutionary Algorithms • Tournament selection [Figure: population with fitness values 23, 30, 24, 37, 11 and 9 passing through Tournament 1 and Tournament 2; the solutions with values 37 and 30 are marked as deleted from the population]

Evolutionary Algorithms • Roulette wheel selection (proportional selection) – Weaker solutions can survive.
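A minimal sketch of proportional selection, assuming maximization and non-negative fitness values (the function name and the example fitness values are hypothetical). Each solution is picked with probability equal to its share of the total fitness, so weak solutions are selected rarely but not never.

```python
import random

def roulette_wheel_select(fitnesses, rng):
    """Return an index drawn with probability fitness / sum(fitnesses)."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)  # where the wheel stops
    running = 0.0
    for index, value in enumerate(fitnesses):
        running += value
        if pick <= running:
            return index
    return len(fitnesses) - 1  # guard against floating point round-off

rng = random.Random(1)
fitnesses = [1.0, 2.0, 3.0, 4.0]
counts = [0] * len(fitnesses)
for _ in range(10000):
    counts[roulette_wheel_select(fitnesses, rng)] += 1
print(counts)  # the weakest solution still receives some of the picks
```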

Evolutionary Algorithms • Concept of exploration vs exploitation. • Exploration – Search for promising solutions • Crossover and mutation operators • Exploitation – preferring the good solutions • Selection operator • Excessive exploration – Random search. • Excessive exploitation – Premature convergence.

Evolutionary Algorithms Exploration Exploitation Good evolutionary algorithm

Evolutionary Algorithms
Classical gradient based algorithms • Convergence to an optimal solution usually depends on the starting solution. • Most algorithms tend to get stuck at a locally optimal solution. • An algorithm efficient in solving one class of optimization problem may not be efficient in solving others. • Algorithms cannot be easily parallelized.
Evolutionary algorithms • Convergence to an optimal solution is designed to be independent of the initial population. • A search based algorithm. The population helps to avoid getting stuck at a locally optimal solution. • Can be applied to a wide class of problems without major change in the algorithm. • Can be easily parallelized.

Fitness Landscapes Using traditional gradient based methods [Figure: four plots of f(x) versus x: ideal and best case, multimodal, nightmare, teaser]

Fitness Landscapes Using population based algorithms [Figure: four plots of f(x) versus x: ideal and best case, multimodal, nightmare, teaser]

History of Evolutionary Algorithms • GA: John Holland in 1962 (UMich) • Evolutionary Strategy: Rechenberg and Schwefel in 1965 (Berlin) • Evolutionary Programming: Larry Fogel in 1965 (California) • First ICGA: 1985 at Carnegie Mellon University • First GA book: Goldberg (1989) • First FOGA workshop: 1990 in Indiana (Theory) • First Fusion: 1990s (Evolutionary Algorithms) • Journals: ECJ (MIT Press), IEEE TEC, Natural Computation (Elsevier) • GECCO and CEC since 1999, PPSN since 1990 • About 20 major conferences each year

Working cycle of a genetic algorithm • Population based probabilistic search and optimization technique based on natural genetics and Darwin’s principle of natural selection • Proposed by Prof. John Holland, University of Michigan, Ann Arbor, USA "I have more ideas than I can ever follow up on in a lifetime, so I never worry if someone steals an idea from me." -- John Holland, 1929-2015 https://www.nytimes.com/2015/08/20/science/john-henry-holland-computerized-evolution-dies-at-86.html

Working cycle of a genetic algorithm • Start → Initialize a population of solutions, Gen = 0 → Is Gen ≤ Max_gen? – N: End – Y: Assign fitness to all solutions in the population → Reproduction → Crossover → Mutation → Gen = Gen + 1 → back to the Gen ≤ Max_gen check

Working cycle of a genetic algorithm • Selection scheme (reproduction) – Select good solutions using their fitness values – Leads to a mating pool consisting of good solutions probabilistically – Mating pool may contain multiple copies of a particularly good solution – Size of the mating pool is kept equal to that of the population of solutions considered before reproduction – Average fitness of the mating pool is expected to be higher than that of the pre-reproduction population of solutions – Schemes • Roulette wheel • Tournament selection • Ranking selection

Working cycle of a genetic algorithm • Crossover – Mating pairs are selected at random from the mating pool – New solutions are created by crossover with a crossover probability – Properties are exchanged between the parents to create new solutions – If the parents are good, the children are expected to be good – Various types of crossover: • Single-point • Two-point • Multi-point • Uniform • SBX

Working cycle of a genetic algorithm • Mutation – Sudden change of parameter – In GA, local change around the current solution – If a solution gets stuck at a local minimum, mutation helps it to come out of this situation http://www.dewebsite.org/whatis_pics/mutation2.jpg • Termination – Maximum number of generations – Desired accuracy in the solution


Binary-Coded GA • http://www.conservapedia.com/images/thumb/1/1c/763.jpg/400px-763.jpg


Binary-Coded GA • Two-point crossover

Binary-Coded GA • Uniform crossover

Binary-Coded GA • More crossover points cause more disruption • For a large search space, uniform crossover is found to perform better than both the single-point and two-point crossovers • Step 5: Mutation
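The binary operators named on these slides can be sketched as below. This is an illustration under assumptions: strings of '0'/'1' characters stand in for chromosomes, cut points and flip positions are passed in explicitly so the result is reproducible, and all function names are hypothetical.

```python
import random

def two_point_crossover(parent_a, parent_b, i, j):
    """Swap the segment between cut points i and j (i < j) of two bit strings."""
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

def uniform_crossover(parent_a, parent_b, rng):
    """Take each bit from either parent with probability 0.5."""
    return "".join(a if rng.random() < 0.5 else b
                   for a, b in zip(parent_a, parent_b))

def flip_bits(chromosome, positions):
    """Bitwise mutation: invert the bits at the given 0-based positions."""
    bits = list(chromosome)
    for p in positions:
        bits[p] = "1" if bits[p] == "0" else "0"
    return "".join(bits)

print(two_point_crossover("000000", "111111", 2, 4))
print(uniform_crossover("000000", "111111", random.Random(0)))
print(flip_bits("010110", [4]))  # the 010110 -> 010100 bitwise mutation
```

In a full GA the cut points and the mutated positions would be drawn at random, with mutation typically applied with a small per-bit probability.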


Binary-Coded GA • Tournament selection – Tournament size n (say 2 or 3) is a small number compared to the population size, N. – Pick n strings from the population at random and determine the best one in terms of fitness value – The best string goes to the mating pool and the n strings go back to the population – N tournaments are played to make the size of the mating pool equal to N. – Interesting read: “A comparative analysis of selection schemes used in genetic algorithms”
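The tournament procedure can be sketched as follows, assuming maximization (a larger fitness value is better). The function name and the fitness values are hypothetical.

```python
import random

def tournament_selection(fitnesses, n=2, rng=None):
    """Play N tournaments of size n; return the indices that form the mating pool."""
    rng = rng or random.Random()
    N = len(fitnesses)
    mating_pool = []
    for _ in range(N):
        # Pick n distinct strings at random; all of them stay in the population.
        contestants = rng.sample(range(N), n)
        # The best contestant (largest fitness) is copied to the mating pool.
        winner = max(contestants, key=lambda i: fitnesses[i])
        mating_pool.append(winner)
    return mating_pool

fitnesses = [4.0, 9.5, 1.2, 7.3, 3.3, 8.1]
pool = tournament_selection(fitnesses, n=2, rng=random.Random(0))
print(pool)
```

With distinct contestants in each tournament, the single worst solution can never win, which is one way in which tournament selection differs from roulette wheel selection.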

Constraint Handling • Penalty parameter-less approach – A feasible solution is preferred to infeasible solution – When both solutions feasible, choose the solution with better function value – When both solutions are infeasible, choose the solution with lower constraint violation
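The three rules can be written as a pairwise comparison. The sketch below assumes minimization and represents each solution as a hypothetical (objective value, constraint violation) pair, where a violation of 0 means feasible.

```python
def better(a, b):
    """Return the preferred of two (objective_value, constraint_violation) pairs."""
    f_a, cv_a = a
    f_b, cv_b = b
    a_feasible, b_feasible = cv_a == 0, cv_b == 0
    # Rule 1: a feasible solution is preferred to an infeasible one.
    if a_feasible and not b_feasible:
        return a
    if b_feasible and not a_feasible:
        return b
    # Rule 2: both feasible, so the better (smaller) function value wins.
    if a_feasible and b_feasible:
        return a if f_a <= f_b else b
    # Rule 3: both infeasible, so the lower constraint violation wins.
    return a if cv_a <= cv_b else b

print(better((10.0, 0.0), (1.0, 0.5)))  # feasible beats infeasible
print(better((3.0, 0.0), (2.0, 0.0)))   # better objective value
print(better((1.0, 2.0), (9.0, 0.1)))   # smaller violation
```

No penalty parameter appears anywhere: objective values are compared only between feasible solutions.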

Constraint Handling • Box constraints – If a variable is lower/higher than its lower/upper bound, either • set it to the lower/upper bound, or • replace it with a random value inside the bounds
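Both repair options can be sketched in a few lines. The function names are hypothetical, and the uniform distribution used for resampling is an assumption, since the slide only asks for a random value inside the bounds.

```python
import random

def clip_to_bounds(x, lower, upper):
    """Set every out-of-bounds variable to the bound it violates."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

def resample_in_bounds(x, lower, upper, rng):
    """Replace every out-of-bounds variable with a uniform random value in its bounds."""
    return [v if lo <= v <= hi else rng.uniform(lo, hi)
            for v, lo, hi in zip(x, lower, upper)]

x = [-1.0, 0.5, 3.0]
lower, upper = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
print(clip_to_bounds(x, lower, upper))
print(resample_in_bounds(x, lower, upper, random.Random(0)))
```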

Limitations of Evolutionary Algorithms • No guarantee of finding an optimal solution in finite time – Asymptotic convergence • Contain a number of parameters – Sometimes the result is highly dependent on the parameter settings – Self-adaptive parameters are commonly used • Computationally very expensive – Metamodels of functions are commonly used

Applications • Application 1 – Tracking suspect • Caldwell and Johnston, 1991 • Objective function: fitness rating on a nine point scale

Applications • Optimization (Min/Max) of functions • Airfoil optimization • Evolving optimal structure • Games

Evolutionary Multi-objective Optimization – A Big Picture

Objectives The objectives of this lecture are to: 1. Discuss the transition from single objective optimization to multi-objective optimization 2. Review the basic terminologies and concepts in use in multi-objective optimization 3. Introduce evolutionary multi-objective optimization 4. Present the goals in evolutionary multi-objective optimization 5. Present the main issues in evolutionary multi-objective optimization

Reference • Books: – K. Deb. Multi-Objective Optimization using Evolutionary Algorithms. Wiley, Chichester, 2001. – K. Miettinen. Nonlinear Multiobjective Optimization. Kluwer, Boston, 1999.

Transition • Single objective: Maximize performance • Multiple objectives: Minimize cost and maximize performance

Basic terminologies and concepts • Multi-objective problem is usually of the form: Minimize/Maximize $f(x) = (f_1(x), f_2(x), \ldots, f_k(x))$ subject to $g_j(x) \geq 0$, $h_k(x) = 0$, $x^L \leq x \leq x^U$ • Multiple objectives, constraints and decision variables [Figure: decision space and objective space]

Basic terminologies and concepts • Concept of nondominated solutions: – solution a dominates solution b, if • a is no worse than b in all objectives • a is strictly better than b in at least one objective. [Figure: numbered solutions 1 to 5 in the f1–f2 plane, both objectives minimized] • 3 dominates 2 and 4 • 1 does not dominate 3 and 4 • 1 dominates 2
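The two conditions translate directly into code. The sketch below assumes that all objectives are minimized and that a solution is represented by its objective vector; the example vectors are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))        # no worse in all objectives
    strictly_better = any(x < y for x, y in zip(a, b))  # better in at least one
    return no_worse and strictly_better

print(dominates((1, 2), (2, 3)))  # better in both objectives
print(dominates((1, 3), (2, 2)))  # trade-off: neither dominates the other
print(dominates((1, 1), (1, 1)))  # a solution does not dominate itself
```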

Basic terminologies and concepts • Properties of dominance relationship – Reflexive: The dominance relation is not reflexive. • Since solution a does not dominate itself. – Symmetric: The dominance relation is not symmetric. • If a dominates b, it does not follow that b dominates a. • Dominance relation is asymmetric. • Dominance relation is not antisymmetric. – Transitive: The dominance relation is transitive. • If a dominates b and b dominates c, then a dominates c. • If a does not dominate b, it does not mean b dominates a.

Basic terminologies and concepts • Finding Pareto-optimal/non-dominated solutions – Among a set of solutions P, the non-dominated set of solutions P’ are those that are not dominated by any member of the set P. • If the set of solutions considered is the entire feasible objective space, P’ is Pareto optimal. – Different approaches available. They differ in computational complexities. • Naive and slow – Worst time complexity is $O(MN^2)$. • Kung et al. approach – $O(N \log N)$

Basic terminologies and concepts • Kung et al. approach – Step 1: Sort the population according to the descending order of importance in objective 1. • Ascending order for a minimization objective • P = {5, 1, 3, 2, 4} [Figure: numbered solutions 1 to 5 in the f1–f2 plane, both objectives minimized]

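Basic terminologies and concepts • Kung et al. approach on P = {5, 1, 3, 2, 4} – P is split into the top half T = {5, 1, 3} and the bottom half B = {2, 4}. – T is split into {5, 1} and {3}; {5, 1} is split into {5} and {1}, giving Front = {5}; merging with {3} gives Front = {5}. – B is split into {2} and {4}, giving Front = {2, 4}. – Merging the fronts of T and B gives Front(P) = {5}.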
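A recursive sketch in the spirit of this approach is given below. It assumes minimization, sorts by the first objective (ties broken by the remaining objectives), and checks each member of the bottom half's front against the top half's front by direct comparison. That merge is the simple variant, so the sketch illustrates the divide-and-merge idea rather than the stated complexity bound. The function names and example points are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def kung_front(points):
    """Return the non-dominated objective vectors of `points`."""
    ordered = sorted(points)  # ascending in f1, so no later point dominates an earlier one

    def front(p):
        if len(p) <= 1:
            return list(p)
        half = len(p) // 2
        top = front(p[:half])      # T: better half with respect to f1
        bottom = front(p[half:])   # B: worse half with respect to f1
        merged = list(top)
        for candidate in bottom:
            # Keep a member of B only if no member of T dominates it.
            if not any(dominates(t, candidate) for t in top):
                merged.append(candidate)
        return merged

    return front(ordered)

points = [(5, 2), (1, 5), (3, 4), (4, 1), (2, 3)]
print(kung_front(points))
```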

Basic terminologies and concepts • Non-dominated sorting of population – Step 1: Set all non-dominated fronts $P_j$, j = 1, 2, … as empty sets and set non-domination level counter j = 1 – Step 2: Use any one of the approaches to find the non-dominated set P’ of population P. – Step 3: Update $P_j$ = P’ and P = P \ P’. – Step 4: If P ≠ ∅, increment j = j + 1 and go to Step 2. Otherwise, stop and declare all non-dominated fronts $P_i$, i = 1, 2, …, j.
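Steps 1 to 4 can be sketched as below, using the naive pairwise comparison for Step 2 and assuming minimization. Fronts are returned as lists of indices into the population; the function names and example points are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated_sort(points):
    """Peel off non-dominated fronts until the population is empty."""
    remaining = list(range(len(points)))  # P, as indices
    fronts = []                           # P_1, P_2, ...
    while remaining:                      # Step 4: repeat while P is not empty
        # Step 2: members of P not dominated by any other member of P.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining)]
        fronts.append(front)              # Step 3: P_j = P'
        remaining = [i for i in remaining if i not in front]  # P = P \ P'
    return fronts

points = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2), (5, 5)]
print(non_dominated_sort(points))
```

The position of the front that contains a solution (1 for the first front, 2 for the second, and so on) is its non-domination level.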

Basic terminologies and concepts [Figure: five solutions in the f1–f2 plane (both objectives minimized), sorted into Front 1, Front 2 and Front 3]

Basic terminologies and concepts • Pareto optimal fronts (objective space) – For a K objective problem, usually the Pareto front is K-1 dimensional [Figure: Pareto fronts for the Min-Min, Min-Max, Max-Min and Max-Max cases]

Basic terminologies and concepts • Local and Global Pareto optimal front – Local Pareto optimal front: Local dominance check. – Global Pareto optimal front is also a local Pareto optimal front. [Figure: locally Pareto optimal front shown in decision space and objective space]

Basic terminologies and concepts • Ideal point: – Non-existent – Lower bound of the Pareto front. • Nadir point: – Upper bound of the Pareto front. • Normalization of objective vectors: – $f_i^{norm} = (f_i - z_i^{utopia}) / (z_i^{nadir} - z_i^{utopia})$ • Max point: – A vector formed by the maximum objective function values of the entire/part of objective space. – Usually used in evolutionary multi-objective optimization algorithms, as nadir point is difficult to estimate. – Used as an estimate of nadir point and updated as and when new estimates are obtained. [Figure: objective space (Min-Min, axes f1 and f2) showing $z^{ideal}$, $z^{utopia}$ (offset from the ideal point by ε), $z^{nadir}$ and $z^{maximum}$]
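The normalization formula and the max point can be sketched as below. The function names and the numeric values are hypothetical; the example assumes the utopia and nadir vectors are already known.

```python
def normalize(f, z_utopia, z_nadir):
    """Apply f_i_norm = (f_i - z_i_utopia) / (z_i_nadir - z_i_utopia) per objective."""
    return [(fi - zu) / (zn - zu) for fi, zu, zn in zip(f, z_utopia, z_nadir)]

def max_point(population_objectives):
    """Per-objective maximum over a set of objective vectors."""
    return [max(column) for column in zip(*population_objectives)]

print(normalize([2.0, 30.0], z_utopia=[0.0, 10.0], z_nadir=[4.0, 50.0]))
print(max_point([[1.0, 4.0], [3.0, 2.0]]))
```

When the nadir point is unknown, the max point of the current population can be passed as `z_nadir` and refreshed as the population changes.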

Basic terminologies and concepts • What are evolutionary multi-objective optimization algorithms? – Evolutionary algorithms used to solve multi-objective optimization problems. • EMO algorithms use a population of solutions to obtain a diverse set of solutions close to the Pareto optimal front. [Figure: population of solutions in objective space]

Basic terminologies and concepts • EMO is a population based approach – Population evolves to finally converge on to the Pareto front. • Multiple optimal solutions in a single run. • In classical MCDM approaches – Usually multiple runs necessary to obtain a set of Pareto optimal solutions. – Usually problem knowledge is necessary.

Goal in evolutionary multi-objective optimization • Goals in evolutionary multi-objective optimization algorithms – To find a set of solutions as close as possible to the Pareto optimal front. – To find a set of solutions as diverse as possible. – To find a set of satisficing solutions reflecting the decision maker’s preferences. • Satisficing: a decision-making strategy that attempts to meet criteria for adequacy, rather than to identify an optimal solution.

Goal in evolutionary multi-objective optimization [Figure: objective space illustrating convergence and diversity]

Goal in evolutionary multi-objective optimization • Changes to single objective evolutionary algorithms – Fitness computation must be changed – Non-dominated solutions are preferred to maintain the drive towards the Pareto optimal front (attain convergence) – Emphasis to be given to less crowded or isolated solutions to maintain diversity in the population

Goal in evolutionary multi-objective optimization • What are less-crowded solutions? – Crowding can occur in decision space and/or objective space. • Decision space diversity is sometimes needed – For example in engineering design problems, where all solutions would otherwise look the same. [Figure: objective space (Min-Min) and decision space]

Main Issues in evolutionary multi-objective optimization • How to maintain diversity and obtain a diverse set of Pareto optimal solutions? • How to maintain non-dominated solutions? • How to maintain the push towards the Pareto front ? (Achieve convergence)

EMO History • 1984 – VEGA by Schaffer • 1989 – Goldberg suggestion • 1993–95 – Non-Elitist methods – MOGA, NSGA, NPGA • 1998–Present – Elitist methods – NSGA-II, DPGA, SPEA, PAES etc.

Evolutionary multi-objective algorithm design issues

Objectives The objectives of this lecture are to: • Address the design issues of evolutionary multi-objective optimization algorithms – Fitness assignment – Diversity preservation – Elitism • Explore ways to handle Constraints

References • K. Deb. Multi-Objective Optimization using Evolutionary Algorithms. Wiley, Chichester, 2001. • E. Zitzler, M. Laumanns, S. Bleuler. A Tutorial on Evolutionary Multiobjective Optimization, in Metaheuristics for Multiobjective Optimisation, 3–38, Springer-Verlag, 2003.

Algorithm design issues • The approximation of the Pareto front is itself multi-objective. – Convergence: Compute solutions as close as possible to Pareto front quickly. – Diversity: Maximize the diversity of the Pareto solutions. • It is impossible to describe – What a good approximation can be for a Pareto optimal front. – Proximity to the Pareto optimal front.

Fitness assignment • Unlike the single objective case, multiple objectives exist. – Fitness assignment and selection go hand in hand. • Fitness assignment can be classified into the following categories: – Scalarization based • E.g. Weighted sum, MOEA/D – Objective based • VEGA – Dominance based • NSGA-II

Fitness assignment • Scalarization based (Aggregation based): – Aggregate the objective functions to form a single objective. – Vary the parameters in the single objective function to generate multiple Pareto optimal solutions. • Parameters: weights $w_1, \ldots, w_k$ applied to $f_1(x), f_2(x), \ldots, f_k(x)$ – $F = w_1 f_1(x) + w_2 f_2(x) + \ldots + w_k f_k(x)$, or – $F = \max_i (w_i (f_i(x) - z_i))$
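Both aggregations can be sketched in a few lines, assuming minimization. The function names, the candidate objective vectors, the weights and the reference point z are all hypothetical; the loop shows how varying the weights picks different trade-off solutions from a fixed candidate set.

```python
def weighted_sum(f, w):
    """F = w_1 f_1 + w_2 f_2 + ... + w_k f_k."""
    return sum(wi * fi for wi, fi in zip(w, f))

def weighted_max(f, w, z):
    """F = max_i w_i (f_i - z_i), with z a reference point."""
    return max(wi * (fi - zi) for wi, fi, zi in zip(w, f, z))

candidates = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]  # mutually non-dominated vectors
for w in [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]:
    best = min(candidates, key=lambda f: weighted_sum(f, w))
    print(w, "->", best)

print(weighted_max((2.0, 2.0), (0.5, 0.5), (0.0, 0.0)))
```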

Fitness assignment • Advantages – Weighted sum – Easy to understand and implement. – Fitness assignment is computationally efficient. – If the time available is short, it can be used to quickly provide a Pareto optimal solution. • Disadvantages – Weighted sum – Non-convex Pareto optimal fronts cannot be handled.

Fitness assignment • Objective based – Switch between objectives in the selection phase. • Every time an individual is chosen for reproduction, a different objective decides. – E.g. Vector evaluated genetic algorithm (VEGA) proposed by David Schaffer. • First implementation of an evolutionary multi-objective optimization algorithm. • Subpopulations are created and each subpopulation is evaluated with a different objective. [Figure: population → selection by f1, f2 and f3 into the mating pool → reproduction → new population]

Fitness assignment • Advantages – Simple idea and easy to implement. – Simple single objective genetic algorithm can be easily extended to handle multi-objective optimization problems. – Has tendency to produce solutions near the individual best for every objective. • An advantage when this property is desirable. • Disadvantages – Each solution is evaluated only with respect to one objective. • In multi-objective optimization algorithm all solutions are important. – Individuals may be stuck at local optima of individual objectives.

Fitness assignment • Dominance based – Pareto dominance based fitness ranking proposed by Goldberg in 1989. • Different ways – Dominance rank: Number of individuals by which an individual is dominated. • E.g. MOGA, SPEA2 – Dominance depth: The fitness is based on the front to which an individual belongs. • NSGA-II – Dominance count: Number of individuals dominated by an individual. • SPEA2
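Dominance rank and dominance count follow directly from their definitions. The sketch below assumes minimization and uses a hypothetical six-point population; dominance depth is the index of the front obtained by non-dominated sorting and is not repeated here.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def dominance_rank_and_count(points):
    """rank[i]: how many points dominate i; count[i]: how many points i dominates."""
    n = len(points)
    rank = [sum(dominates(points[j], points[i]) for j in range(n))
            for i in range(n)]
    count = [sum(dominates(points[i], points[j]) for j in range(n))
             for i in range(n)]
    return rank, count

points = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2), (5, 5)]
rank, count = dominance_rank_and_count(points)
print(rank)
print(count)
```

A rank of 0 marks a non-dominated individual; every dominating pair is counted once in each list, so the two lists have the same sum.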

Fitness assignment [Figure: example population annotated with dominance rank, dominance count and dominance depth values]

Diversity preservation • Chance of an individual being selected – Increases: Low number of solutions in its neighborhood. – Decreases: High number of solutions in its neighborhood. • There are at least three types: – Kernel methods – Nearest neighbor – Histogram

Diversity preservation • Kernel methods: – Sum of f values, where f is a function of distance. – E.g. NSGA • Nearest neighbor – The perimeter of the cuboid formed by the nearest neighbors as the vertices. – E.g. NSGA-II [Figure: solution i with its nearest neighbors i-1 and i+1]

Diversity preservation • Histogram – Number of elements in a hyperbox. – E.g. PAES

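Elitism • Elitism is needed to preserve the promising solutions [Figure: no-archive strategy, where old population and offspring form the new population, and archive strategy, where archive and offspring form the new archive and the new population]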

Constraint handling • Penalty function approach – For every solution, calculate the overall constraint violation, OCV (sum of constraint violations). – $F_m(x_i) = f_m(x_i) + OCV$ • $x_i$ – solution, $f_m(x_i)$ – mth objective value for $x_i$, $F_m(x_i)$ – overall mth objective value for $x_i$. • OCV is added to each of the objective function values. • Use constraints as additional objectives – Usually used when feasible search space is very narrow.

Constraint handling • Deb’s constraint domination strategy – A solution $x_i$ constraint dominates a solution $x_j$, if any of the following is true: • $x_i$ is feasible and $x_j$ is not. • $x_i$ and $x_j$ are both infeasible, but $x_i$ has a smaller constraint violation. • $x_i$ and $x_j$ are feasible and $x_i$ dominates $x_j$. – Advantages: • Penalty-less approach. • Easy to implement and clearly distinguishes good from bad solutions. • Can handle populations that contain only infeasible solutions. – Disadvantages: • Problem to maintain diversity of solutions. • Slightly infeasible and near optimal solutions are not preferred over feasible solutions far from optima.
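The three conditions can be sketched as a single predicate. The code assumes minimization and takes, for each solution, its objective vector and an overall constraint violation where 0 means feasible; the argument layout and example values are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def constraint_dominates(f_i, cv_i, f_j, cv_j):
    """True if solution i constraint-dominates solution j."""
    if cv_i == 0 and cv_j > 0:    # i is feasible and j is not
        return True
    if cv_i > 0 and cv_j > 0:     # both infeasible: smaller violation wins
        return cv_i < cv_j
    if cv_i == 0 and cv_j == 0:   # both feasible: usual dominance
        return dominates(f_i, f_j)
    return False                  # i is infeasible and j is feasible

print(constraint_dominates((9, 9), 0.0, (1, 1), 0.3))  # feasible beats infeasible
print(constraint_dominates((1, 1), 0.2, (9, 9), 0.5))  # smaller violation
print(constraint_dominates((1, 2), 0.0, (2, 3), 0.0))  # usual dominance
```

The first call also illustrates the listed disadvantage: a feasible solution far from the optimum is preferred over a slightly infeasible one with much better objective values.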

Non-dominated Sorting Genetic Algorithm (NSGA-II)

Objectives The objectives of this lecture are to: • Understand the basic concept and working of NSGA-II • Discuss its advantages and disadvantages

Reference • K. Deb, S. Agarwal, A. Pratap, and T. Meyarivan. A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2): 182–197, 2002.

NSGA-II • The non-dominated sorting genetic algorithm II (NSGA-II) was proposed by Deb et al. in 2000. • The NSGA-II procedure has three features: – It uses an elitist principle. – It emphasizes non-dominated solutions. – It uses an explicit diversity preserving mechanism.

NSGA-II [Figure: NSGA-II procedure with crossover and mutation, illustrated in the f1–f2 objective space]

NSGA-II • Crowding distance – To get an estimate of the density of solutions surrounding a particular solution. • Crowding distance assignment procedure – Step 1: Set l = |F|, where F is a set of solutions in a front. Set $d_i = 0$, i = 1, 2, …, l. – Step 2: For every objective function m = 1, 2, …, M, sort the set in worse order of $f_m$ or find the sorted indices vector: $I^m = sort(f_m)$.

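NSGA-II • Step 3: For m = 1, 2, …, M, assign a large distance to boundary solutions, i.e. set them to ∞, and for all other solutions j = 2 to (l-1), assign as follows: $d_{I_j^m} = d_{I_j^m} + \frac{f_m^{(I_{j+1}^m)} - f_m^{(I_{j-1}^m)}}{f_m^{max} - f_m^{min}}$ [Figure: solution i with its neighbors i-1 and i+1 on the front]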
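Steps 1 to 3 can be sketched as below, using the usual NSGA-II update from Deb et al. (2002): each interior solution accumulates, per objective, the difference between its two sorted neighbors divided by the range of that objective. Sorting in ascending order and skipping objectives whose range is zero are assumptions of this sketch, and the function name and example front are hypothetical.

```python
def crowding_distance(front):
    """Return the crowding distance of each objective vector in `front`."""
    l = len(front)                     # Step 1: l = |F|
    if l == 0:
        return []
    d = [0.0] * l                      # Step 1: d_i = 0
    for m in range(len(front[0])):     # Step 2: for every objective m
        order = sorted(range(l), key=lambda i: front[i][m])  # sorted indices I^m
        f_min, f_max = front[order[0]][m], front[order[-1]][m]
        # Step 3: boundary solutions receive an infinite distance.
        d[order[0]] = d[order[-1]] = float("inf")
        if f_max == f_min:
            continue                   # all values equal: no contribution
        for pos in range(1, l - 1):
            gap = front[order[pos + 1]][m] - front[order[pos - 1]][m]
            d[order[pos]] += gap / (f_max - f_min)
    return d

front = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
print(crowding_distance(front))
```

A larger value means that the solution lies in a less crowded region of the front.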

NSGA-II • Crowded tournament selection operator – A solution $x_i$ wins a tournament with another solution $x_j$ if any of the following conditions are true: • If solution $x_i$ has a better rank, that is, $r_i < r_j$. • If they have the same rank but solution $x_i$ has a better crowding distance than solution $x_j$, that is, $r_i = r_j$ and $d_i > d_j$.
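The two conditions can be sketched as a comparison of two indices, given precomputed ranks (smaller is better) and crowding distances (larger is better). Returning the first solution when both values are tied is an assumption of this sketch; the function name and example values are hypothetical.

```python
def crowded_tournament_winner(i, j, rank, distance):
    """Return the index (i or j) that wins the crowded tournament."""
    if rank[i] < rank[j]:          # better non-domination rank wins
        return i
    if rank[j] < rank[i]:
        return j
    if distance[i] > distance[j]:  # same rank: less crowded solution wins
        return i
    if distance[j] > distance[i]:
        return j
    return i                       # complete tie: keep the first (assumption)

rank = [1, 1, 2]
distance = [0.4, 1.7, float("inf")]
print(crowded_tournament_winner(0, 2, rank, distance))  # rank decides
print(crowded_tournament_winner(0, 1, rank, distance))  # crowding distance decides
```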

NSGA-II • Advantages: – Explicit diversity preservation mechanism – Overall complexity of NSGA-II is at most $O(MN^2)$ – Elitism does not allow an already found Pareto optimal solution to be deleted. • Disadvantages: – Crowded comparison can restrict the convergence. – Non-dominated sorting is performed on a population of size 2N.