2. The Mechanics of a Genetic Algorithm
The successful implementation of a GA depends on several critical design choices and mechanical components that guide the evolutionary process.
Genotype Representation
The method used to represent solutions is one of the most important decisions, as an improper representation can lead to poor performance. Representation is highly problem-specific.
- Binary Representation: One of the simplest and most common methods, where the genotype is a string of bits. It is a natural fit for problems with Boolean decision variables, like the 0/1 Knapsack problem. For numerical problems, numbers can be converted to binary, though Gray Coding is sometimes used to mitigate issues where a single bit flip causes a massive change in value.
- Real-Valued Representation: Used for problems with continuous variables, where genes are represented as floating-point numbers.
- Integer Representation: Suitable for problems with discrete-valued genes that are not simply binary. For example, directions {North, South, East, West} could be encoded as {0, 1, 2, 3}.
- Permutation Representation: The ideal choice for problems where the solution is defined by an order of elements, such as the Travelling Salesperson Problem (TSP).
Population Management
- Initialization: The initial population can be generated in two primary ways:
- Random Initialization: The population is filled with completely random solutions. This is crucial for diversity.
- Heuristic Initialization: The population is “seeded” with a few good solutions derived from a known heuristic for the problem. The remainder of the population is filled randomly, as it has been observed that random solutions are the primary drivers toward optimality.
- Population Models:
- Steady State (Incremental GA): In each iteration, one or two new offspring are created and replace one or two individuals in the existing population.
- Generational: In each iteration, a full cohort of n offspring is created (where n is the population size), and this new generation completely replaces the old one.
The Fitness Function
The fitness function evaluates how “good” a candidate solution is. Its design is critical to the GA’s performance. It should possess the following characteristics:
- Sufficiently fast to compute, as it is called repeatedly.
- Quantitatively measure the fitness of a given solution.
In many cases, the fitness function is identical to the problem’s objective function. For complex problems with multiple objectives or constraints, a designer may create a more nuanced fitness function.
Parent Selection Strategies
This process selects which individuals will reproduce. It is crucial for the GA’s convergence rate, but care must be taken to maintain diversity and avoid “premature convergence,” where one extremely fit solution dominates the population.
- Fitness Proportionate Selection: Each individual’s probability of being selected is proportional to its fitness. Fitter individuals have a higher chance of being chosen. This method does not work for negative fitness values.
- Roulette Wheel Selection: A conceptual wheel is divided into slices, with the size of each slice proportional to an individual’s fitness. The wheel is spun to select a parent.
- Stochastic Universal Sampling (SUS): An improvement on the roulette wheel, using multiple fixed points to select all parents in a single spin, which encourages highly fit individuals to be chosen at least once.
- Tournament Selection: A number of individuals (K) are chosen at random from the population, and the fittest among them is selected as a parent. This method can work with negative fitness values.
- Rank Selection: Individuals are ranked according to their fitness. Parent selection is based on rank, not the raw fitness value. This is useful when fitness values are very close, as it maintains selection pressure. It also works with negative fitness values.
- Random Selection: Parents are selected randomly without regard to fitness. This strategy is usually avoided as it applies no selection pressure.
Crossover (Recombination)
Analogous to biological reproduction, crossover combines the genetic material of two or more parents to create one or more offspring. It is typically applied with a high probability, pc.
- One-Point Crossover: A single random crossover point is chosen, and the tails of the two parent chromosomes are swapped.
- Multi-Point Crossover: A generalization where alternating segments between multiple crossover points are swapped.
- Uniform Crossover: Each gene is treated separately. A coin is flipped for each gene to decide which parent it will be inherited from.
- Whole Arithmetic Recombination: Used for integer representations, this creates children by taking a weighted average of the two parents.
- Davis’ Order Crossover (OX1): A specialized operator for permutation-based representations (like in TSP) designed to transmit information about the relative ordering of genes.
Mutation (Exploration)
Mutation is a small, random alteration in a chromosome used to introduce and maintain genetic diversity in the population. It is the primary operator for exploring the search space and is considered essential for convergence. It is applied with a low probability, pm, to avoid devolving into a pure random search.
- Bit Flip Mutation: For binary representations, one or more random bits are flipped from 0 to 1 or vice versa.
- Random Resetting: For integer representations, a randomly chosen gene is assigned a new random value from the set of permissible values.
- Swap Mutation: For permutation representations, two positions on the chromosome are chosen at random, and their values are interchanged.
- Scramble Mutation: A subset of genes is chosen, and their values are shuffled randomly.
- Inversion Mutation: A subset of genes is selected, and the sequence within that subset is inverted.
Survivor Selection and Termination
- Survivor Selection: This policy determines which individuals (parents or offspring) are kept for the next generation.
- Elitism: A common strategy where the fittest member of the current population is guaranteed to be propagated to the next generation.
- Age-Based Selection: An individual is removed from the population after a fixed number of generations, regardless of its fitness.
- Fitness-Based Selection: The newly generated children replace the least fit members of the population.
- Termination Criteria: The condition that stops the GA’s run is crucial. Common conditions include:
- No improvement in the population’s best fitness for a set number of generations (X).
- Reaching a predefined absolute number of generations.
- The objective function value reaching a specific target.