2.0 The Core Architecture and Terminology
A successful Genetic Algorithm implementation requires a solid understanding of its fundamental structure and the specific terminology used to describe its components. This section lays out the universal blueprint that governs the lifecycle of a GA, providing the foundational knowledge needed to design and build an effective algorithm.
Basic Terminology
- Population: A subset of all possible encoded solutions to the problem. It is a collection of candidate solutions, analogous to a population of organisms in nature.
- Chromosome: A single candidate solution to the given problem.
- Gene: A single element or position within a chromosome.
- Allele: The specific value that a gene takes for a particular chromosome.
- Genotype vs. Phenotype:
- Genotype: The encoded representation of a single chromosome in the computational space (e.g., a bit string), which is manipulated by the genetic operators.
- Phenotype: The representation of the solution in the actual, real-world problem space.
- Encoding transforms a solution from the phenotype to the genotype space, while Decoding transforms it back. For example, in the 0/1 Knapsack problem, the phenotype might be a list of item numbers to pick. The genotype, however, is a binary string where a 1 at a specific position indicates an item is picked and a 0 indicates it is not.
- Fitness Function: A function that takes a candidate solution as input and returns a quantitative measure of its suitability or “fitness.” This value determines how good the solution is.
- Genetic Operators: The mechanisms that alter the genetic composition of offspring, such as selection, crossover, and mutation.
Basic Structure
A Genetic Algorithm begins with an initial population of solutions, which can be generated randomly or “seeded” with solutions from other heuristics. From this population, parents are selected based on their fitness. These parents then undergo crossover and mutation to produce new offspring. Finally, these offspring replace existing individuals in the population, and the cycle repeats. This process is designed to mimic natural evolution, iteratively improving the quality of solutions over generations.
Generalized GA Pseudo-code
GA()
initialize population
find fitness of population
while (termination criteria is not reached) do
parent selection
crossover with probability pc
mutation with probability pm
decode and fitness calculation
survivor selection
find best
return best
The following sections will deconstruct each step of this lifecycle, exploring the critical design choices required for a practical and effective implementation.