Genetic
   Programming
     Engine

A Genetic Programming Framework for .NET
 
ABOUT
 

Sections

About Genetic Programming

Genetic Programming is an Evolutionary Algorithms technique that uses computer programs to solve problems. Due to this use of computer programs, Genetic Programming is very flexible and can be applied to practically any problem. However, the ideas behind Genetic Programming are relatively simple.

The Genetic Programming Algorithm

The basic algorithm used by Genetic Programming is illustrated below.

The Genetic Programming Algorithm
Illustration of the Genetic Programming Algorithm
Population. We begin with a Population of random computer programs. These are the Individuals, the potential solutions to our problem.

Testing. We take this Population and run each Individual in it through a test in an Environment. The results of this test are fitness values which indicate how well each Individual performed.

Selection. Using these fitness values, we select the high-performance Individuals from the current population. These Individuals will become the basis for our next Population.

Breeding. From our selected subset of the Population, we choose some number of parent Individuals. Portions of these Individuals are then recombined and possibly mutated. The resulting children are placed in the next Population and the algorithm continues.

Fitness. The key to the success of the algorithm is the fitness function. The algorithm maintains no problem-specific information about the Individuals or the Environment. All of its knowledge regarding a Population is contained in the fitness values of the Individuals.

The fitness function measures how close an Individual is to a solution. Using these values, the algorithm can differentiate between individuals during the selection process. This creates a bias towards Individuals with better fitness, which is how the algorithm improves the overall fitness of each successive Population.

The Population from one cycle of this algorithm is called a Generation, and the processing of one or more Generations is called a run. A run continues until either 1) a solution is found or 2) another stopping criterion (such as a Generation limit) is met.

Other resources

About the Genetic Programming Engine

The heart of the Genetic Programming Engine project is the GPEngine class. This class is the engine and is responsible for implementing the Genetic Programming Algorithm. It interacts with user-defined problems through three interfaces: IIndividual, IEnvironment, and IHistory.

IIndividual. This interface represents an Individual, as described in The Genetic Programming Algorithm section above. It includes standard methods used to test the Individual, the most important being the Test method. This method is overwritten by the engine with whatever combination of operations is generated for the Individual during a run.

The engine gets the operations for use in the Test method from the Base Class defined for a problem. The Base Class is a class created by the user that implements the IIndividual interface, along with any problem-specific operations available to Individuals for solving the problem. All Individuals created by the engine derive from this Base Class.

For example, in the case of the Artificial Ant problem, the Base Class (AntIndividual) defines four public methods available for derived Individuals: TurnLeft, TurnRight, MoveForward, and IsFoodAhead. As the engine runs, it will recombine these operations and place them in the Test method, creating programs like:

if( IsFoodAhead() ) {
   MoveForward();
}
else {
   TurnLeft();
}

IEnvironment. This interface represents an Environment, as mentioned in the Testing sub-section of The Genetic Programming Algorithm section above. It includes standard methods used for initializing and cleaning up after a test of an Individual.

After the Test method has been defined for each Individual in the current Population, that Population will be compiled and then tested in one or more Environments. Each Environment used by the engine is a problem-specific class that implements the IEnvironment interface. This class defines the structure of the problem being worked on. The engine also passes an initialization file to the Environment when it is being initialized. This file contains the data for one test.

For example, in the case of the Artificial Ant problem, the Environment (AntEnvironment) defines the structure of the problem as a toroidal (wrap-around) grid. The various initialization files for the Environment contain information on different grids, each defining its own size, number of food cells, and placement of those cells on the grid.

IHistory. This interface represents a record of the actions taken by an Individual in an Environment during a test. It defines one main member, the property RawFitness, which acts as the fitness function for the problem. There is a second interface, IGraphicHistory, which adds display functionality to the regular IHistory interface. This interface is used by GUIs created for the engine, not by the engine itself.

After the Individual is tested in the Environment, the History is saved and the next test set up. Once all Individuals in a Population have been tested, the Individual and Environment instances are disposed, leaving only the Histories. The RawFitness values from these Histories and their associated Individuals are then put through the selection and breeding processes to create the next Population.

For example, in the case of the Artificial Ant problem, the History (AntHistory) uses a simple RawFitness calculation where the number of food cells found by the Individual is used as its fitness. Here the goal is to maximize the amount of food collected by the Individual. Other problems, such as mapping a mathematical function to a set of data points, would use a more complicated fitness function, like the sum of the standard errors associated with each data point. Here the goal would be to minimize error.

Given classes implementing these three interfaces, the engine can be run on any problem.

About the Project

The idea for this project began back in the fall of 2004 when I read an article on Genetic Programming in .NET. This was the first I had heard about Genetic Programming and I found the concept fascinating. Still, the actual implementation was, by his own admission, a bit of a hack job (no offence :). So, ever the optimistic programmer, I asserted that I could do better ;)

Even so, it might have remained just an idea if it weren't for a new class offering by one of my favorite professors on Evolutionary Algorithms. Although the class' main focus was on Genetic Algorithms and their use in solving various problems, I thought I'd try my hand at running the problems in GP. So, with this bit of practical encouragement I set to work on “the prototype”.

Of course, having only a few weeks to work on it before the first assignment was due, this implementation was also a bit of a hack job (which is why it became “the prototype” and not the full-fledged program ;). However, when next semester rolled around and it came time to pick a project for my Software Engineering lab, I knew what I wanted to do. All of a sudden a team was created and we set to work.

And here we are, several months later: tired, but satisfied by what we've managed to accomplish in this first version of the project. So have a look, and maybe help out while you're at it ;)

~ Paul A Hansen, May 2005