JGAP Default Initialisation Configuration

I use JGAP as part of my Crunch clustering tool. For my thesis writeup, I needed to describe the specific parameters to the genetic algorithm it used. This post details how JGAP starts up, and where to find the defaults, what they are and what the values actually mean. These values are probably specific to the version I’m looking at (3.62), although they haven’t changed in roughly 4 years at the time of writing.

Startup

The search parameters are contained in a Configuration object. The configuration is initialised with various parameters and the type of chromosome to produce. The Genotype represents the population, and exposes the GA component to the user (via the evolve and getFittestIndividual methods).

Genotype

Genotypes can be bootstrapped from a Configuration via the randomInitialGenotype method. This constructs the population (which uses randomInitialChromosome to produce random individuals) and returns the initialised Genotype object.

Genotype.evolve()

evolve has three overloaded methods, the two that take arguments internally call the no-argument version.

The evolve method simply obtains an IBreeder object from the Configuration and uses it to construct a new Population, before setting the current population to the newly constructed version.

IBreeder

IBreeders coordinate the evolution step, however they do not actually perform or store any of the genetic operators. These are kept in the Configuration object’s m_geneticOperators field.

GABreeder

The default Breeder used is the GABreeder. The following stages are applied to the population:

Fitness Evaluation

The fitness of each individual in the population is first calculated, using the bulk fitness function, or via updateChromosomes if one isn’t provided.

Natural Selectors (Before Genetic Operators)

Each of the NaturalSelectors registered for application before the genetic operator chain runs is run on the population. The default configuration does not set any of these selectors.

Genetic Operators

Each of the genetic operators registered in the configuration is run, one at a time, in order of addition to the Configuration. Implementation for this step appears in BreederBase.applyGeneticOperators. The genetic operators implement an operate method, which runs over the population. These may only add to the population, however; an operator should never modify an individual in the population. A List of IChromosomes is passed as well as a Population, although this is just a reference to the Population‘s internal list.

DefaultConfiguration uses a Crossover operator, followed by a Mutation operator. Parameters to these are given at the end of this post.

Evaluation

The population is then re-evaluated, as it contains new individuals not present before the first round of selection.

Natural Selectors (After Genetic Operators)

Finally, the new population is produced by applying a NaturalSelector to it. If the selector selects too few individuals, JGAP can optionally fill the remainder with random individuals – change MinimumPopSizePercent to turn this on, but it’s off by default.

The Default JGAP Configuration

DefaultConfiguration sets up the following operator chain:

Genetic Operators

CrossoverOperator:

Rate: 35%, which translates to populationSize * 0.35 crossover operations per generation. Each crossover produces 2 individuals. Crossover point is random.

MutationOperator:

Desired Rate: 12. The rate for this one is a little different; mutation occurs if a random integer between 0 and the rate (12) is 0, thus it equates to a probability of 1/12. This operation is applied to each gene in each element of the population (excluding those produced by crossover), so the rough likelihood of a mutation being applied is (1/rate) * popSize * chromosomeSize. For each gene that is to be mutated, a copy of the whole IChromosome that contains it is made, with the selected Gene mutated via its applyMutation method.

Natural Selector

One selector is defined by default, which is the BestChromosomesSelector. This selector is given the following parameters:

Original Rate: 0.9. 90% of the population size allotted to this selector will actually be used. The selector is elitist, so this will return the top populationSize * 0.9 elements. The remaining 10% is filled by cloning the selected elements, one by one, in order of fitness until the population size reaches the limit.

Doublette Chromosomes Allowed? Yes. Doublettes are equivalent chromosomes (by equals()).

Configuration Parameters

The configuration specifies that the pool of selectors will select 100% of the population size. However, the only selector is configured to produce only 90% of the population size each time.

keepPopulationSizeConstant is also set to true, which dictates that the population is trimmed to size at each iteration. Somewhat counter-intuitively, this does not cause the population size to be increased if it falls below the user supplied value; it doesn’t enforce a minimum, only a maximum

minimumPopSizePercent is set to zero, so random individuals will never be used to fill the population if it declines below the limit (won’t ever happen under these conditions).

So What is the Default?

The default configuration is an elitist ranking selector that clones the top 90% of the user-specified population size, repeating it to fill the rest. Crossover is random point with a rate of populationSize * 0.35. Mutation is random and applied at  to 1 in 12 genes in the whole population (i.e. the rate is dictated by chromosome size * population size / 12).

Advertisements

Tracing Java Method Execution with AspectJ

ApectJ is a project to bring aspect oriented programming to Java.  Sidestepping the whole issue around whether or not aspect-oriented programming is a good idea; it can be used to insert code at points you define and is a very useful tool for dynamic program analysis.

Some terminology needs explaining here before we go further.  AspectJ defines a point in the source to stop execution and pass it to your code as a “Pointcut”, the actual code executed on this pause is termed “advice”.  Although I use the word “pause” it’s not strictly correct; AspectJ “weaves” the advice code for matching pointcuts at compile time. Additionally (and perhaps most significantly) AspectJ can weave on class load time (“Load Time Weaving”), this is what we’ll be using.

So we have a way of inserting code in places in an existing code base, what can we do with it?  A lot of horrible things (playing around with returned values and parameters to name a few); but for now we’ll stick to what we want to do, tracing method calls.

The most obvious way to instrument a program is to insert calls to some logging library at the top of every method.  Although this works it takes a lot of time, either manually logging everything or figuring out how to use a regex/parser to do it for you.  With AspectJ we can simply weave code that logs the current method at every method execution point.

So how do we do that?  First we need to get AspectJ working; Eclipse will do that for you, alternatively you can do everything from the command line (assuming you have AspectJ installed, including the ajc command on your PATH).

We’ll define an Aspect that defines a pointcut on every method execution, as well as some advice to run when they turn up when the code is executed.

package aspects;

import java.util.logging.Level;
import java.util.logging.Logger;

import org.aspectj.lang.Signature;

aspect Trace{


	pointcut traceMethods() : (execution(* *(..))&& !cflow(within(Trace)));

	before(): traceMethods(){
		Signature sig = thisJoinPointStaticPart.getSignature();
		String line =""+ thisJoinPointStaticPart.getSourceLocation().getLine();
		String sourceName = thisJoinPointStaticPart.getSourceLocation().getWithinType().getCanonicalName();
		Logger.getLogger("Tracing").log(
                Level.INFO, 
                "Call from "
                	+  sourceName
                    +" line " +
                    line
                    +" to " +sig.getDeclaringTypeName() + "." + sig.getName()
		);
	}

}

So what exactly does this do? Firstly, the pointcut traceMethods() defines a new pointcut called traceMethods. This pointcut matches execution of every method in every class, as long as the control flow isn’t in the current class (Trace). The latter constraint is to stop an infinite loop occurring.

The before(): part of the class defines advice. This is the code that gets inserted just before the execution of the method. Advice can also be given after a pointcut is hit (using the after keyword instead). Our “advice” doesn’t modify execution flow, it just logs some information about the state of the program when the pointcut was hit, but it could quite easily start modifying control flow.

Save this file as Trace.java and compile it using: ajc -outxml -outjar aspects.jar Trace.java.

This compiles the aspect and puts it into a jar ready for use. The -outxml parameter will cause ajc to automatically generate an aop.xml file and save it in the jar’s META-INF directory. The load time weaving agent will read this to determine which aspects to weave with the classes it loads. The aop.xml file can do more than that, it can be used to set namespaces to ignore when weaving (such as java.*) to avoid mucking with the control flow of libraries.

So now we have our aspects.jar how do we use it? The AspectJ weaving agent must be available somewhere on the filesystem (aspectjweaver.jar) and your aspect.jar (and the target application) must be on the classpath.

Once that’s sorted to trace the application, use

java -javaagent:<path to aspectjweaver.jar> -cp <path to aspects.jar>:<path to target jar/folder> <name of main class to run>

Hopefully that will run and you should see a large amount of console output:

INFO: Call from main.RunFile line 206 to main.RunFile.main
Mar 31, 2011 2:52:53 PM aspects.Trace ajc$before$aspects_Trace$1$b314f86e
INFO: Call from main.RunFile line 186 to main.RunFile.runList
Mar 31, 2011 2:52:53 PM main.RunFile main
INFO: Starting clustering of 0 files
Mar 31, 2011 2:52:53 PM aspects.Trace ajc$before$aspects_Trace$1$b314f86e
INFO: Call from main.ExperimentRunner line 59 to main.ExperimentRunner.runExperiments

AspectJ lets you do far more than this, it’s an understatement to say this is all it can be used for. Using LTW does make for some interesting possibilities, one obvious use would be monkey patching existing projects to fix bugs where source isn’t available (and the license permits it, of course).

AspectJ is an interesting piece of work; the official project page has far more resources on the types of pointcuts you can define and the more nitty-gritty details of making it mess with the traced program.