Computing and Information Systems
Volume 3 Number 1

Evolution as Learning

Colin Fyfe

We investigate a recently developed abstraction of Genetic Algorithms (GAs) in which the population in any epoch is represented by a single vector whose elements are the probabilities of the corresponding bit positions being equal to 1. The process of evolution is represented by learning the elements of the probability vector. In this paper we extend this to the representation of homeotic genes, which are environmentally driven and turn other genes on and off. We use this method to simultaneously optimise conflicting criteria within a single population.


Modern Artificial Intelligence (AI) is composed of several strands which have the common properties of being distributed and adaptive. The aim is to create robust software solutions to those problems which traditional AI has found impossible. This common ground within current AI research has led some researchers to attempt to find a common underlying rationale to the various methods. We investigate a method which codes an evolutionary algorithm in terms of the Artificial Neural Network (ANN) method of competitive learning.

Genetic algorithms are widely used in function approximation problems. Much attention has recently been given to the problem of learning solutions in non-stationary environments (e.g. [4, 8]). It is usually found that the rate of change in the environment determines the success of the algorithm in finding optimal solutions: the less stable the environment, the poorer the optimisation results. GAs which use diploid genes have shown some success in problems whose parameters do not change too quickly.

In addition, there is the problem of multi-objective optimisation when such optimisation involves conflicting criteria. This type of problem is of growing importance in e.g. classifier systems which require a set of rules which must be collectively optimised to perform a task. It is well known that, with a finite population, the simple GA settles on a single optimum. Strategies such as fitness sharing, mating restrictions and progressively setting preferences have been shown to be effective but each requires some form of global information outwith the confines of the simple GA.

We review the Structured Genetic Algorithm (SGA), which was shown to substantially outperform the simple GA in tasks where the environment changed abruptly from one steady state to another. We apply the new method of Population-based Incremental Learning (PBIL) to the SGA and derive an algorithm which will learn optimal solutions to two different problems even when the problems are presented to the algorithm in a random manner, i.e. with no steady-state period of learning one problem followed by a period of learning the other. We further extend the method to deal with multi-modal function approximation using standard problems.

Our extension is based on applying standard ANN methods to the PBIL algorithm.


A recent innovation in genetically-motivated optimisation algorithms is the Population-based Incremental Learning (PBIL) [1] algorithm which abstracts out the GA operations of crossover and mutation and yet retains the stochastic search elements of the GA. The algorithm is as follows.

The process is initialised with a probability vector each of whose elements is 0.5 and is terminated when each element of the vector approaches 1 or 0. The elements of the probability vector are updated using a supervised learning method

Δp_i = η (E_best(chromosome_i) − p_i)

where E_best(chromosome_i) is the mean value of the ith chromosome bit taken over the fittest chromosome(s) and p_i is the probability of a 1 in position i in the current generation. η is a learning rate. We note that this representation of evolution uses a learning rule for the probability vector which is very similar to the competitive learning rules of Artificial Neural Networks.

Therefore to transmit information from generation to generation we have a single vector - the probability vector - from which we can generate instances (chromosomes) from the appropriate distribution. Notice that

Δp_i → 0  ⇒  p_i → E_best(chromosome_i)

Therefore if all the fittest chromosomes in each generation have a 1 (0) in the ith position, the probability vector will also tend to have a 1 (0) in that position.
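The PBIL loop just described can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the authors' code: the fitness function, population size and learning rate are placeholders.

```python
import random

def pbil(fitness, n_bits, pop_size=50, eta=0.1, generations=500):
    """Population-Based Incremental Learning: the population is
    abstracted into a single probability vector from which each
    generation's chromosomes are sampled."""
    p = [0.5] * n_bits                      # start with maximum uncertainty
    for _ in range(generations):
        # sample a population of chromosomes from the current vector
        pop = [[1 if random.random() < p[i] else 0 for i in range(n_bits)]
               for _ in range(pop_size)]
        best = max(pop, key=fitness)        # fittest chromosome this generation
        for i in range(n_bits):             # move p towards the best bits
            p[i] += eta * (best[i] - p[i])
    return p
```

With fitness = sum (the OneMax problem), every element of the returned vector tends towards 1.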


A number of extensions to the basic GA have been proposed; they are often based on the fact that real biological material has great redundancy within it - it has been estimated that only a small fraction of genetic material is really used in creating a new individual (a figure of 80% of all genetic material being junk has been quoted). Also, biologists have long realised that as cells differentiate they switch some genes on and others off, making it possible for a single fertilised egg to self-organise to create a new bird, bee or baboon. A gene which switches other genes off and on is known as a homeotic gene. We will be most interested in environmentally-driven homeotic genes which are known to occur in nature and determine such major phenotypical properties as the organism's sex or number of wings [2].

The Structured Genetic Algorithm [3] envisaged this structure as a two-layered genetic algorithm such as shown in Figure 1.

Genes at either level can be either active or passive; high level genes either activate or deactivate the sets of lower level genes i.e. only if gene a1 is active will bits (a11 a12 a13) determine a phenotype. Therefore a single change in a gene at a high level represents multiple changes at the second level in terms of genes which are active. Genes which are not active (passive genes) do not, however, disappear since they remain in the chromosome structure and are carried invisibly to subsequent generations with the individual's string of genes. Therefore a single change at the top level of the network can create a huge difference in the expressed phenotype whereas with the simple GA we would require a number of changes in the chromosome to create the same effect. As the number of changes required increases, the probability of such a set of changes becomes vanishingly small. Note one drawback of the algorithm is that we must ensure that one and only one of the top level genes is ``on'' at any one time.

Figure 1: The Structured GA


The knapsack problem is a well-known test for optimisation algorithms. Consider a knapsack which can hold at most W kg. We wish to load the knapsack with objects so as to maximise their total value subject to this weight constraint. Then, letting v_i be the value and w_i the weight of the ith object, we wish to maximise Σ v_i subject to the constraint Σ w_i ≤ W, where the sums are taken over the selected objects.

In the simple knapsack problem, we are only allowed one of each type of object and so we can model this as a binary string which has value 1 in the ith bit if object i is selected for the knapsack and 0 otherwise. An example 7-object knapsack problem is shown in Table 1.

Object   Value   Weight   Best vector (W=50)
2        20      10       0
4        37      19       1
5        7       4        0
6        5       3        0
7        10      6        0
Total    188     93       2
Table 1: The weights and values of 7 objects and the optimal objects held when the total weight is limited to 50kg

The knapsack problem is characterised by having optima which are often far apart in Hamming distance and so is a difficult problem for calculus-based methods.

1 The figure of 93kg was chosen as such a weight enables all objects to be placed in the knapsack, i.e. the solution completely dominates the solution for the 50kg criterion.


The evaluation of fitness in this problem is extremely simple: if the weight constraint has been violated we give the chromosome zero fitness; otherwise the fitness value of the chromosome is the value of the objects which it has loaded into the knapsack. Many knapsack problems have isolated peaks: e.g. the string 1001000 shown above has fitness (value) 107, whereas the next best string 1100011 has fitness 105. Notice that the Hamming distance between these strings is 4, over half the maximum distance across the space.
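The fitness evaluation just described is direct to code; the object values, weights and limit below are illustrative, not those of Table 1:

```python
def knapsack_fitness(bits, values, weights, limit):
    """Zero fitness if the weight limit is violated; otherwise
    the total value of the objects selected by the bit string."""
    total_weight = sum(w for b, w in zip(bits, weights) if b)
    if total_weight > limit:
        return 0
    return sum(v for b, v in zip(bits, values) if b)
```

For example, with values [10, 20, 30], weights [5, 5, 5] and a limit of 10, the string 101 is feasible while 111 is not.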

The Structured Genetic Algorithm has been shown to perform well [4] in dynamic situations, i.e. when the environment is changing within the course of an experiment.

For example, in the knapsack problem let us allow the simulation to converge, so that the population has settled on some solution which maximises value under the constraint that the weight in the knapsack does not exceed 50kg. Now change the constraint so that individuals must optimise their value under the constraint that the weight does not exceed 93kg¹. If we allow such changes to happen regularly, the simple GA will not perform well since the individuals of the population have been optimised for one environment and will not perform well in a different one. The Structured GA, on the other hand, retains enough genetic diversity within its string to converge quickly to optimal values. Further, if we set up an experiment so that the environment changes systematically and regularly, the structured GA will quickly adapt to these conditions: even the phenotypes which are fittest with respect to the 50kg constraint may retain (as passive genes) genetic material which would make their offspring fit with respect to the 93kg constraint, and vice versa; recall that these genes can be switched on with a single change at the top level.


The PBIL algorithm provides us with a method for adjusting the elements of the probability vector corresponding to a simple GA. We augment the method to mirror the SGA structure.

Let us consider a vector in which element p_0 corresponds to the homeotic gene. We treat it as a simple switch with a stochastic element: p_0 is the probability that the phenotype corresponding to gene 1 (in bit positions p_1 to p_m) is expressed, and 1 − p_0 is the probability that the phenotype corresponding to gene 2 (in bit positions p_m+1 to p_2m) is expressed. We then generate a random number from the uniform distribution on [0, 1]; if it is less than p_0 we evaluate the fitness of the gene in positions p_1 to p_m, otherwise we evaluate the fitness of the gene in positions p_m+1 to p_2m. The learning rules are

Δp_0 = η_0 (E_best(chromosome_0) − p_0)

Δp_i = η (E_best(chromosome_i) − p_i) × E_best(chromosome_0),   1 ≤ i ≤ m

Δp_i = η (E_best(chromosome_i) − p_i) × (1 − E_best(chromosome_0)),   m+1 ≤ i ≤ 2m

where η and η_0 are small learning rates which can be adjusted during the course of the simulation. The last two equations are similar to the PBIL equation. The sole difference is that each change in the probabilities is gated by the expected value that the appropriate gene's phenotype was expressed during that generation.

Notice that the fact that different genes are using different rules is not implausible from a biological perspective; in particular homeotic genes are believed to be relatively more stable in most cases than other expressed genes since their effect can be so major.

The first equation changes the homeotic gene's probability. Notice that unless this probability reaches 0 or 1, either of the two "lower-level" genes has a non-zero probability of being expressed. It is simple to ensure that this is always the case by putting bounds on the maximum and minimum values of p_0; however, it has been found that a better solution is to use a lower learning rate η_0 for this gene, which allows it to track the environment but at a slower rate.
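A single generation of this structured PBIL might be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function and parameter names are ours, and for simplicity E_best is taken over the single best individual.

```python
import random

def spbil_generation(p, m, fitness_a, fitness_b, pop_size=100,
                     eta=0.1, eta0=0.01):
    """One generation of the structured PBIL.  p[0] is the homeotic
    switch; p[1:m+1] encodes gene 1 and p[m+1:2m+1] encodes gene 2.
    Only the best individual updates the probability vector."""
    best, best_fit = None, None
    for _ in range(pop_size):
        bits = [1 if random.random() < q else 0 for q in p]
        if bits[0]:                          # homeotic bit selects which
            f = fitness_a(bits[1:m + 1])     # lower-level gene is expressed
        else:
            f = fitness_b(bits[m + 1:2 * m + 1])
        if best_fit is None or f > best_fit:
            best, best_fit = bits, f
    e0 = best[0]           # E_best(chromosome_0) over a single winner
    p[0] += eta0 * (e0 - p[0])
    for i in range(1, m + 1):                # update gated by expression
        p[i] += eta * (best[i] - p[i]) * e0
    for i in range(m + 1, 2 * m + 1):
        p[i] += eta * (best[i] - p[i]) * (1 - e0)
    return best_fit
```

Note that the lower learning rate eta0 on the homeotic bit implements the slower tracking discussed above.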


The SPBIL can be shown, like the SGA, to be valuable in dynamic environments. We give an example of its use in optimising two related problems consecutively. Since the problems are related we expect interference between the solutions. We use the same problem as above (with constraints 50kg and 93kg) but at each generation evaluate the convergence of the algorithm towards optimal solutions with respect to one or other criterion. Our experiment created 100 individuals from the probability vector in each generation and used only the best single individual in each generation to modify the probability vector. The learning rates were η = 0.1 and η_0 = 0.01, as defined above.

The average fitness of the population is shown in Figure 2. A comparison with the single-chromosome PBIL algorithm shows that the SPBIL algorithm outperforms it on this task. We can see the reason for this by examining the best realised chromosome in each generation: an optimal chromosome is present in every generation (see Figure 3).

Figure 2: Average fitness of the population over 1000 generations.

Figure 3: Fitness of the best gene over 1000 generations.

SIMULTANEOUS OPTIMISATION

A further advantage of the SPBIL is that we can perform simultaneous optimisation of more than one criterion: let us attempt to simultaneously optimise the knapsack value under the pair of criteria used in the last section by choosing randomly in each generation which of the criteria to use in fitness evaluation.

We show in Figure 4 the convergence of the first two bit positions in each of gene 1 and gene 2 towards their final value of 1. We can clearly see the interference between the two criteria's respective populations. This time each niche population is formed at the same time as the other: the weight criterion for which the value was optimised was selected randomly in each generation from the possible values. This corresponds to a dynamically changing environment in which the measure of success is chosen randomly. We see here a simultaneous optimisation of both criteria.

Figure 4: Probabilities of the first bit position of each of the realisable genes. Typically gene 1 is optimised first while the second finds a niche when the first is too specialised.

The number of individuals in each generation who reached the maximal evaluation for the same simulations is shown in Figure 5. Notice that the speed of convergence of the algorithm can be adjusted by altering both the learning rate and the number of fittest individuals in each generation which are used to amend the probability vector.


We now extend the use of the SPBIL to a representative of the set of problems with multiple competing criteria. We use as an example the first of Deb's test problems quoted in [8]. Consider the function

F1(x) = sin^6(5πx)

This function has 5 peaks in the range 0 ≤ x ≤ 1, occurring at x = 0.1, 0.3, 0.5, 0.7 and 0.9, as shown in Figure 6. We follow Ryan's [8] coding: a 30-bit chromosome is used to represent x. Ryan follows the usual practice of initialising the chromosomes uniformly within the space of possible values; however such a method is not guaranteed to work since, in real problems, several solutions can be arbitrarily close to one another. Also, as the dimensionality of the data increases, the method's expense increases exponentially. We simply wish to begin with the probability vectors of all the realisable genes (the lowest-level genes) initialised with a common probability of 0.5. Further, we do not wish to build in pre-knowledge of the solution by making the number of realisable genes equal to the number of optimal solutions (5 in this case). One possible extension to the SPBIL is shown in Figure 7. We now consider a multi-layered sequence of homeotic genes.
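For concreteness, F1 and one plausible 30-bit decoding can be written as follows; the linear decoding is our assumption, as Ryan's exact mapping is not given here:

```python
import math

def F1(x):
    """Deb's first test function: five equal peaks at x = 0.1, 0.3, 0.5, 0.7, 0.9."""
    return math.sin(5 * math.pi * x) ** 6

def decode(bits):
    """Map a 30-bit chromosome to a point in [0, 1] (assumed linear mapping)."""
    n = int("".join(str(b) for b in bits), 2)
    return n / (2 ** 30 - 1)
```

The chromosome's fitness is then simply F1(decode(bits)).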

Figure 5: The number of individuals in a population of 100 during each generation who reached the peak fitness level.

Figure 6: The graph of F1(x) = sin^6(5πx)

Figure 7: Homeotic gene 0 determines whether homeotic genes 1 or 2 will be realised. They in turn determine which of genes 3-6 will be realised. Genes 3-6 determine which of the lowest level genes will be realised.

The top level represents bit 0 which determines whether bit 1 or bit 2 is turned on. Both bit 1 and bit 2 are further homeotic genes so that bit 1 determines whether bits 3 or 4 are turned on and bit 2 determines whether bits 5 or 6 are turned on. Similarly bits 3 - 6 are also homeotic genes and determine which of the realisable genes (each of 30 bits) are activated. There are therefore 8 sets of realisable genes each of which may determine the fitness of the chromosome. The learning rules are as above:

Δp_0 = η_0 (E_best(chromosome_0) − p_0)

Δp_1 = η_1 (E_best(chromosome_1) − p_1) × E_best(chromosome_0)

Δp_2 = η_1 (E_best(chromosome_2) − p_2) × (1 − E_best(chromosome_0))

Δp_3 = η_2 (E_best(chromosome_3) − p_3) × E_best(chromosome_1)

Δp_4 = η_2 (E_best(chromosome_4) − p_4) × (1 − E_best(chromosome_1))

Δp_i = η_3 (E_best(chromosome_i) − p_i) × E_best(chromosome_3),   8 ≤ i < 38

Δp_i = η_3 (E_best(chromosome_i) − p_i) × (1 − E_best(chromosome_3)),   38 ≤ i < 68

where all expectations, E(), are taken over the fittest chromosomes. The obvious extension is made to the other genes. We use lower learning rates for the higher-level homeotic genes. With this set of learning rules, we found that convergence to the optimal solution was possible but that it required a very low learning rate, i.e. convergence to the optima is very slow. If the learning rate is increased, a subset of the probability vectors converges to the optimal values; but since that subset will typically have very accurate realisations in each generation, the whole population becomes dominated by the niche solution which is most accurate. This fairly quickly feeds back to the homeotic genes and predisposes them to always select the fittest sub-population.
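The selection of the expressed gene by the homeotic bits can be sketched as a walk down the binary tree of Figure 7. The heap-style indexing (children of node n at 2n+1 and 2n+2, with a 1-bit choosing the left child) is our convention, not necessarily the original implementation's:

```python
def expressed_gene(homeotic_bits):
    """Walk the three-level homeotic tree: bit 0 selects bit 1 or 2,
    the selected bit picks one of bits 3-6, and that bit finally
    picks one of the 8 realisable genes (numbered 0-7)."""
    node = 0                              # start at homeotic gene 0
    for _ in range(3):                    # three levels of switches
        # children of node n are 2n+1 and 2n+2; a 1-bit goes left
        node = 2 * node + (1 if homeotic_bits[node] else 2)
    return node - 7                       # leaf nodes 7-14 map to genes 0-7
```

Only the 30-bit block belonging to the returned gene is then decoded and evaluated.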

In order to maintain diversity in the population, we introduce an environmental feedback mechanism:
when an instance of the realised genes is one of the fittest chromosomes, i.e. one used in upgrading the probability vector, we add a small bias which handicaps the gene when its fitness is next calculated. This models the fact that competition for resources in a population is greatest between similar individuals. So in each generation we calculate

Δbias_i = −0.016,   for the best genes

Δbias_i = +0.002,   for all i

bias_i = bias_i + Δbias_i

fitness = F1(x) + bias_i
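In code, one generation of this bias bookkeeping might look like the following sketch (the function name is ours):

```python
def biased_fitness(raw_fitness, bias, best_genes):
    """Apply the handicap scheme: every gene's bias creeps up by 0.002,
    this generation's best genes are docked 0.016, and the stored bias
    is added to the raw fitness used in the next selection."""
    for i in range(len(bias)):
        bias[i] += 0.002                  # all genes recover slowly
    for i in best_genes:
        bias[i] -= 0.016                  # handicap this generation's winners
    return [f + b for f, b in zip(raw_fitness, bias)]
```

A winning gene thus carries a net handicap of 0.014 into the next generation, which it sheds over several generations if it stops winning.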

Using this method we can use an extremely high learning rate and obtain accurate convergence to all optima. At the optima, probability elements representing the homeotic genes will be approximately 0.5 while all probability elements representing realisable genes will tend to either 0 or 1. Using a training run of only 200 iterations and a learning rate η_3 of 0.2 we have the results of Table 2.

Gene    1       2       3       4
Value   0.001   0.500   0.300   0.700
Gene    5       6       7       8
Value   0.300   0.902   0.900   0.900

Table 2: Converged values represented by the probability vectors after only 100 generations when the learning rate, η_3, was 0.2

The similarity between the use of a bias and the method known as a "conscience mechanism" in Artificial Neural Networks is obvious.


Now we wish to extend the above method to cope with Deb's other functions: consider his F2() function shown graphically below.

Figure 8: Deb's F2 function has decreasing peaks. The fitness function is F2(x) = exp(−2 log 2 × (x − 0.1)² / 0.64) × sin^6(5πx)

Function F2(x) is defined by

F2(x) = exp(−2 log 2 × (x − 0.1)² / 0.64) × sin^6(5πx)

Experimental findings have shown that with the system defined above the fitness of the highest peak is predominant: all realised genes converge to that peak, leaving the other peaks unexplored. What we wish to do is allow only one niche population to occupy the highest peak and force the other populations to occupy the lower peaks. To do so we use a system very similar to Kohonen's Learning Vector Quantization (LVQ) [6]. At each update of the probability vector, we move the realised gene's probabilities towards the best vector as before but also include a means of moving all of them (including the winning one) away from the best:

Δp_i = −η_0 (E_best(chromosome_i) − p_i),   for all i

Δp_i = η (E_best(chromosome_i) − p_i),   for the realised chromosome

Clearly we require η_0 < η or we would get no convergence to any peak. A ratio η_0/η of 0.1 was found sufficient in the experiments described herein.

With this simple addition, all peaks were consistently found.
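That LVQ-like addition can be sketched as follows, with each realisable gene keeping its own probability vector; the helper name and parameter values are illustrative:

```python
def lvq_style_update(p, best, expressed, eta=0.2, eta0=0.02):
    """Repel every gene's probability vector from the best realised
    vector, then attract the expressed (winning) gene's vector more
    strongly towards it, so only one niche can hold each peak."""
    for g in range(len(p)):
        for i in range(len(p[g])):
            p[g][i] -= eta0 * (best[i] - p[g][i])   # repel all genes
    for i in range(len(p[expressed])):
        p[expressed][i] += eta * (best[i] - p[expressed][i])  # attract winner
    return p
```

Because eta0 is a tenth of eta, the net movement of the winner is still towards the best vector while every loser drifts away from it.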

Functions F3 and F4 did provide more of a challenge to the algorithm; they are shown diagrammatically below.

Figure 9: The F3() function

Figure 10: The F4() function

The F3 function has peaks of equal height but at different intervals. It is defined by

F3(x) = sin^6(5π(x^(3/4) − 0.05))

The F4 function has unevenly spaced and decreasing peaks determined by

F4(x) = exp(−2 log 2 × (x − 0.08)² / 0.854) × sin^6(5π(x^(3/4) − 0.05))

Functions F3 and F4 proved more difficult for the algorithm in that we often found more than one population occupying the broader-based peaks and such populations interfered with one another causing poorer accuracy in the converged results.

It was found that the simple expedient of causing the learning rate to decrease to 0 during the course of the experiment was sufficient to give any required degree of accuracy. This method is obviously suggested by the needs of many stochastic approximation algorithms such as the Robbins-Monro stochastic approximation algorithm [7], which requires the learning rate η(k) to satisfy

η(k) ≥ 0,   Σ_k η(k) = ∞,   Σ_k η²(k) < ∞
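For example, the schedule η(k) = η(0)/(k+1) satisfies these conditions; a minimal sketch:

```python
def eta_schedule(k, eta_start=0.2):
    """1/k learning-rate decay: eta(k) >= 0, the sum of eta(k)
    diverges and the sum of eta(k)^2 converges, as the
    Robbins-Monro conditions require."""
    return eta_start / (k + 1)
```

The starting value 0.2 is illustrative; any positive constant gives the same asymptotic behaviour.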

It is interesting that such a learning rate provides a population which has found its niche (it is no longer capable of evolving), which agrees with Huxley's ([5], p. 11) comment that ``there is no certain case on record of a line showing a high degree of specialization giving rise to a new type''.

We find ourselves in some difficulty in providing a numerical report of our results. We will have to content ourselves with stating that, using the methods described in this paper, we can cause convergence of the realised genes to the correct optimal values to any degree of accuracy required.


The PBIL algorithm has provided a language for discussing convergence of evolutionary algorithms in terms of learning. We have seen that the SPBIL algorithm provides a new way of thinking about the convergence of the SGA and have reviewed experiments in which a single population using a homeotic ``switching'' gene has caused divergence into two distinct populations. The SPBIL algorithm can be modified to ensure that genetic diversity always exists within the population by ensuring that no probability exceeds 0.9 or falls below 0.1; by ensuring that the homeotic gene does not reach the extreme values, we can ensure that in any environment there exist organisms which are not overspecialised to that environment.

The SPBIL algorithm also allowed simultaneous optimisation of conflicting criteria. Simple extensions of the SPBIL were used when developing a multi-layered gene in order that the niche populations represent all optima.

Finally, when we use more than one individual to update the probability vector, it is possible to use a weighted expected value in the learning equations, with weights proportional to the fitness of the corresponding phenotypes.

Future work will investigate the convergence of structured PBIL simulations on such standard problems as the Travelling Salesman Problem.


[1] S. Baluja and R. Caruana.
Removing the genetics from the standard genetic algorithm. In Proceedings of the Twelfth International Conference on Machine Learning, 1995

[2] J. Cohen and I. Stewart.
The Collapse of Chaos, Penguin, 1995.

[3] D. Dasgupta and D.R. McGregor.
A more motivated genetic algorithm: The model and some results. Cybernetics and Systems, 25(3):447-469.

[4] D. Dasgupta.
Optimisation in time-varying environments using structured genetic algorithms. Technical report IKBS-17-93, The University of Strathclyde, 1993.

[5] T. Huxley.
Progress: Myth or Science. In Evolution Extended: Biological Debates on the Meaning of Life, C. Barlow (Ed.).

[6] T. Kohonen.
Self-Organising Maps. Springer, 1995.

[7] J.M. Mendel.
A Prelude to Neural Networks: Adaptive and Learning Systems, Prentice Hall, 1994.

[8] C. Ryan.
Racial Harmony and function optimization in genetic algorithms - the races genetic algorithm. In Proceedings of EP95. MIT Press, 1995.

Dr. C Fyfe is a Lecturer at the University of Paisley


University of Paisley, 1996.