Optimisation and GA

Reading
Briefing
Exercises
Debrief

Reading

The textbook chapters (Haupt and Haupt) are available at BlackBoard.

Haupt and Haupt Chapter 1 (cursory)
- we will not dwell on the various optimisation algorithms discussed here
- the general background is important though
- R&N Chapter 4.1-4.2 give a briefer introduction to the same material
Haupt and Haput Chapter 2
- The main point here is the binary GA, which is summarised as Matlab code at the end of the chapter
- There are many variants of each constituent step, including cross-over, matchmaking, and mutation
Videos
1. Population based optimisation methods (population.mp4 and slides)
2. Binary representation of floating point numbers (binary.mp4 and slides)
3. GA - combining two parent states into one off-spring (ga.mp4 and slides)

Briefing

Optimisation is a general problem
- intelligent agents optimise actions
- functions are optimised in all kinds of decision making
iterative algorithms
- iteratively improving a tentative solution is easily stuck in local maxima
- convergence
- applies to all numerical methods (error minimisation)
- compare to problem solving models
- contrast to exhaustive approaches - exploritng the entire solution space is rarely tractible
Population based methods
- what is the difference between random restart hill climbing (AIMA page 131) and local beam search (AIMA page 133)?
  - what is the practical significance of this difference?
- multiple tentative solutions scattered over the search space
- reduces the risk of getting stuck in a local optimum, while avoiding exhaustive search
- many variants
- Most basic - look up R&N Chapter 4.1-4.2
Genetic algorithms
- overview of the demo implementation

Exercises

I have provided a demo implementation of GA and Local Beam Search on github. The implementations in this subdirectory all use a binary representation of floating point numbers to optimise a continuous function. (There are other examples elsewhere in the repo, for next week.)

Your goal should be to

Gain some experience with these algorithms on different problems, exploring the limitations of each one.
Learn to adapt the GA by changing selection, crossover, and mutation.
Understand how the algorithms work in general.

It is recommended to work together, and possibly split tests between you to compare more results. Feel free to test other variations than those proposed below. Make sure you keep a log of the tests that you make, so that you can compare performance.

Overview of source files

GA.py is the main implementation of the algorithms. It defines the GA class which implements a generic optimiser, as well as sample functions for mutation and cross-over.
- Note how you can pass mutation and cross-over functions as arguments when you call the GA.
BinaryBeam.py is a local beam search implemented using the GA engine above (inheritance).
BinaryChromosome.py defines chromosomes for a binary representation of floating point numbers. The GA engine works with arbitrary representations, but in this session we will only use the BinaryChromosome.
TestFunctions.py gives a couple of test functions you can optimise. You should make some for yourself too.
There are test scripts in the main modules GA.py and BinaryBeam.py, as well as a different function tested in testBeam.py and testGA.py.

First exploration

Have a look at the code and the test scripts. Check that they work on your system. What functions are optimised in the test scripts?

Answer the following questions by testing, and tweaking the code as required,

What is the run time of each algorithm? Is there a lot of variation?
How many iterations (generations) and how large population is necessary to get accurate results?
Are the results consistent between runs? Or do you tend to get different, random solutions each time? How much does this depend on the generation count and population size?
Which algorithm is better, GA or Local Beam?
Is the answer the same for both functions?

(Don’t try to guess the answers based on naïve psychology. I phrased the questions before I tested myself, so psychology will draw a blank.)

Statistics

When many different configurations are tested, it is useful to make a script and plot results. The most popular library for such plotting in python is matplotlib. For instance,

Make a loop which iteratively runs nextGeneration and records the best solution so far. Plot the solution against the number of generations.
Make a loop which initiates GAs with different population sizes. Plot the solution against the populations size.
Combine the two previous tricks to record the solution for every generation for each population size. Plot the results as a surface plot in 3D.

Basic Hill Climbing

Setting the population size to one, Local Beam Search degenerates to ordinary Hill Climbing.

What actions (changes) does this hill climber use to find elligible next states?
Test this Hill Climbing algorithm a couple of times on the F5 function (as used in testBeam.py).
Do you get consistent results?
Does the number of iterations make a difference?
What is the major limitation of Hill Climbing?

Varying the cost function

Make a new test script, changing the cost function. You can use one of the test functions from Haupt and Haupt Appendix I (published on BB), or your own.
Can you find functions where the GA performs badly? Where Beam Search is bad?
Do the two algoritms struggle on the same or different cost functions?

Varying the representation

The binary GA uses quantised solutions, representing each floating point number with a small number of bits.

Can you improve the performance by changing the number of bits used per dimenstion? (This is the third argument to the BinaryRepresentation constructor.)
How is the trade-off between run time and accuracy when you change the number of bits.

Variations of the GA

Discuss, what alternatives are there to the default mating, crossover, and mutation functions? Many examples are given in Chapter 2 of Haupt&Haupt.
Implement and test a few such alternatives.
Can you improve the performance on any of the test functions by using such alternative constituent functions?

Learning a string

Let us take an entirely different kind of problem. The cost function is stringlearn1 from TestFunctions. We are searching for a string of length stringlearn1length.

Make a representation class which maps character strings to binary strings
- you can limit the alphabet to upper and lower case letters (52 characters), digits, space, and full stop, for a total of 64 characters.
- replicate the API of the BinaryRepresentation class as far as it is relevant.
- note that the GA uses the getFloat method without assuming that the return value is floating point numbers. If you keep the method name, returning a string, GA will still work with the new representation.
Set up a test with a GA seeking to minimise stringlearn1 over all strings of length stringlearn1length.
Are you able to learn the string correctly and get cost zero? Try different variations of the GA as well as different population sizes and numbers of generations.

Discuss

How does the demo software work for you?
Should any improvements be made?

Debrief

Reflection.

What worked well?
What worked badly?
What have you learnt?
What more do you need to learn?