Complex Environments


Read R&N Chapter 4.

  1. Local Search for Optimisation in Sections 4.1-4.2
    • A primary example is gradient descent which should be familiar from any elementary calculus course. See Section 4.2.
    • Our focus is discrete problems where calculus does not apply. Section 4.1 discusses a range of variants for this case.
    • We will return to stochastic beam and other evolutionary algorithms later.
  2. Intelligent action with partial information in Section 4.3-4.4
    • In most of the problems we have considered, a complete sequence of moves could be calculated at the start of the game. This requires an environment which is static and fully observable.
    • Section 4.3 introduces non-deterministic games and And/Or Search Trees
    • Section 4.4 introduces partially observable games and Belief State Modelling
    • In both cases you need to observe your surroundings in each round and adjust your plan accordingly.
  3. Section 4.5 introduces Online Search, i.e. search where not the solution (conditional or otherwise) is not computed ahead of time, but interleaved with exploration.

Part 1. Local Search


The basic algorithm is Hill Climbing Search. (See video in mp4 or ogg.)

You should be familiar with

Different move strategies:

  • Greedy Search
  • Stochastic Search
  • First Choice
  • Gradient Descent in the Continuous Case. We will not expand on the continuous case in this module, but the main principles should be familiar from u/g mathematics and form a useful starting point.

Variations on algorithm level

  • Random Restart
  • Simulated Annealing
  • Local Beam (deterministic)
  • Evolutionary Algorithms (including stochastic beam). Genetic Algorithms will be handled in detail later in the semester.


R&N Exercise 4.1

Part 2. Incomplete information


  1. Non-Deterministic Problems - Belief States
    • blind agents: coercing the state
    • partial
  2. Partially Observable Environments
  3. Online search


R&N Exercise 4.13 gives a simple example to compare offline and online search.

A software simulator of the 3x3 maze is available at github. You can use this to implement and test DFS search in an environment which is simpler than the CodinGame exercise below.


The Labyrinth requires online search.

This exercise is difficult, and without prior analysis and modelling, it is almost impossible. Think through the following questions,

  1. What is your state space? What is the goal state?
  2. Consider first the problem of finding the control room. Review each of the three tree search approaches that we know. What are the advantages and disadvantages of each?
    • Best-First Search
    • BFS: Breadth-First Search
    • DFS: Depth-First Search
  3. Consider now the problem of returning to the exit. Review again the three search approaches above. What problems arise when you try to return?
  4. Returning there are two basic approaches. You can backtrack the same route by which you came, or you can make a new search for a better route. What are the advantages and disadvantages of each of these approaches?
  5. Make an outline of you solution before you implement it in CodinGame.


Part 3. Overview to Date

Consider the main models from each of the previous chapters:

  • Chapter 2: Intelligent Agent (sense-think-act)
    • classification of problems and environments
  • Chapter 3a: State Machine
    • tree traversal and search
  • Chapter 3b: Heuristic Search
  • Chapter 4a: Local Optimisation
  • Chapter 4b: Partial Observation and Online Search

For each of the main models and paradigms in Chapters 3 and 4, discuss their main features and purposes. What features of the problem and environment necessitates a given solution paradigm?