Complex Environments

Reading
Evaluation as Reference Group
Part 1. Local Search
Part 2. Incomplete information
Part 3. Overview to Date

Reading

Watch video in mp4 or ogg.
Read R&N Chapter 4.

There are broadly three main concepts to consider.

Local Search for Optimisation in Sections 4.1-4.2
- A primary example is gradient descent which should be familiar from any elementary calculus course. See Section 4.2.
- Our focus is discrete problems where calculus does not apply. Section 4.1 discusses a range of variants for this case.
- A core idea is iterative approximation.
- We will return to stochastic beam and other evolutionary algorithms later.
Intelligent action with partial information in Section 4.3-4.4
- In most of the problems we have considered, a complete sequence of moves could be calculated at the start of the game. This requires an environment which is static and fully observable.
- Section 4.3 introduces non-deterministic games and And/Or Search Trees
- Section 4.4 introduces partially observable games and Belief State Modelling
- In both cases you need to observe your surroundings in each round and adjust your plan accordingly.
Section 4.5 introduces Online Search, i.e. search where not the solution (conditional or otherwise) is not computed ahead of time, but interleaved with exploration.

Evaluation as Reference Group

(postponed from last session)

Generic Questions
- Are you comfortable with the goals of the module?
- Do you make progress when you do exercises?
- Are you able to complete exercises?
- Do you gain skills that you feel that you can use in the future?
Alter-Begin-Continue (Yellow Stickers Exercise)

Part 1. Local Search

The basic algorithm is Hill Climbing Search, which can be varied in many ways

Recap

How does Hill Climbing relate to previous algorithms?
Can we use the same state machine model?
Do we always use the same state machine model?

Quiz (Briefing)

Does Hill Climbing always find the minimum?
Both A* and Hill Climbing rely on some heuristic to be minimised. What is the main difference between Hill Climbing and Tree Searches?
How can we avoid getting stuck in a local minimum?
What problems arise when a state has many neighbours?
How can the Hill Climb be sped up when there are many neighbours per state?

Exercise

R&N Exercise 4.1
If you did the 11-Puzzle last weak, can you redo it using Hill Climbing instead?

Part 2. Incomplete information

Quiz (Briefing)

What kind of problems call for and/or trees?
How can you search in an and/or tree?
Why do and/or trees often lead to infinite solution paths?
Do and/or trees suffice to model partially observable states?
What is the difference between and/or trees and trees with belief states? (with respect to the problems they model?)
What is the difference between offline and online search?
What extra considerations are required in online search?

Exercises

Exploration and online search

R&N Exercise 4.13 gives a simple example to compare offline and online search.

A software simulator of the 3x3 maze is available at github. You can use this to implement and test DFS search in an environment which is simpler than the CodinGame exercise below.

CodinGame

The Labyrinth requires online search.

This exercise is difficult, and without prior analysis and modelling, it is almost impossible. Think through the following questions,

What is your state space? What is the goal state?
Consider first the problem of finding the control room. Review each of the three tree search approaches that we know. What are the advantages and disadvantages of each?
- Best-First Search
- BFS: Breadth-First Search
- DFS: Depth-First Search
Consider now the problem of returning to the exit. Review again the three search approaches above. What problems arise when you try to return?
Returning there are two basic approaches. You can backtrack the same route by which you came, or you can make a new search for a better route. What are the advantages and disadvantages of each of these approaches?
Make an outline of you solution before you implement it in CodinGame.

Debrief

Labyrinth Solutions ideas
Code: Code/labyrinth.py

Part 3. Overview to Date

Consider the main models from each of the previous chapters:

Chapter 2: Intelligent Agent (sense-think-act)
- classification of problems and environments
Chapter 3a: State Machine
- tree traversal and search
Chapter 3b: Heuristic Search
Chapter 4a: Local Optimisation
Chapter 4b: Partial Observation and Online Search

For each of the main models and paradigms in Chapters 3 and 4, discuss their main features and purposes. What features of the problem and environment necessitates a given solution paradigm?

The Eight Queens Problem and the \(N\)-Puzzle from pai-exercises. are useful to compare different solutions.
We will not spend a lot of time on Constraint Satisfaction Search (Chapter 5), but it is useful to look at the solution to the Eight Queens Problem.
Where the other algorithms look at the state as one atomic element, a constraint satisfaction search may consider each queen separately.
- Each Queen contributes a number of conflicts to the total.
- Moving that queen may reduce the number of conflicts.
- Hence, in each iteration we look at only one queen.
- If she has zero conflicts, we pick another queen.
- If she has conflicts, we move her to a new square minimising her conflicts.
- We ignore the prior state, so there is a risk that we increase the number of conflicts.
Obviously, there are many variations of constraint satisfaction searches, and the sample implementation is probably not the best one.