---
title: Adversarial Search
categories: session
---

# Reading

R&N Chapter 6

+ The fundamental concept is **two-player, zero-sum games**
    + The basic solution technique is **minimax search**
+ Minimax search grows exponentially
    + heuristic searches are important (Section 6.3)
+ Video lecture [mp4](multiplayer.mp4)/[ogg](multiplayer.ogg)
  provides a few anchor points for the material.
  It does not replace the textbook.

# Briefing


## Two-Player, Zero-Sum Games

+ **State-machine**
    - to-move: state $\to$ player
    - actions: state $\to$ {action}
    - result: state $\times$ action $\to$ state
    - utility: state $\times$ player $\to\mathbb{R}$
+ Minimax algorithm (see the sketch after this list):
    - maximises utility for the player currently to move
    - **Tree Search**: optimise utility bottom-up
        - **min nodes and max nodes**
    - exhaustive search
+ Caveats and variations
    - Branching factor
    - multi-player games - alliances and trust
    - co-operative games
    - $\alpha\beta$ pruning
    - heuristic searches (a depth-limited sketch follows after this list)
    - Type A and Type B strategies:
        - move generation
        - move evaluation
+ **Monte Carlo** Tree Search (see the UCB1 selection sketch after this list)
    - Monte Carlo $\sim$ Stochastic Simulation
    - Balance Exploitation and Exploration
+ More Caveats
    + Stochastic Games - **Chance Nodes**
    + Partially Observable Games 
        - the percept sequence no longer reveals the opponent's moves.
    + Limitations - Section 6.7
        + Intractable
        + Approximations and Assumptions
        + Individual moves - no sight of the bigger picture
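
Below is a minimal sketch of the minimax recursion over the state-machine interface listed above. The `game` methods (`to_move`, `actions`, `result`, `is_terminal`, `utility`) are assumptions about how a concrete game class could be written; they mirror the notation in the briefing, not a specific library.

```python
def minimax_decision(game, state):
    """Choose the action that maximises utility for the player to move,
    using exhaustive tree search over an assumed game interface."""
    player = game.to_move(state)

    def value(s):
        # Terminal states are scored from the root player's point of view.
        if game.is_terminal(s):
            return game.utility(s, player)
        children = [value(game.result(s, a)) for a in game.actions(s)]
        # Max node when it is the root player's turn, min node otherwise.
        return max(children) if game.to_move(s) == player else min(children)

    return max(game.actions(state), key=lambda a: value(game.result(state, a)))
```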

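When exhaustive search is infeasible, a Type A strategy cuts the search off at a fixed depth and applies a heuristic evaluation function. The sketch below assumes the same game interface as above plus a hypothetical `eval_fn(state, player)` heuristic; both are illustrative and not part of the exercise code.

```python
def h_minimax(game, state, player, depth, eval_fn):
    """Depth-limited minimax: evaluate non-terminal states heuristically
    once the depth budget is spent (Type A strategy)."""
    if game.is_terminal(state):
        return game.utility(state, player)
    if depth == 0:
        return eval_fn(state, player)   # heuristic estimate, not exact utility
    values = [h_minimax(game, game.result(state, a), player, depth - 1, eval_fn)
              for a in game.actions(state)]
    return max(values) if game.to_move(state) == player else min(values)
```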

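The exploitation/exploration balance in Monte Carlo Tree Search is commonly handled by a selection rule such as UCB1. The sketch below illustrates only that selection step, not a full MCTS implementation; the node fields (`visits`, `total_reward`, `children`) are assumptions about a tree-node class.

```python
import math

def ucb1(node, parent_visits, c=1.4):
    """UCB1 score: average reward (exploitation) plus an exploration bonus."""
    if node.visits == 0:
        return float("inf")   # always try unvisited children first
    exploit = node.total_reward / node.visits
    explore = c * math.sqrt(math.log(parent_visits) / node.visits)
    return exploit + explore

def select_child(parent):
    """Descend to the child with the highest UCB1 score."""
    return max(parent.children, key=lambda child: ucb1(child, parent.visits))
```
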
# Exercise

## Tic Tac Toe

+ [Code from github](https://github.com/hgeorgsch/pai-exercises/tree/main/TicTacToe)

I was not able to find suitable exercises on CodinGame, so instead,
I have provided a simulator for you.  You should 

1.  Clone the git repo, `git clone https://github.com/hgeorgsch/pai-exercises.git`
2.  Change to the `TicTacToe` subdirectory
3.  Modify the template to implement your intelligent agent.
    You should use the minimax algorithm as described for two-player,
    zero-sum games.
4.  Play the game using the test script: `python3 ttt.py`
5.  Consult the README file for details.

This assumes that you have git and python3 installed.

## Discussion

From [R&N Exercises](https://aimacode.github.io/aima-exercises/game-playing-exercises/)

+ Exercise 1
+ Exercise 3

## Monte Carlo Tree Search

+ [Can you save the forest - Episode 1](https://www.codingame.com/training/medium/can-you-save-the-forest---episode-1)

# Debrief

All the algorithms we have studied are based on (discrete) **state machine models** and
**tree search algorithms** in the graph defined by the states and transitions in the
state machines.  It is crucial that we understand the basic structure of these models
and algorithms well, so that we can adapt them to the different special cases we
encounter.

+ exhaustive searches in small state spaces
    - BFS
    - DFS
+ non-deterministic environments lead to random transitions
    + modelled by and-or trees
+ partial observation leads to belief states
+ in offline search we can jump to arbitrary nodes to restart the search
    + in online searches we need to backtrack by actual actions in the environment
+ two-player games lead to adversarial transitions
    + Minimax algorithm considers both own moves (max) and adversarial moves (min)
+ Pruning may be essential to speed up the algorithm
    + Dijkstra's algorithm prunes subtrees that are known to give longer paths than the best one already found
    + alpha-beta pruning applies to minimax search (see the sketch after this list)
+ Heuristics are estimates of the value/cost of a node.  This allows for
    + Greedy algorithms and best-first searches
    + Heuristic pruning, discarding subtrees which look bad, but with no guarantee
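
A hedged sketch of alpha-beta pruning over the same assumed game interface as in the briefing: it returns the same value as plain minimax, but skips subtrees that cannot change the decision at the root.

```python
def alphabeta(game, state, player, alpha=float("-inf"), beta=float("inf")):
    """Minimax value for `player`, pruning branches that cannot affect the result."""
    if game.is_terminal(state):
        return game.utility(state, player)
    if game.to_move(state) == player:          # max node
        value = float("-inf")
        for a in game.actions(state):
            value = max(value, alphabeta(game, game.result(state, a), player, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                  # beta cut-off: min player will avoid this branch
                break
        return value
    else:                                      # min node
        value = float("inf")
        for a in game.actions(state):
            value = min(value, alphabeta(game, game.result(state, a), player, alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:                  # alpha cut-off: max player will avoid this branch
                break
        return value
```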