
title: Introductory Session to Machine Learning

# Reading

+ Ma 2004 Chapter 1.

# Session

1.  **Briefing** Overview and History
2.  Install and Test Software
    - Simple tutorials
3.  **Debrief** questions and answers
    - recap of linear algebra

# 1 Briefing

## Practical Information

### Information

+ Wiki - living document - course content
+ BlackBoard - announcements - discussion fora
+ Questions - either
    - in class
    - in discussion fora
+ Email will only be answered when there are good reasons not to use public fora.

### Taught Format

+ Sessions 4h twice a week
    - normally 1h briefing + 2h exercise + 1h debrief (may vary)
+ Exercises vary from session to session
    + mathematical exercises
    + experimental exercises
    + implementational exercises
+ **No** Compulsory Exercises
+ **Feedback in class** 
    - please ask for feedback on partial work
+ Keep a diary.  Make sure you can refer back to previous partial solutions and reuse them

### Learning Outcomes

+ Knowledge
    - The candidate can explain fundamental mathematical models for digital imaging,
      3D models, and machine vision
    - The candidate is aware of the principles of digital cameras and image capture
+ Skills
    - The candidate can implement selected techniques for object recognition and
    - The candidate can calibrate cameras for use in machine vision systems
+ General competence
    - The candidate has a good analytic understanding of machine vision and of the
      collaboration between machine vision and other systems in robotics
    - The candidate can exploit the connection between theory and application for 
      presenting and discussing engineering problems and solutions

### Exam

+ Oral exam $\sim 20$ min.
+ First seven minutes are *yours* 
    - make a case for your grade with respect to the learning outcomes
    - your own implementations may be part of the case
    - what matters is that you can explain the implementation analytically
+ The remaining 13-14 minutes are for the examiner to explore further
+ More detailed assessment criteria will be published later

## Vision

![Eye Model from *Introduction to Psychology* by University of Minnesota](Images/eye.jpg)

+ Vision is a 2D image on the retina
    + Each cell perceives the intensity or colour of the light projected onto it
+ Easily replicated by a digital camera
    + Each pixel is the light intensity sampled at a given point on the image plane

## Cognition

![1912 International Lawn Tennis Challenge](Images/tennis.jpg)

+ Human beings see 3D objects
    - not pixels of light intensity
+ We *recognise* objects - *cognitive schemata*
    - we see a *ball* - not a round patch of white
    - we remember a *tennis match* - 
      more than four people with white clothes and rackets
+ We observe objects arranged in depth
    - in front of and behind the net
    - even though they are all patterns in the same image plane
+ 3D reconstruction from 2D retina image
    - and we do not even think about how

## Applications

- Artificial systems interact with their surroundings
    - navigate in a 3D environment
- Simpler applications
    - face recognition
    - tracking in surveillance cameras
    - medical image diagnostics (classification)
    - image retrieval (topics in a database)
    - detecting faulty products on a conveyor belt (classification)
    - aligning products on a conveyor belt 
- Other advances in AI create new demands on vision
    - 20 years ago, walking was a major challenge for robots
    - now robots walk, and they need to see where they go ...

## Focus

- Artificial systems interact with their surroundings
    - navigate in a 3D environment
- This means
    - Geometry of multiple views
    - Relationship between theory and practice
    - ... between analysis and implementation
- Mathematical approach
    - inverse problem; 3D to 2D is easy, the inverse is hard
    - we need to understand the geometry to know what we program

##  History 

- 1435: *Della Pictura* - first general treatise on perspective
- 1648 Girard Desargues - projective geometry
- 1913 Kruppa: two views of five points suffice to find
    - relative transformation 
    - 3D location of the points 
    - (up to a finite number of solutions)
- mid 1970s: first algorithms for 3D reconstruction
- 1981 Longuet-Higgins: linear algorithm for structure and motion
- late 1970s E. D. Dickmanns starts work on vision-based autonomous cars
    - 1984 small truck at 90 km/h on empty roads
    - 1994: 180 km/h, passing slower cars

## Python

- Demos and tutorials in Python
    - you can use whatever language you want
    - we avoid Jupyter to make sure we can use camera and interactive displays easily
- Demos and help on Unix-like system (may or may not include Mac OS)
- In the exercise sessions
    - install necessary software
    - use the tutorials to see that things work as expected
- In the debrief, we will start briefly on the mathematical modelling
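
One way to see that things work as expected is a small script that checks which modules Python can find. This is only a sketch: the package list below is hypothetical, since the required software is specified in the tutorials and not here, so replace it with the modules the tutorials actually import.

```python
import importlib.util
import sys

# Hypothetical package list (an assumption); edit to match the tutorials.
PACKAGES = ["numpy", "matplotlib", "cv2"]

def check_packages(names):
    """Return a dict mapping each module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

The script only tests whether each module can be located; it does not import it or test the camera or the display, so the tutorials are still needed for that.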

# 2 Tutorials

+ [Introduction]()

# 3 Debrief

1.  Discuss problems arising from the practical session.
2.  Repeat basic linear algebra (below).
3.  Possibly start on [3D Mathematics]() - probably not though.

## Vectors and Points

+ A *point* in space $\mathbf{X} = [X_1,X_2,X_3]^\mathrm{T}\in\mathbb{R}^3$
+ A *bound vector*, from $\mathbf{X}$ to $\mathbf{Y}$: $\overrightarrow{\mathbf{XY}}$
+ A *free vector* is the same difference, but without any specific anchor point
   + represented as $\mathbf{Y} - \mathbf{X}$ 
+ The set of free vectors forms a linear vector space
   + **note** points do not
   + The sum of two vectors is another vector
   + The sum of two points is not a point
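
The distinction between points and free vectors can be checked numerically. The sketch below uses NumPy arrays with arbitrary example coordinates (all values are assumptions for illustration). It shows that the difference of two points does not depend on where the origin is placed, whereas the sum of two points does.

```python
import numpy as np

# Two points in space (arbitrary example coordinates)
X = np.array([1.0, 2.0, 3.0])
Y = np.array([4.0, 6.0, 3.0])

# The free vector from X to Y is the coordinate difference
v = Y - X

# Move the origin: every point gets the same offset t added
t = np.array([10.0, 0.0, 0.0])
v_shifted = (Y + t) - (X + t)      # unchanged: the offset cancels
sum_shifted = (X + t) + (Y + t)    # differs from (X + Y) + t

print(v, v_shifted)                # [3. 4. 0.] [3. 4. 0.]
print(X + Y + t, sum_shifted)      # [15.  8.  6.] [25.  8.  6.]
```

The sum of two points changes with the choice of origin in a way that no single point does, which is why points do not form a vector space while free vectors do.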

## Dot product (inner product)


**Inner product**
$$\langle x,y\rangle = x^\mathrm{T}y = x_1y_1+x_2y_2+x_3y_3$$

Euclidean **Norm**
$$||x|| = \sqrt{\langle x,x\rangle}$$

**Orthogonal vectors** when $\langle x,y\rangle=0$
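
As a quick numerical check of these definitions, the following sketch uses NumPy with two example vectors chosen so that the numbers are easy to verify by hand (the vectors themselves are arbitrary).

```python
import numpy as np

x = np.array([3.0, 4.0, 0.0])
y = np.array([-4.0, 3.0, 5.0])

inner = x @ y              # 3*(-4) + 4*3 + 0*5 = 0
norm_x = np.sqrt(x @ x)    # sqrt(9 + 16 + 0) = 5

print(inner)               # 0.0, so x and y are orthogonal
print(norm_x)              # 5.0
```

`np.linalg.norm(x)` computes the same Euclidean norm directly.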

## Cross product

$$x\times y = 
\begin{bmatrix}
  x_2y_3 - x_3y_2 \\
  x_3y_1 - x_1y_3 \\
  x_1y_2 - x_2y_1 
\end{bmatrix} \in \mathbb{R}^3$$

Observe that

+ $y\times x = -x\times y$
+ $\langle x\times y, y\rangle= \langle x\times y, x\rangle = 0$
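
These observations are easy to verify numerically. The sketch below writes the cross product out component by component, following the definition above, and compares it with NumPy's built-in `np.cross`. The helper name `cross` and the example vectors are chosen only for this illustration.

```python
import numpy as np

def cross(x, y):
    """Cross product written out component by component."""
    return np.array([
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    ])

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
z = cross(x, y)

print(z)                      # [-3.  6. -3.]
print(cross(y, x))            # [ 3. -6.  3.], the sign flips
print(z @ x, z @ y)           # 0.0 0.0, orthogonal to both x and y
```

Both inner products vanish, so $x\times y$ is orthogonal to $x$ and to $y$.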

$$x\times y = \hat xy \quad\text{where}\quad \hat x =  
\begin{bmatrix}
  0 & -x_3 & x_2 \\
  x_3 & 0 & -x_1 \\
  -x_2 & x_1 & 0
\end{bmatrix} \in \mathbb{R}^{3\times3}$$

$\hat x$ is a **skew-symmetric** matrix because $\hat x=-\hat x^\mathrm{T}$
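
A minimal sketch of the matrix $\hat x$ in NumPy follows. The function name `hat` is chosen for this illustration and is not a NumPy function; the example vectors are arbitrary.

```python
import numpy as np

def hat(x):
    """Skew-symmetric matrix such that hat(x) @ y equals the cross product x × y."""
    return np.array([
        [0.0,   -x[2],  x[1]],
        [x[2],   0.0,  -x[0]],
        [-x[1],  x[0],  0.0],
    ])

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
H = hat(x)

print(H @ y)                      # [-3.  6. -3.], same as np.cross(x, y)
print(np.array_equal(H, -H.T))    # True: H is skew-symmetric
```

Writing the cross product as a matrix product turns it into a linear map of $y$, which is convenient when it appears inside matrix equations.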

## Right Hand Rule


## Skew-Symmetric Matrix


## Change of Basis