# Introductory Session to Machine Learning

# Reading

- Ma 2004 Chapter 1.

# Session

**Briefing** Overview and History

- Install and Test Software
- Simple tutorials

**Debrief** Questions and answers

- recap of linear algebra

# 1 Briefing

## Practical Information

### Information

- Wiki - living document - course content
- BlackBoard - announcements - discussion fora
- Questions - either
  - in class
  - in discussion fora

- Email will only be answered when there are good reasons not to use public fora.

### Taught Format

- Sessions 4h twice a week
  - normally 1h briefing + 2h exercise + 1h debrief (may vary)

- Exercises vary from session to session
  - mathematical exercises
  - experimental exercises
  - implementational exercises

- **No** compulsory exercises
- **Feedback in class** - please ask for feedback on partial work

- Keep a diary. Make sure you can refer back to previous partial solutions and reuse them.

### Learning Outcomes

- Knowledge
  - The candidate can explain fundamental mathematical models for digital imaging, 3D models, and machine vision
  - The candidate is aware of the principles of digital cameras and image capture

- Skills
  - The candidate can implement selected techniques for object recognition and tracking
  - The candidate can calibrate cameras for use in machine vision systems

- General competence
  - The candidate has a good analytic understanding of machine vision and of the collaboration between machine vision and other systems in robotics
  - The candidate can exploit the connection between theory and application for presenting and discussing engineering problems and solutions

### Exam

- Oral exam \(\sim 20\) min.
- First seven minutes are *yours*
  - make a case for your grade wrt. learning outcomes
  - your own implementations may be part of the case
  - essentially that you can explain the implementation analytically

- The remaining 13-14 minutes are for the examiner to explore further
- More detailed assessment criteria will be published later

## Vision

- Vision is a 2D image on the retina
  - Each cell perceives the intensity or colour of the light projected thereon

- Easily replicated by a digital camera
  - Each pixel is light intensity sampled at a given point on the image plane

## Cognition

- Human beings see 3D objects
  - not pixels of light intensity

- We *recognise* objects - *cognitive schemata*
  - we see a *ball* - not a round patch of white
  - we remember a *tennis match* - more than four people with white clothes and rackets

- We observe objects arranged in depth
  - in front of and behind the net
  - even though they are all patterns in the same image plane

- 3D reconstruction from 2D retina image
  - and we do not even think about how

## Applications

- Artificial systems interact with their surroundings
  - navigate in a 3D environment

- Simpler applications
  - face recognition
  - tracking in surveillance cameras
  - medical image diagnostics (classification)
  - image retrieval (topics in a database)
  - detecting faulty products on a conveyor belt (classification)
  - aligning products on a conveyor belt

- Other advances in AI create new demands on vision
  - 20 years ago, walking was a major challenge for robots
  - now robots walk, and they need to see where they go …

## Focus

- Artificial systems interact with their surroundings
  - navigate in a 3D environment

- This means
  - Geometry of multiple views
  - Relationship between theory and practice
  - … between analysis and implementation

- Mathematical approach
  - inverse problem; 3D to 2D is easy, the inverse is hard
  - we need to understand the geometry to know what we program
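
To illustrate why the inverse direction is hard, here is a minimal sketch. It assumes an ideal pinhole camera at the origin with unit focal length, looking along the third axis, so that a point \((X, Y, Z)\) projects to \((X/Z, Y/Z)\). Both this camera model and the function name `project` are assumptions made for illustration; they are not defined in these notes.

```python
def project(point):
    """Project a 3D point (X, Y, Z) onto the image plane Z = 1.

    Assumes an ideal pinhole camera at the origin with unit focal
    length (an illustrative assumption, not the course's camera model).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (X / Z, Y / Z)


# Two different 3D points on the same ray through the camera centre ...
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)

# ... project to the same 2D image point, so the depth is lost.
print(project(near))
print(project(far))
```

Going from 3D to 2D is a single division per coordinate. Going back is ambiguous from one image alone, because every point on the ray gives the same projection; this is one reason multiple views are needed.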

## History

- 1435: *Della Pictura* - first general treatise on perspective
- 1648 Girard Desargues - projective geometry
- 1913 Kruppa: two views of five points suffice to find
  - relative transformation
  - 3D location of the points
  - (up to a finite number of solutions)

- mid 1970s: first algorithms for 3D reconstruction
- 1981 Longuet-Higgins: linear algorithm for structure and motion
- late 1970s E. D. Dickmanns starts work on vision-based autonomous cars
  - 1984 small truck at 90 km/h on empty roads
  - 1994: 180 km/h, passing slower cars

## Python

- Demos and tutorials in Python
  - you can use whatever language you want
  - we avoid Jupyter to make sure we can use camera and interactive displays easily

- Demos and help on Unix-like system (may or may not include Mac OS)
- In the exercise sessions
  - install necessary software
  - use the tutorials to see that things work as expected

- In the debrief, we will start briefly on the mathematical modelling
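
A quick way to see whether the necessary software is installed is to ask Python whether it can locate each package. The sketch below is only a suggestion: the package names `numpy`, `cv2` and `matplotlib` are hypothetical placeholders, since these notes do not list the required software, and the helper name `check_modules` is made up for this example.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Hypothetical package list; replace it with whatever the tutorials require.
for name, found in check_modules(["numpy", "cv2", "matplotlib"]).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

This only checks that the packages can be located, not that the camera or the display works; the tutorials remain the real test.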

# 2 Tutorials

# 3 Debrief

- Discuss problems arising from the practical session.
- Repeat basic linear algebra (below).
- Possibly start on 3D Mathematics - probably not though.

## Vectors and Points

- A *point* in space \(\mathbf{X} = [X_1,X_2,X_3]^\mathrm{T}\in\mathbb{R}^3\)
- A *bound vector*, from \(\mathbf{X}\) to \(\mathbf{Y}\): \(\overrightarrow{\mathbf{XY}}\)
- A *free vector* is the same difference, but without any specific anchor point
  - represented as \(\mathbf{Y} - \mathbf{X}\)

- The set of free vectors forms a linear vector space (**note**: points do not)
  - The sum of two vectors is another vector
  - The sum of two points is not a point
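
One way to see the difference between points and free vectors is to move the origin and watch what happens to the coordinates. The sketch below uses plain tuples; the helper names `add` and `sub` and the example coordinates are chosen for illustration only.

```python
def add(a, b):
    """Component-wise sum of two coordinate triples."""
    return tuple(ai + bi for ai, bi in zip(a, b))


def sub(a, b):
    """Component-wise difference of two coordinate triples."""
    return tuple(ai - bi for ai, bi in zip(a, b))


X = (1, 2, 3)
Y = (4, 6, 8)

# Move the origin to o: every point P gets the new coordinates P - o.
o = (10, 0, 0)
X_new, Y_new = sub(X, o), sub(Y, o)

# The free vector Y - X has the same coordinates in both frames.
print(sub(Y, X) == sub(Y_new, X_new))

# If X + Y were a point, its new coordinates would be (X + Y) - o.
# Adding the new coordinates gives something else, so the "sum" depends
# on where the origin happens to be.
print(sub(add(X, Y), o) == add(X_new, Y_new))
```

The first comparison prints `True` and the second prints `False`: the difference of two points is a well-defined vector, while the sum of two points has no meaning independent of the coordinate frame.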

## Dot product (inner product)

\[x=\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\quad y=\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}\]

**Inner product** \[\langle x,y\rangle = x^\mathrm{T}y = x_1y_1+x_2y_2+x_3y_3\]

Euclidean **Norm** \[\|x\| = \sqrt{\langle x,x\rangle}\]

**Orthogonal vectors** when \(\langle x,y\rangle=0\)
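
The definitions above translate directly into a few lines of Python. This is a minimal sketch written with plain tuples so that each term of the formula stays visible; the function names `dot` and `norm` are chosen here for illustration. In practice a library routine such as NumPy's `numpy.dot` would normally be used.

```python
import math


def dot(x, y):
    """Inner product <x, y> = x_1*y_1 + x_2*y_2 + x_3*y_3."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm, the square root of <x, x>."""
    return math.sqrt(dot(x, x))


x = (3.0, 4.0, 0.0)
y = (-4.0, 3.0, 5.0)

print(dot(x, y))   # 0.0, so x and y are orthogonal
print(norm(x))     # 5.0
```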

## Cross product

\[x\times y = \begin{bmatrix} x_2y_3 - x_3y_2 \\ x_3y_1 - x_1y_3 \\ x_1y_2 - x_2y_1 \end{bmatrix} \in \mathbb{R}^3\]

Observe that

- \(y\times x = -x\times y\)
- \(\langle x\times y, y\rangle= \langle x\times y, x\rangle = 0\)

\[x\times y = \hat xy \quad\text{where}\quad \hat x = \begin{bmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{bmatrix} \in \mathbb{R}^{3\times3}\]

\(\hat x\) is a **skew-symmetric** matrix because \(\hat x=-\hat x^\mathrm{T}\)
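
The properties of the cross product can be checked numerically. The following sketch uses plain tuples and integer vectors so that the comparisons are exact; the helper names `cross`, `hat`, `matvec` and `dot` are chosen for illustration and are not taken from any library.

```python
def dot(x, y):
    """Inner product of two 3-vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def cross(x, y):
    """Cross product, written out component by component."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def hat(x):
    """Skew-symmetric matrix of x, so that hat(x) applied to y equals x cross y."""
    return ((0, -x[2], x[1]),
            (x[2], 0, -x[0]),
            (-x[1], x[0], 0))


def matvec(A, v):
    """Matrix-vector product for a matrix stored as a tuple of rows."""
    return tuple(dot(row, v) for row in A)


x = (1, 2, 3)
y = (4, 5, 6)
c = cross(x, y)

print(c)                                       # (-3, 6, -3)
print(matvec(hat(x), y) == c)                  # the matrix form agrees
print(cross(y, x) == tuple(-ci for ci in c))   # anti-commutative
print(dot(c, x), dot(c, y))                    # 0 0: orthogonal to both
```

With integer inputs all three checks hold exactly; with floating-point vectors the orthogonality check should allow a small tolerance.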

## Right Hand Rule

**TODO**

## Skew-Symmetric Matrix

**TODO**

## Change of Basis

**TODO**