# Introductory Session to Machine Learning

• Ma 2004 Chapter 1.

# Session

1. Briefing Overview and History
2. Install and Test Software
• Simple tutorials
• recap of linear algebra

# 1 Briefing

## Practical Information

### Information

• Wiki - living document - course content
• BlackBoard - announcements - discussion fora
• Questions - either
• in class
• in discussion fora
• Email will only be answered when there are good reasons not to use public fora.

### Taught Format

• Sessions 4h twice a week
• normally 1h briefing + 2h exercise + 1h debrief (may vary)
• Exercises vary from session to session
• mathematical exercises
• experimental exercises
• implementational exercises
• No Compulsory Exercises
• Feedback in class
• Keep a diary. Make sure you can refer back to previous partial solutions and reuse them.

### Learning Outcomes

• Knowledge
• The candidate can explain fundamental mathematical models for digital imaging, 3D models, and machine vision
• The candidate is aware of the principles of digital cameras and image capture
• Skills
• The candidate can implement selected techniques for object recognition and tracking
• The candidate can calibrate cameras for use in machine vision systems
• General competence
• The candidate has a good analytic understanding of machine vision and of the collaboration between machine vision and other systems in robotics
• The candidate can exploit the connection between theory and application for presenting and discussing engineering problems and solutions

### Exam

• Oral exam $$\sim 20$$ min.
• First seven minutes are yours
• your own implementations may be part of the case
• essentially that you can explain the implementation analytically
• The remaining 13-14 minutes are for the examiner to explore further
• More detailed assessment criteria will be published later

## Vision

• Vision is a 2D image on the retina
• Each cell perceives the intensity or colour of the light projected onto it
• Easily replicated by a digital camera
• Each pixel is the light intensity sampled at a given point on the image plane

## Cognition

• Human beings see 3D objects
• not pixels of light intensity
• We recognise objects - cognitive schemata
• we see a ball - not a round patch of white
• we remember a tennis match - more than four people with white clothes and rackets
• We observe objects arranged in depth
• in front of and behind the net
• even though they are all patterns in the same image plane
• 3D reconstruction from 2D retina image
• and we do not even think about how

## Applications

• Artificial systems interact with their surroundings
• navigate in a 3D environment
• Simpler applications
• face recognition
• tracking in surveillance cameras
• medical image diagnostics (classification)
• image retrieval (topics in a database)
• detecting faulty products on a conveyor belt (classification)
• aligning products on a conveyor belt
• Other advances in AI create new demands on vision
• 20 years ago, walking was a major challenge for robots
• now robots walk, and they need to see where they go …

## Focus

• Artificial systems interact with their surroundings
• navigate in a 3D environment
• This means
• Geometry of multiple views
• Relationship between theory and practice
• … between analysis and implementation
• Mathematical approach
• inverse problem; 3D to 2D is easy, the inverse is hard
• we need to understand the geometry to know what we program
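
The asymmetry between the forward and the inverse problem can be illustrated with a small sketch. It assumes an ideal pinhole camera at the origin with unit focal length, a model that has not yet been introduced in this session, and the function name `project` is hypothetical.

```python
def project(point):
    """Project a 3D point (X, Y, Z) onto the image plane Z = 1.

    Assumption: ideal pinhole camera at the origin, unit focal length.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (X / Z, Y / Z)


# Two different 3D points on the same ray through the camera centre
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)

# Forward problem (3D to 2D): one division per image coordinate
image_near = project(near)
image_far = project(far)

# Both points land on the same image point, so depth cannot be
# recovered from a single view without further information
print(image_near, image_far)
```

Since every point on the ray gives the same image point, a single image does not determine depth, which is why the geometry of multiple views is the focus.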

## History

• 1435: Della Pictura - first general treatise on perspective
• 1648 Girard Desargues - projective geometry
• 1913 Kruppa: two views of five points suffice to find
• relative transformation
• 3D location of the points
• (up to a finite number of solutions)
• mid 1970s: first algorithms for 3D reconstruction
• 1981 Longuet-Higgins: linear algorithm for structure and motion
• late 1970s: E. D. Dickmanns starts work on vision-based autonomous cars
• 1984: small truck at 90 km/h on empty roads
• 1994: 180 km/h, passing slower cars

## Python

• Demos and tutorials in Python
• you can use whatever language you want
• we avoid Jupyter to make sure we can use camera and interactive displays easily
• Demos and help on Unix-like systems (may or may not include Mac OS)
• In the exercise sessions
• install necessary software
• use the tutorials to see that things work as expected
• In the debrief, we will start briefly on the mathematical modelling

# 3 Debrief

1. Discuss problems arising from the practical session.
2. Repeat basic linear algebra (below).
3. Possibly start on 3D Mathematics - probably not though.

## Vectors and Points

• A point in space $$\mathbf{X} = [X_1,X_2,X_3]^\mathrm{T}\in\mathbb{R}^3$$
• A bound vector, from $$\mathbf{X}$$ to $$\mathbf{Y}$$: $$\overrightarrow{\mathbf{XY}}$$
• A free vector is the same difference, but without any specific anchor point
• represented as $$\mathbf{Y} - \mathbf{X}$$
• The set of free vectors forms a linear vector space
• note points do not
• The sum of two vectors is another vector
• The sum of two points is not a point
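
The distinction between points and free vectors can be checked numerically. The following is a minimal sketch in plain Python, with tuples standing in for coordinates; the helper names `add` and `subtract` are hypothetical. It shows that the difference of two points is unchanged when the whole scene is translated, whereas the coordinate-wise sum of two points moves twice as far as the scene does, so it does not identify a point in the scene.

```python
def add(u, v):
    """Component-wise sum (vector + vector, or point + vector)."""
    return tuple(a + b for a, b in zip(u, v))


def subtract(p, q):
    """Difference p - q of two points: the free vector from q to p."""
    return tuple(a - b for a, b in zip(p, q))


X = (1.0, 2.0, 3.0)
Y = (4.0, 6.0, 3.0)
t = (10.0, -5.0, 2.0)  # translation applied to the whole scene

# The free vector Y - X does not change under the translation
v = subtract(Y, X)
v_shifted = subtract(add(Y, t), add(X, t))

# The coordinate-wise "sum of points" shifts by 2t, not by t
sum_before = add(X, Y)
sum_after = add(add(X, t), add(Y, t))
shift_of_sum = subtract(sum_after, sum_before)

print(v, v_shifted, shift_of_sum)
```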

## Dot product (inner product)

$x=\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\quad y=\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}$

Inner product $\langle x,y\rangle = x^\mathrm{T}y = x_1y_1+x_2y_2+x_3y_3$

Euclidean Norm $||x|| = \sqrt{\langle x,x\rangle}$

Orthogonal vectors when $$\langle x,y\rangle=0$$
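
The definitions above translate directly into code. This is a minimal sketch using only the Python standard library; the function names `dot` and `norm` are chosen for illustration, and in practice a numerical library such as NumPy would probably be used instead.

```python
import math


def dot(x, y):
    """Inner product <x, y> = x_1*y_1 + x_2*y_2 + x_3*y_3."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean norm ||x|| = sqrt(<x, x>)."""
    return math.sqrt(dot(x, x))


x = (3.0, 4.0, 0.0)
y = (-4.0, 3.0, 5.0)

print(dot(x, y))   # zero, so x and y are orthogonal
print(norm(x))     # length of x
```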

## Cross product

$x\times y = \begin{bmatrix} x_2y_3 - x_3y_2 \\ x_3y_1 - x_1y_3 \\ x_1y_2 - x_2y_1 \end{bmatrix} \in \mathbb{R}^3$

Observe that

• $$y\times x = -x\times y$$
• $$\langle x\times y, y\rangle= \langle x\times y, x\rangle = 0$$, i.e. $$x\times y$$ is orthogonal to both $$x$$ and $$y$$

$x\times y = \hat xy \quad\text{where}\quad \hat x = \begin{bmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{bmatrix} \in \mathbb{R}^{3\times3}$

$$\hat x$$ is a skew-symmetric matrix because $$\hat x=-\hat x^\mathrm{T}$$
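
The properties of the cross product can be verified on a concrete pair of vectors. The sketch below uses plain Python tuples; the names `cross`, `hat`, `matvec` and `dot` are hypothetical helpers written for this example. It checks anti-commutativity, orthogonality of $$x\times y$$ to both factors, the identity $$x\times y = \hat xy$$, and the skew-symmetry of $$\hat x$$.

```python
def dot(x, y):
    """Inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(x, y))


def cross(x, y):
    """Cross product of two 3-vectors, component by component."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def hat(x):
    """Skew-symmetric matrix such that hat(x) applied to y equals cross(x, y)."""
    return ((0.0, -x[2], x[1]),
            (x[2], 0.0, -x[0]),
            (-x[1], x[0], 0.0))


def matvec(A, y):
    """Matrix-vector product for a matrix stored as a tuple of rows."""
    return tuple(dot(row, y) for row in A)


x = (1.0, 2.0, 3.0)
y = (4.0, 5.0, 6.0)

c = cross(x, y)
H = hat(x)

print(c)                      # the cross product
print(matvec(H, y))           # same vector via the hat matrix
print(dot(c, x), dot(c, y))   # both zero: c is orthogonal to x and y
```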
