# Machine Vision

## Introductory Session

NTNU, Noregs Teknisk-Naturvitskaplege Universitet

Thu 25 Aug 10:11:51 UTC 2022

1. Briefing Overview and History
2. Install and Test Software - Simple tutorials
3. Debrief questions and answers recap of linear algebra

# Introduction

• What is you name?
• Why do you take this module?

# Practical Information

Questions - either in class or in discussion fora

... or email if it is personal

# Taught Format

• Sessions 4h twice a week
• 1h briefing + 2h exercise + 1h debrief (may vary)
• Variety of exercise genres
• No Compulsory Exercises
• but not less work
• Feedback in class
• Keep a diary.
• Make sure you can refer back to previous partial solution and reuse them.
• Use the textbooks to learn whet you need to do
• Not as a checklist of what you need to learn

# Learning Outcomes

Knowledge
• The candidate can explain fundamental mathematical models for digital imaging, 3D models, and machine vision
• The candidate are aware of the principles of digital cameras and image capture
• Skills
• The candidate can implemented selected techniques for object recognition and tracking
• General competence
• The candidate has a good analytic understanding of machine vision and of the collaboration between machine vision and other systems in robotics
• The candidate can exploit the connection between theory and application for presenting and discussing engineering problems and solutions
• # Exam

• Oral exam $$\sim 20$$ min.
• First seven minutes are yours
• your own implementations may be part of the case
• essentially that you can explain the implementation analytically
• The remaing 13-14 minutes is for the examiner to explore further
• More detailed assessment criteria will be published later

# Vision

Eye Model from Introduction to Psychology by University of Minnesota

• Vision is a 2D image on the retina
• Each cell perceives the light intencity of colour of the light projected thereon
• Easily replicated by a digital camera
• Each pixel is light intencity sampled at a given point on the image plane

# Cognition

1912 International Lawn Tennis Challenge

• Human beings see 3D objects
• not pixels of light intencity
• We recognise objects - cognitive schemata
• we see a ball - not a round patch of white
• we remember a tennis match - rather than four people with white clothes and rackets
• We observe objects arranged in depth
• in front of and behind the net
• even though they are all patterns in the same image plane
• 3D reconstruction from 2D retina image
• and we do not even think about how

# Applications

• Artificial systems interact with their surroundings
• navigate in a 3D environment
• Simpler applications
• face recognition
• tracking in surveillance cameras
• medical image diagnostics (classification)
• image retrieval (topics in a database)
• detecting faulty products on a conveyor belt (classification)
• aligning products on a conveyor belt
• Other advances in AI creates new demands on vision
• 20 years ago, walking was a major challenge for robots
• now robots walk, and they need to see where they go ...

Wikimedia

# Focus for the module

• Artificial systems interact with their surroundings
• navigate in a 3D environment
• This means
• Geometry of multiple views
• Relationship between theory and practice
• ... between analysis and implementation
• Mathematical approach
• inverse problem; 3D to 2D is easy, the inverse is hard
• we need to understand the geometry to know what we program

# Subproblems

• 3D modelling
• Linear Algebra
• Distortion and calibration
• 2D Geometry
• Feature detection
• image (signal) processing
• possibly machine learning

# Big Problem

Vision System

i.e. putting it all together

# History

• 1435 Della Pictura - first general treatise on perspective
• 1648 Girard Desargues - projective geometry
• 1913 Kruppa: two views of five points suffice to find
• relative transformation
• 3D location of the points
• (up to a finite number of solutions)
• mid 1970s first algorithms for 3D reconstruction
• 1981 Longuet-Higgins: linear algorithm for structure and motion
• late 1970s E. D. Dickmans starts work on vision-based autonomous cars
• 1984 small truck at 90 km/h on empty roads
• 1994 180 km/h, passing slower cars

# Python

• Demos and tutorials in Python
• not Jupyter - we need to interface cameras and interactive displays easily
• you can use whatever language you want
• Demos and help on Unix-like system
• exercise session today
• install necessary software
• use the tutorials to see that things work as expected
• In the debrief, we will start briefly on the mathematical modelling
1. Quick demo
3. After the exercise we resume for a debrief

# Debrief

1. What have you learnt from the exercise?
2. Did you encounter any problems or questions to be discussed?
3. 3D Mathematics

# Vectors and Points

A point in space $\mathbf{X} = [X_1,X_2,X_3]^\mathrm{T}\in\mathbb{R}^3$
A bound vector, from $\mathbf{X}$ to $\mathbf{Y}$$\vec{\mathbf{XY}} A free vector is the same difference, but without any specific anchor pointrepresented as \mathbf{Y} - \mathbf{X} • Set of free vectors form a linear vector space • note! points do not • The sum of two vectors is another vector • The sum of two points is not a point # Dot product (inner product)$$x=\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\quad y=\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}\langle x,y\rangle = x^\mathrm{T}y = x_1y_1+x_2y_2+x_3y_3$$Euclidean Norm: ||x|| = \sqrt{\langle x,x\rangle} Orthogonal vectors when \langle x,y\rangle=0 # Cross product$$x\times y = \begin{bmatrix} x_2y_3 - x_3y_2 \\ x_3y_1 - x_1y_3 \\ x_1y_2 - x_2y_1 \end{bmatrix} \in \mathbb{R}^3$$Observe that 1. y\times x = -x\times y 2. \langle x\times y, y\rangle= \langle x\times y, x\rangle$$x\times y = \hat xy \quad\text{where}\quad \hat x = \begin{bmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{bmatrix} \in \mathbb{R}^{3\times3}$$\hat x is a skew-symmetric matrix because \hat x=-\hat x^\mathrm{T} i.e. the cross-product is a linear operation # Right Hand Rule By Acdx - Self-made, based on Image:Right_hand_cross_product.png,BY-SA 3.0, Wikimedia Commons # Matrix Arithmetics Matrices define linear operations on vectors$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}  \begin{bmatrix} a_{11} & a_{21} & a_{23} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} a_{11}x_1 & a_{21}x_2 & a_{23}x_3 \\ a_{21}x_1 & a_{22}x_2 & a_{23}x_3 \\ a_{31}x_1 & a_{32}x_2 & a_{33}x_3 \\ \end{bmatrix}$\$

• Change of Basis
• Rigid Body Motion