# Tracking Features

- aperture problem [page 78]

# Briefing

## Corners

- What are distinctive points in the image?
- Distinctive points can (to some extent) be matched in two different images.

## Corner Correspondence

- Two images of the same scene \(I_1,I_2: \Omega\subset\mathbb{R}^2\to\mathbb{R}_+ ; \mathbf{x}\mapsto I_1(\mathbf{x}),I_2(\mathbf{x})\)
- Different in general

*Why are they different?*

- Firstly - different colour in different directions
  - Lambertian assumption: a surface point looks the same from every viewing direction

- Secondly - noise
  - let’s assume this is insignificant

- Thirdly - same point in different positions
  - different image points \(\mathbf{x}_1,\mathbf{x}_2\) correspond to the same 3D point \(p\)

**Note** Two approaches:

**Today** Motion modelling using differentiation

- adequate for slow motion, say 2-3 pixels per frame
- allows for subpixel accuracy

*Next Session:* feature descriptors

## Motion Model

Suppose we have a feature point \(\mathbf{x}_1\) in Image 1. The corresponding feature \(\mathbf{x}_2\) in Image 2 is the result of a movement \(h:\mathbb{R}^2\to\mathbb{R}^2\) so that \(\mathbf{x}_2=h(\mathbf{x}_1)\).

### Brightness Constancy Constraint

- Suppose we photograph empty space except for a single point \(p\)
*Brightness Constancy Constraint*

\[I_1(\mathbf{x}_1) = I_2(\mathbf{x}_2) \sim \mathcal{R}(p)\]

- Simple dislocation from \(\mathbf{x}_1\) to \(\mathbf{x}_2\)
- Motion: \(h: \mathbf{x}_1\mapsto\mathbf{x}_2\) so that \[\forall \mathbf{x}_1\in\Omega\cap h^{-1}(\Omega)\subset\mathbb{R}^{2}, \;I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1))\]

## Feature Tracking

Estimator \[\hat h = \arg\min_h\sum_{\tilde{\mathbf{x}}\in W(\mathbf{x})} ||I_1(\tilde{\mathbf{x}})-I_2(h(\tilde{\mathbf{x}}))||^2\]

- The window, or aperture, \(W(\mathbf{x})\)
- Choose \(h\) from a family of functions, parameterised by \(\alpha\)
  - translational: \(\alpha=\Delta\mathbf{x}\)
  - affine: \(\alpha=\{A,\mathbf{d}\}\)

**Aperture problem:** cannot distinguish points on a blank wall
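The estimator above can be sketched for the translational family by brute force. This is only an illustration, assuming integer displacements and greyscale images stored as 2D NumPy arrays; the function name `ssd_translation_search` and the parameters `half_window` and `max_shift` are made up for this example.

```python
import numpy as np

def ssd_translation_search(img1, img2, x, y, half_window=3, max_shift=3):
    """Exhaustive search for the integer translation minimising the windowed SSD.

    (x, y) is the column/row of the feature in img1. The caller must keep the
    window plus the largest shift inside both images.
    """
    r = half_window
    patch1 = img1[y - r:y + r + 1, x - r:x + r + 1].astype(float)
    best_cost, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Window W(x) in image 2, displaced by the candidate (dx, dy)
            patch2 = img2[y + dy - r:y + dy + r + 1,
                          x + dx - r:x + dx + r + 1].astype(float)
            cost = np.sum((patch1 - patch2) ** 2)
            if cost < best_cost:
                best_cost, best_shift = cost, (dx, dy)
    return best_shift, best_cost
```

On a constant image every candidate has the same cost, so the returned shift is arbitrary. That is the blank-wall case of the aperture problem.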

## Caveat: Alternative Motion Models

### Motion Models

**Translational Motion Model:** \[h(\mathbf{x}_1) = \mathbf{x}_1 + \mathbf{\Delta x}\]

**Affine Motion Model:** \[h(\mathbf{x}_1) = A\mathbf{x}_1 + \mathbf{d}\]

**Projective Motion Model:** \[h(\mathbf{x}_1) = H\mathbf{x}_1\] where \(\mathbf{x}_1\) is written in homogeneous coordinates and \(H\in\mathbb{R}^{3\times3}\) is defined up to a scalar factor.
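The three families can be written as small functions, which makes it easy to see how each one generalises the previous. This is a sketch with made-up function names; the projective case assumes the point is lifted to homogeneous coordinates and divided through by the last component afterwards.

```python
import numpy as np

def translate(x, dx):
    """Translational model: h(x) = x + dx."""
    return x + dx

def affine(x, A, d):
    """Affine model: h(x) = A x + d, with A a 2x2 matrix."""
    return A @ x + d

def projective(x, H):
    """Projective model: h(x) = H x in homogeneous coordinates, H a 3x3 matrix."""
    xh = H @ np.append(x, 1.0)   # lift (x, y) to (x, y, 1)
    return xh[:2] / xh[2]        # back to inhomogeneous coordinates
```

Scaling `H` by any non-zero scalar leaves the result unchanged because of the final division, which is what "defined up to a scalar factor" means in practice.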

## Caveat: Intensity Transformation

### Intensity Transformation

- Need to accept changes to the intensity

\[I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1)) + n(h(\mathbf{x}_1))\]

- Occlusions
- Non-Lambertian reflection
- Taken at different time? Different ambient light?

### Infinitesimal Model

- Consider simple translational model \[I_1(\textbf{x})= I_2(h(\textbf{x}))= I_2(\textbf{x}+\Delta\textbf{x})\]
- Consider infinitesimally small \(\Delta\mathbf{x}\)
- Model on a time axis
  - two images taken infinitesimally close in time
  - … under motion

- First write \(\mathbf{\Delta x} = \mathbf{u}dt\), and rewrite the brightness constancy \[I(\mathbf{x}(t),t) = I(\mathbf{x}(t)+\mathbf{u}dt,t+dt)\]
- Apply Taylor series expansion and ignore higher-order terms \[I(\mathbf{x}(t)+\mathbf{u}dt,t+dt) \approx I(\mathbf{x}(t),t) + \nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathbf{u}\,dt + I_t(\mathbf{x}(t),t)\,dt\] where \[\nabla I(\mathbf{x},t) = \begin{bmatrix} I_x(\mathbf{x},t)\\ I_y(\mathbf{x},t) \end{bmatrix} = \begin{bmatrix}\frac{\partial I}{\partial x}(\mathbf{x},t)\\ \frac{\partial I}{\partial y}(\mathbf{x},t) \end{bmatrix} \in\mathbb{R}^2\] and \[I_t(\mathbf{x},t) = \frac{\partial I(\mathbf{x},t)}{\partial t}\in \mathbb{R}\]
- Simplify: insert into the brightness constancy, cancel \(I(\mathbf{x}(t),t)\) on both sides and divide by \(dt\) \[\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathbf{u} + I_t(\mathbf{x}(t),t) = 0\]

*Brightness Constancy Constraint* for the simplest possible continuous model

- Two applications
  - optical flow: fix a position \(\mathbf x\) and consider particles passing through
  - feature tracking: fix a particle \(\mathbf{x}(t)\) and track it through space

### Solving for \(\textbf{u}\)

- Consider the equation \[\nabla I^\mathrm{T}\mathbf{u} + I_t = 0\]
- This is our **target equation**.
- There are infinitely many solutions, the *aperture problem*
  - one scalar equation with two unknowns (\(\mathbf{u}\) is a 2D vector)

- We can solve for the component in the direction of the gradient though

- Scalar projection of \(\mathbf u\) onto \(\nabla I\). \[\frac{\nabla I^\mathrm{T}\mathbf{u}}{||\nabla I||} = - \frac{I_t}{||\nabla I||} \]
- Multiplying by \(\nabla I/||\nabla I||\), we get the vector projection: \[\mathbf u_n = \frac{\nabla I^\mathrm{T}\mathbf{u}}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} = - \frac{I_t}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} \]
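The vector projection \(\mathbf u_n\) can be evaluated directly from the gradient and the time derivative at a single pixel. The sketch below is an illustration only; the name `normal_flow` and the guard `eps` for a vanishing gradient are assumptions, not part of the derivation.

```python
import numpy as np

def normal_flow(grad, I_t, eps=1e-12):
    """Component of the motion along the gradient: u_n = -I_t * grad / ||grad||^2."""
    grad = np.asarray(grad, dtype=float)
    norm_sq = grad @ grad
    if norm_sq < eps:
        # No gradient, no information: return zero motion (an assumed convention).
        return np.zeros(2)
    return -I_t * grad / norm_sq
```

For a true motion \(\mathbf u=(1,2)\) and a purely horizontal gradient \((3,0)\), the target equation gives \(I_t=-3\) and the function recovers \((1,0)\). The vertical component of the motion is invisible.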

### Least squared errors estimate

- Consider again the *target equation* \[\nabla I^\mathrm{T}\mathbf{u} + I_t = 0\]
  - Equality is unrealistic - too much noise
  - Minimise instead the error: \[E=\nabla I^\mathrm{T}\mathbf{u} + I_t\]

- Integrate over a window with sufficient texture
  - allows us to estimate \(\mathbf{u}\) in two dimensions
- Minimise the sum of squared errors over the window: \[E_b(\mathbf{u}) = \sum_{W(\mathbf{x})} [\nabla I^T(\tilde{\mathbf{x}},t)\mathbf{u}(\mathbf{x})+I_t(\tilde{\mathbf{x}},t)]^2\]

- The solution is standard optimisation by calculus
- cf. least squares regression in statistics
- cf. extremal points in first-year calculus

- Differentiate \[\nabla E_b(\mathbf{u}) = 2\sum_{W(\mathbf{x})} \nabla I [\nabla I^T\mathbf{u}+I_t]\]
- Spelling out the matrices, we have \[\nabla E_b(\mathbf{u}) = 2\sum_{W(\mathbf{x})} \bigg(\begin{bmatrix} I_x^2 & I_xI_y \\ I_xI_y & I_y^2\end{bmatrix}\mathbf{u} + \begin{bmatrix} I_xI_t \\ I_yI_t\end{bmatrix}\bigg)\]
- To minimise \(E_b\), the derivative should be zero \[0 = \begin{bmatrix} \sum I_x^2 & \sum I_xI_y \\ \sum I_xI_y & \sum I_y^2 \end{bmatrix}\mathbf{u} + \begin{bmatrix} \sum I_xI_t \\ \sum I_yI_t\end{bmatrix}\]
- We refer to the matrix as \(G\) and to the vector as \(\mathbf{b}\), so that \[G\mathbf{u} + \mathbf{b} = 0\]
- If \(G\) is non-singular, we have \[\mathbf{u} = - G^{-1}\mathbf{b}\]
- This gives us the motion vector \(\mathbf{u}\)
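The link to least squares regression can be checked numerically. The sketch below assumes the derivatives inside the window are already available as flat arrays `Ix`, `Iy`, `It` (hypothetical names), and compares the closed form \(-G^{-1}\mathbf{b}\) with a generic least squares solver applied to the stacked target equations.

```python
import numpy as np

def flow_normal_equations(Ix, Iy, It):
    """Solve G u + b = 0, with G and b summed over the window."""
    G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return -np.linalg.solve(G, b)   # raises LinAlgError if G is singular

def flow_lstsq(Ix, Iy, It):
    """Same estimate as a regression: one row [Ix, Iy] u = -It per pixel."""
    A = np.column_stack([Ix, Iy])
    u, *_ = np.linalg.lstsq(A, -It, rcond=None)
    return u
```

Both functions return the same \(\mathbf{u}\) whenever \(G\) is non-singular, since \(G = A^\mathrm{T}A\) and \(\mathbf{b} = A^\mathrm{T}I_t\) are exactly the normal equations of the regression.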

## Algorithm (4.1 of Ma 2004)

Compute \[G(\mathbf{x}) = \begin{bmatrix} \sum I_x^2 & \sum I_xI_y \\ \sum I_xI_y & \sum I_y^2 \end{bmatrix}\] and \[b(\mathbf{x},t) = \begin{bmatrix} \sum I_xI_t \\ \sum I_yI_t\end{bmatrix}\] at every pixel \(\mathbf{x}\).

**Feature tracking** choose feature points \(x_1,x_2,\ldots\) where \(G(\mathbf{x})\) is invertible

**Optical flow** choose points \(x_1,x_2,\ldots\) on a fixed grid

\[\mathbf{u}(x,t) = \begin{cases} - G^{-1}b&\quad\text{if defined}\\ 0&\quad\text{otherwise} \end{cases} \]

**Feature tracking** at time \(t+1\) replace point \(x\) by \(x+\mathbf{u}(x,t)\)

**Optical flow** at time \(t+1\) repeat operation at the same point \(x\)
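One step of the feature tracking variant might look as follows. This is a sketch, not the textbook's implementation: it assumes greyscale frames as 2D float arrays, uses `np.gradient` (central differences) for the spatial derivatives, and tests invertibility of \(G\) with a threshold `min_eig` on the smallest eigenvalue. The names `track_point`, `half_window` and `min_eig` are made up for the example.

```python
import numpy as np

def track_point(img1, img2, point, half_window=7, min_eig=1e-6):
    """One differential tracking step for an integer pixel point = (x, y) in img1.

    Returns the updated position x + u and the motion vector u.
    """
    I1 = np.asarray(img1, dtype=float)
    I2 = np.asarray(img2, dtype=float)
    Iy, Ix = np.gradient(I1)         # axis 0 is rows (y), axis 1 is columns (x)
    It = I2 - I1                     # first-order difference in time
    x, y = point
    r = half_window
    win = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()
    G = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    b = np.array([ix @ it, iy @ it])
    if np.linalg.eigvalsh(G)[0] <= min_eig:
        u = np.zeros(2)              # G not (safely) invertible: u = 0
    else:
        u = -np.linalg.solve(G, b)
    return np.array([x, y], dtype=float) + u, u
```

For optical flow, the same computation would be repeated at fixed grid points in every frame instead of feeding the updated position into the next step.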

### Practical implementation

- Typically, the Sobel filter is appropriate for the spatial derivatives \(\nabla I\)
- On the time axis, it is common to use a first order approximation (difference) for the derivative \(I_t\)
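A NumPy-only sketch of these two derivative estimates is given below. It assumes replicated borders and a Sobel kernel normalised by 8, so that a unit ramp gives a derivative of 1 per pixel; the function names are made up for the example, and library implementations may differ in scaling and border handling.

```python
import numpy as np

def sobel_gradients(img):
    """Spatial derivatives (Ix, Iy) with the 3x3 Sobel filter, scaled by 1/8."""
    I = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    # Ix: difference [-1, 0, 1] along columns, smoothing [1, 2, 1] along rows
    Ix = ((I[:-2, 2:] + 2 * I[1:-1, 2:] + I[2:, 2:])
          - (I[:-2, :-2] + 2 * I[1:-1, :-2] + I[2:, :-2])) / 8.0
    # Iy: difference [-1, 0, 1] along rows, smoothing [1, 2, 1] along columns
    Iy = ((I[2:, :-2] + 2 * I[2:, 1:-1] + I[2:, 2:])
          - (I[:-2, :-2] + 2 * I[:-2, 1:-1] + I[:-2, 2:])) / 8.0
    return Ix, Iy

def temporal_difference(img1, img2):
    """First-order approximation of I_t between two consecutive frames."""
    return np.asarray(img2, dtype=float) - np.asarray(img1, dtype=float)
```

Away from the border, the image \(I(x,y)=2x+3y\) gives \(I_x=2\) and \(I_y=3\) exactly, which is a quick sanity check on the signs and the scaling.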

## Challenge: Project

In a week or two, we will dedicate three sessions to a more substantial project. This includes one self-study session on 12 October, with one session before it and one session after it.

The goal of the project is to make a prototype of an object tracker that can identify an object in a video stream and track it, frame by frame.

One approach for this project is the one we have introduced in this session, building on the last one. First, detect the corners and then track them differentially in the next frame. This is not the only approach, however, and in practice one may have to combine several techniques to get a robust solution. It is a good idea to start thinking about the project at this point, and if you have time, implement the feature point tracker sooner rather than later.