# Tracking Features Lecture

## Changes from 3ae5ef8c3c2013e6a19037bb1b2159ce012871a3 to 0d33616a0ab52350e4c7ebfce26d29de8e17411e

---
title: Tracking Features
categories: lecture
---

- aperture problem [page 78]

# Briefing

## Corners

![Universitetsområdet i Ålesund](Images/ntnuaes1.jpg)

![Universitetsområdet i Ålesund (ny vinkel)](Images/ntnuaes2.jpg)

+ What are distinctive points in the image?
+ Distinctive points can (to some extent) be matched in two different images.

[More Images](https://www.flickr.com/photos/ntnu-trondheim/collections/72157632165205007/)

## Corner Correspondence

+ Two images of the same scene
$I_1,I_2: \Omega\subset\mathbb{R}^2\to\mathbb{R}_+ ; \mathbf{x}\mapsto I_1(\mathbf{x}),I_2(\mathbf{x})$
+ Different in general

*Why are they different?*

+ Firstly - different colour in different directions
+ Lambertian assumption
+ Secondly - noise
+ let's assume this is insignificant
+ Thirdly - same point in different positions
+ different image points $\mathbf{x}_1,\mathbf{x}_2$
correspond to the same 3D point $p$

**Note** Two approaches:

1.  **Today** Motion modelling using differentiation
+ adequate for slow motion, say 2-3 pixels per frame
+ allows for subpixel accuracy
2.  *Next Session:* feature descriptors

## Motion Model

Suppose we have a feature point $\mathbf{x}_1$ in Image 1.
The corresponding feature $\mathbf{x}_2$ in Image 2 is the result of
a movement $h:\mathbb{R}^2\to\mathbb{R}^2$ so that
$\mathbf{x}_2=h(\mathbf{x}_1)$.

### Brightness Constancy Constraint

+  Suppose we photograph empty space except for a single point $p$
- *Brightness Constancy Constraint*

$$I_1(\mathbf{x}_1) = I_2(\mathbf{x}_2) \sim \mathcal{R}(p)$$

+  Simple dislocation from $\mathbf{x}_1$ to $\mathbf{x}_2$
+  Motion: $h: \mathbf{x}_1\mapsto\mathbf{x}_2$ so that
$$\forall \mathbf{x}_1\in\Omega\cap h^{-1}(\Omega)\subset\mathbb{R}^{2}, \;I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1))$$

## Feature Tracking

+ Estimator
$$\hat h = \arg\min_h\sum_{\tilde{\mathbf{x}}\in W(\mathbf{x})} ||I_1(\tilde{\mathbf{x}})-I_2(h(\tilde{\mathbf{x}}))||^2$$

+ The window, or aperture, $W(\vec{x})$
+ Choose $h$ from a family of functions, parameterised by $\alpha$
+ translational: $\alpha=\Delta\mathbf{x}$
+ affine: $\alpha=\{A,\mathbf{d}\}$
+ **Aperture problem:** cannot distinguish points on a blank wall

<details>
<summary>Caveat: AlternativeMotion Models</summary>
### Motion Models

+ **Translational Motion Model:**
$$h(\mathbf{x}_1) = \mathbf{x}_1 + \mathbf{\Delta x}$$
+ **Affine Motion Model:**
$$h(\mathbf{x}_1) = A\mathbf{x}_1 + \mathbf{d}$$
+ **Projective Motion Model:**
$$h(\mathbf{x}_1) = H\mathbf{x}_1$$
where
$H\in\mathbb{R}^{3\times3}$ is defined up to a scalar factor.
</details>

<details>
<summary>Caveat: Intencity Transformation</summary>
### Intencity Transformation

+ Need to accept changes to the intencity

$$I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1)) + n(h(\mathbf{x}_1))$$

+ Occlusions
+ Non-Lambertian reflection
+ Taken at different time?  Different ambient light?
</details>

## Feature Tracking

+ Estimator
$$\hat h = \arg\min_h\sum_{\tilde{\mathbf{x}}\in W(\mathbf{x})} ||I_1(\tilde{\mathbf{x}})-I_2(h(\tilde{\mathbf{x}}))||^2$$

+ The window, or aperture, $W(\vec{x})$
+ Choose $h$ from a family of functions, parameterised by $\alpha$
+ translational: $\alpha=\Delta\mathbf{x}$
+ affine: $\alpha=\{A,\mathbf{d}\}$
+ **Aperture problem:** cannot distinguish points on a blank wall

### Infinitesimal Model

+ Consider simple translational model
$$I_1(\textbf{x})= I_2(h(\textbf{x}))= I_2(\textbf{x}+\Delta\textbf{x})$$
+ Consider infitesimally small $\Delta\textbf x$
+ Model on a time axis
+ two images taken infinitesimally close in time
+ ... under motion

1. First write $\mathbf{\Delta x} = \mathbf{u}dt$, and
rewrite the brightness constancy
$$I(\mathbf{x}(t),t) = I(\mathbf{x}(t)+\mathbf{u}dt,t+dt)$$
2.  Apply Taylor series expansion and ignore higher-order terms
$$\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathrm{u}dt + I_t(\mathbf{x}(t),t)dt = 0$$
$$\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathrm{u} + I_t(\mathbf{x}(t),t) = 0$$
where
$$\nabla I(\mathbf{x},t) = \begin{bmatrix} I_x(\mathbf{x},t)\\ I_y(\mathbf{x},t) \end{bmatrix} = \begin{bmatrix}\frac{\partial I}{\partial x}(\mathbf{x},t)\\ \frac{\partial I}{\partial y}(\mathbf{x},t) \end{bmatrix} \in\mathbb{R}^2$$
and
$$I_t(\mathbf{x},t) = \frac{\partial I(\mathbf{x},t)}{\partial t}\in \mathbb{R}$$
3. Simplify
$$\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathrm{u} + I_t(\mathbf{x}(t),t) = 0$$

+ *Brightness Constancy Constraint* for the simplest possible continuous model
+ Two applications
- optical flow: fix a position $\mathbf x$ and consider particles passing through
- feature tracking: fix a particle $x(t)$ an track it through space

### Solving for $\textbf{u}$

+ Consider the equation
$$\nabla I^\mathrm{T}\mathrm{u} + I_t = 0$$
+ There are infititly many solutions, due to the *aperture problem*
+ We can solve for the component in the direction of the gradient though
+ This is our **target equation**.
+ There are infititly many solutions,
+ *aperture problem*
+ one scalar equation with two unknowns ($\mathrm{u}$ is a 2D vector)
+ We can solve for the component in the direction of the gradient though

1. Scalar projection of $\mathbf u$ onto $\nabla I$.
$$\frac{\nabla I^\mathrm{T}\mathrm{u}}{||\nabla I||} = - \frac{I_t}{||\nabla I||}$$
2. Multiplying by $\nabla I/||\nabla I||$, we get the vector projection:
$$\mathbf u_n = \frac{\nabla I^\mathrm{T}\mathrm{u}}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} = - \frac{I_t}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||}$$

### Least squared errors estimate

+ Consider again the *target equation*
$$\nabla I^\mathrm{T}\mathrm{u} + I_t = 0$$
+ Equality is unrealistic - too much noise
+ Minimise instead the error: $$E=\nabla I^\mathrm{T}\mathrm{u} + I_t$$
+ Integrate over a window with sufficient texture
+ allows us to estimate $u$ in two dimensions
+ too many approximations for exact solution
+ Minimise the sum of squared errors:
$$E_b(\mathbf{u}) = \sum_{W(\mathbf{x})} [\nabla I^T(\tilde{\mathbf{x}}(\mathbf{u}(\mathbf{x})+I_t(\tilde{\mathbf{x}},t)]^2$$
+ Minimise the sum of squared errors over the window:
$$E_b(\mathbf{u}) = \sum_{W(\mathbf{x})} [\nabla I^T(\tilde{\mathbf{x}},t)\mathbf{u}(\mathbf{x})+I_t(\tilde{\mathbf{x}},t)]^2$$
+ The solution is standard optimisation by calculus
+ cf. least squares regression in statistics
+ cf. extremal points in first-year calculus
+ Differentiate
$$\nabla E_b(\mathbf{u}) = 2\sum_{W(\mathbf{x})} \nabla I [\nabla I^T\mathbf{u}+I_t]$$
+ Spelling out the matrices, we have
$$\nabla E_b(\mathbf{u}) = 2\sum_{W(\mathbf{x})} \bigg(\begin{bmatrix} I_x^2 & I_xI_y \\ I_xI_y & I_y^2\end{bmatrix}\mathbf{u} + \begin{bmatrix} I_xI_t \\ I_yI_t\end{bmatrix}\bigg)$$
+ To minimise $E_b$, the derivative should be zero
$$0 = \begin{bmatrix} \sum I_x^2 & \sum I_xI_y \\ \sum I_xI_y & \sum I_y^2 \end{bmatrix}\mathbf{u} + \begin{bmatrix} \sum I_xI_t \\ \sum I_yI_t\end{bmatrix}$$
+ We refer to the first matrix as $G$, so that
$$G\mathbf{u} + \mathbf{b} = 0$$
+ If $G$ is non-singular, we have
$$\mathbf{u} = - G^{-1}\mathbf{b}$$
+ This gives us the motion vector $\mathbf{u}$

## Algorithm (4.1 of Ma 2004)

Compute
$$G(\mathbf{x}) = \begin{bmatrix} \sum I_x^2 & \sum I_xI_y \\ \sum I_xI_y & \sum I_y^2 \end{bmatrix}$$
and
$$b(\mathbf{x},t) = \begin{bmatrix} \sum I_xI_t \\ \sum I_yI_t\end{bmatrix}$$
at every pixel $\mathbf{x}$.

+ **Feature tracking** choose feature points $x_1,x_2,\ldots$
where $G(\mathbf{x})$ is invertible
+ **Optical flow** choose points $x_1,x_2,\ldots$ on a fixed grid

$$\mathbf{u}(x,t) = \begin{cases} - G^{-1}b&\quad\text{if defined}\\ 0&\quad\text{otherwise} \end{cases}$$

+ **Feature tracking** at time $t+1$ replace point $x$ by
$x+\mathbf{u}(x,t)$
+ **Optical flow** at time $t+1$ repeat operation at the same point $x$

### Practical implementation

+ Typically, the Sobel filter is appropriate for the spatial derivatives $\nabla I$
+ On the time axis, it is common to use a first order approximation (difference)
for the derivative $I_t$