---
title: Tracking Features
categories: lecture
---
- aperture problem [page 78]
# Briefing
## Corners
![The university campus in Ålesund](Images/ntnuaes1.jpg)
![The university campus in Ålesund (new angle)](Images/ntnuaes2.jpg)
+ What are distinctive points in the image?
+ Distinctive points can (to some extent) be matched in two different images.
[More Images](https://www.flickr.com/photos/ntnu-trondheim/collections/72157632165205007/)
## Corner Correspondence
+ Two images of the same scene
$I_1,I_2: \Omega\subset\mathbb{R}^2\to\mathbb{R}_+ ; \mathbf{x}\mapsto I_1(\mathbf{x}),I_2(\mathbf{x})$
+ Different in general
*Why are they different?*
+ Firstly, the same surface may reflect different colour (intensity) in different directions
    + we avoid this with the Lambertian assumption
+ Secondly, noise
    + let's assume this is insignificant
+ Thirdly, the same point appears in different positions
    + different image points $\mathbf{x}_1,\mathbf{x}_2$
      correspond to the same 3D point $p$
**Note** Two approaches:
1. **Today:** motion modelling using differentiation
+ adequate for slow motion, say 2-3 pixels per frame
+ allows for subpixel accuracy
2. *Next Session:* feature descriptors
## Motion Model
Suppose we have a feature point $\mathbf{x}_1$ in Image 1.
The corresponding feature $\mathbf{x}_2$ in Image 2 is the result of
a movement $h:\mathbb{R}^2\to\mathbb{R}^2$ so that
$\mathbf{x}_2=h(\mathbf{x}_1)$.
### Brightness Constancy Constraint
+ Suppose we photograph empty space except for a single point $p$
- *Brightness Constancy Constraint*
$$I_1(\mathbf{x}_1) = I_2(\mathbf{x}_2) \sim \mathcal{R}(p)$$
+ Simple dislocation from $\mathbf{x}_1$ to $\mathbf{x}_2$
+ Motion: $h: \mathbf{x}_1\mapsto\mathbf{x}_2$ so that
$$\forall
\mathbf{x}_1\in\Omega\cap h^{-1}(\Omega)\subset\mathbb{R}^{2},
\;I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1))$$
## Feature Tracking
+ Estimator (a brute-force sketch follows after this list)
$$\hat h = \arg\min_h\sum_{\tilde{\mathbf{x}}\in W(\mathbf{x})}
||I_1(\tilde{\mathbf{x}})-I_2(h(\tilde{\mathbf{x}}))||^2$$
+ The window, or aperture, $W(\mathbf{x})$
+ Choose $h$ from a family of functions, parameterised by $\alpha$
    + translational: $\alpha=\Delta\mathbf{x}$
    + affine: $\alpha=\{A,\mathbf{d}\}$
+ **Aperture problem:** cannot distinguish points on a blank wall
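As a concrete, if naive, illustration of this estimator: the sketch below minimises the squared error over integer translations within a small search range, using only NumPy. The function name `track_translation`, the window size, and the search range are illustrative choices, not part of the lecture material.

```python
import numpy as np

def track_translation(I1, I2, x, half_window=7, search=5):
    """Brute-force SSD minimisation over integer translations of W(x).
    Assumes the window and search range stay inside both images."""
    r, c = x                                    # feature position (row, column)
    W1 = I1[r - half_window:r + half_window + 1,
            c - half_window:c + half_window + 1].astype(float)
    best_err, best_d = np.inf, (0, 0)
    for dr in range(-search, search + 1):       # candidate displacements
        for dc in range(-search, search + 1):
            W2 = I2[r + dr - half_window:r + dr + half_window + 1,
                    c + dc - half_window:c + dc + half_window + 1].astype(float)
            err = np.sum((W1 - W2) ** 2)        # sum of squared errors over W(x)
            if err < best_err:
                best_err, best_d = err, (dr, dc)
    return best_d                               # estimated (Δrow, Δcolumn)
```

This only recovers integer displacements; the differential (infinitesimal) model developed below is what gives subpixel accuracy.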
<details>
<summary>Caveat: Alternative Motion Models</summary>
### Motion Models
+ **Translational Motion Model:**
$$h(\mathbf{x}_1) = \mathbf{x}_1 + \mathbf{\Delta x}$$
+ **Affine Motion Model:**
$$h(\mathbf{x}_1) = A\mathbf{x}_1 + \mathbf{d}$$
+ **Projective Motion Model:**
$$h(\mathbf{x}_1) = H\mathbf{x}_1$$
where
$H\in\mathbb{R}^{3\times3}$ acts on homogeneous coordinates and is defined up to a scalar factor.
(A small code sketch of these models follows after this box.)
</details>
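The three motion models in the box above can be written directly as functions of their parameters. A minimal sketch (my own illustration; the projective model is applied in homogeneous coordinates):

```python
import numpy as np

def translate(x, dx):
    """Translational model: h(x) = x + dx."""
    return x + dx

def affine(x, A, d):
    """Affine model: h(x) = A x + d."""
    return A @ x + d

def projective(x, H):
    """Projective model: apply H in homogeneous coordinates and normalise.
    H is a 3x3 matrix, only defined up to a scalar factor."""
    xh = H @ np.append(x, 1.0)
    return xh[:2] / xh[2]
```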
<details>
<summary>Caveat: Intensity Transformation</summary>
### Intensity Transformation
+ Need to accept changes to the intensity
$$I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1)) + n(h(\mathbf{x}_1))$$
+ Occlusions
+ Non-Lambertian reflection
+ Taken at different times? Different ambient light?
</details>
### Infinitesimal Model
+ Consider simple translational model
$$I_1(\textbf{x})= I_2(h(\textbf{x}))= I_2(\textbf{x}+\Delta\textbf{x})$$
+ Consider an infinitesimally small $\Delta\mathbf{x}$
+ Model on a time axis
+ two images taken infinitesimally close in time
+ ... under motion
1. First write $\mathbf{\Delta x} = \mathbf{u}dt$, and
rewrite the brightness constancy
$$I(\mathbf{x}(t),t) = I(\mathbf{x}(t)+\mathbf{u}dt,t+dt)$$
2. Apply a Taylor series expansion and ignore higher-order terms
$$\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathbf{u}\,dt + I_t(\mathbf{x}(t),t)\,dt = 0$$
where
$$\nabla I(\mathbf{x},t) = \begin{bmatrix} I_x(\mathbf{x},t)\\ I_y(\mathbf{x},t) \end{bmatrix}
= \begin{bmatrix}\frac{\partial I}{\partial x}(\mathbf{x},t)\\ \frac{\partial I}{\partial y}(\mathbf{x},t) \end{bmatrix}
\in\mathbb{R}^2$$
and
$$I_t(\mathbf{x},t) = \frac{\partial I(\mathbf{x},t)}{\partial t}\in \mathbb{R}$$
3. Simplify by dividing through by $dt$
$$\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathbf{u} + I_t(\mathbf{x}(t),t) = 0$$
+ *Brightness Constancy Constraint* for the simplest possible continuous model
+ Two applications (a small numerical check follows this list)
    - optical flow: fix a position $\mathbf x$ and consider particles passing through
    - feature tracking: fix a particle $\mathbf{x}(t)$ and track it through space
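To see the constraint in action, the sketch below constructs a smooth synthetic image translating with a known velocity $\mathbf{u}$, approximates $I_x$, $I_y$, $I_t$ by finite differences, and checks that $\nabla I^\mathrm{T}\mathbf{u} + I_t$ is close to zero. The test pattern and all names are my own illustrative choices.

```python
import numpy as np

def intensity(x, y, t, u=(0.3, 0.2)):
    """Smooth pattern translating with constant velocity u (pixels/frame)."""
    return np.sin(0.2 * (x - u[0] * t)) + np.cos(0.3 * (y - u[1] * t))

ys, xs = np.mgrid[0:64, 0:64].astype(float)
I0 = intensity(xs, ys, t=0.0)
I1 = intensity(xs, ys, t=1.0)

# Central differences for the spatial gradient, forward difference in time.
Iy, Ix = np.gradient(I0)
It = I1 - I0

u = np.array([0.3, 0.2])                 # the true motion (x, y)
residual = Ix * u[0] + Iy * u[1] + It    # should be close to zero
print(np.abs(residual).mean())           # small, limited by the discretisation
```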
### Solving for $\textbf{u}$
+ Consider the equation
$$\nabla I^\mathrm{T}\mathbf{u} + I_t = 0$$
+ This is our **target equation**.
+ There are infinitely many solutions
    + the *aperture problem*
    + one scalar equation with two unknowns ($\mathbf{u}$ is a 2D vector)
+ We can, however, solve for the component of $\mathbf{u}$ in the direction of the gradient
1. Scalar projection of $\mathbf{u}$ onto $\nabla I$:
$$\frac{\nabla I^\mathrm{T}\mathbf{u}}{||\nabla I||} = - \frac{I_t}{||\nabla I||} $$
2. Multiplying by $\nabla I/||\nabla I||$, we get the vector projection (a numerical check follows below):
$$\mathbf u_n =
\frac{\nabla I^\mathrm{T}\mathbf{u}}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} =
- \frac{I_t}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} $$
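A small numeric illustration with arbitrarily chosen values: given $\nabla I$ and $I_t$ at a single pixel, only the component of the motion along the gradient can be recovered.

```python
import numpy as np

grad_I = np.array([0.8, 0.6])   # spatial gradient (I_x, I_y) at one pixel
I_t = -0.5                      # temporal derivative at the same pixel

# Vector projection of the (unknown) motion onto the gradient direction:
# u_n = -(I_t / ||∇I||) * (∇I / ||∇I||)
u_n = -I_t * grad_I / np.dot(grad_I, grad_I)
print(u_n)                      # [0.4, 0.3]; motion across the gradient stays unknown
```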
### Least squares estimate
+ Consider again the *target equation*
$$\nabla I^\mathrm{T}\mathbf{u} + I_t = 0$$
+ Equality is unrealistic - too much noise
+ Minimise instead the error: $$E=\nabla I^\mathrm{T}\mathbf{u} + I_t$$
+ Integrate over a window with sufficient texture
    + allows us to estimate $\mathbf{u}$ in two dimensions
    + too many approximations for an exact solution
+ Minimise the sum of squared errors over the window:
$$E_b(\mathbf{u}) = \sum_{W(\mathbf{x})} [\nabla I^T(\tilde{\mathbf{x}},t)\mathbf{u}(\mathbf{x})+I_t(\tilde{\mathbf{x}},t)]^2$$
+ The solution is standard optimisation by calculus
+ cf. least squares regression in statistics
+ cf. extremal points in first-year calculus
+ Differentiate
$$\nabla E_b(\mathbf{u}) = 2\sum_{W(\mathbf{x})}
\nabla I [\nabla I^T\mathbf{u}+I_t]$$
+ Spelling out the matrices, we have
$$\nabla E_b(\mathbf{u}) = 2\sum_{W(\mathbf{x})}
\bigg(\begin{bmatrix} I_x^2 & I_xI_y \\ I_xI_y & I_y^2\end{bmatrix}\mathbf{u}
+ \begin{bmatrix} I_xI_t \\ I_yI_t\end{bmatrix}\bigg)$$
+ To minimise $E_b$, the derivative should be zero
$$0 =
\begin{bmatrix}
\sum I_x^2 & \sum I_xI_y \\ \sum I_xI_y & \sum I_y^2
\end{bmatrix}\mathbf{u}
+ \begin{bmatrix} \sum I_xI_t \\ \sum I_yI_t\end{bmatrix}$$
+ We refer to the first matrix as $G$ and the second vector as $\mathbf{b}$, so that
$$G\mathbf{u} + \mathbf{b} = 0$$
+ If $G$ is non-singular, we have
$$\mathbf{u} = - G^{-1}\mathbf{b}$$
+ This gives us the motion vector $\mathbf{u}$ (a NumPy sketch of this solve follows below)
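A minimal NumPy sketch of this solve for a single window, assuming the derivative images `Ix`, `Iy`, `It` have already been computed; the function name and threshold are my own illustrative choices.

```python
import numpy as np

def flow_in_window(Ix, Iy, It, r, c, half_window=7):
    """Solve G u = -b for the motion u in the window W(x) centred at (r, c)."""
    win = (slice(r - half_window, r + half_window + 1),
           slice(c - half_window, c + half_window + 1))
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()

    G = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    b = np.array([np.sum(ix * it), np.sum(iy * it)])

    # Only solve where G is safely non-singular (enough texture in the window);
    # the smallest eigenvalue of G is a common quality measure.
    if np.linalg.eigvalsh(G).min() < 1e-6:
        return np.zeros(2)
    return -np.linalg.solve(G, b)    # u = -G^{-1} b
```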
## Algorithm (4.1 of Ma 2004)
Compute
$$G(\mathbf{x}) = \begin{bmatrix}
\sum I_x^2 & \sum I_xI_y \\ \sum I_xI_y & \sum I_y^2
\end{bmatrix}$$
and
$$\mathbf{b}(\mathbf{x},t) = \begin{bmatrix} \sum I_xI_t \\ \sum I_yI_t\end{bmatrix}$$
at every pixel $\mathbf{x}$.
+ **Feature tracking:** choose feature points $\mathbf{x}_1,\mathbf{x}_2,\ldots$
where $G(\mathbf{x})$ is invertible
+ **Optical flow:** choose points $\mathbf{x}_1,\mathbf{x}_2,\ldots$ on a fixed grid
$$\mathbf{u}(\mathbf{x},t) =
\begin{cases}
- G^{-1}\mathbf{b}&\quad\text{if defined}\\
0&\quad\text{otherwise}
\end{cases}
$$
+ **Feature tracking:** at time $t+1$ replace the point $\mathbf{x}$ by
$\mathbf{x}+\mathbf{u}(\mathbf{x},t)$
+ **Optical flow:** at time $t+1$ repeat the operation at the same point $\mathbf{x}$
  (a minimal OpenCV-based sketch follows this list)
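In practice, a pyramidal, iterative version of this algorithm is available in OpenCV as the KLT tracker. A minimal usage sketch, assuming two greyscale frames `frame1` and `frame2` (uint8 arrays) are already loaded; the parameter values are illustrative.

```python
import cv2

# Choose feature points where G(x) is well conditioned (assumes textured frames).
p1 = cv2.goodFeaturesToTrack(frame1, maxCorners=200,
                             qualityLevel=0.01, minDistance=10)

# Track each point into the next frame with pyramidal Lucas-Kanade.
p2, status, err = cv2.calcOpticalFlowPyrLK(frame1, frame2, p1, None,
                                           winSize=(15, 15), maxLevel=2)

ok = status.flatten() == 1           # keep only successfully tracked points
u = p2[ok] - p1[ok]                  # per-feature motion vectors u(x, t)
```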
### Practical implementation
+ Typically, the Sobel filter is appropriate for the spatial derivatives $\nabla I$
+ On the time axis, it is common to use a first-order difference approximation
for the derivative $I_t$ (see the sketch below)
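A sketch of those two derivative choices with OpenCV: the Sobel filter for $\nabla I$ and a plain frame difference for $I_t$. The normalisation factor is my own choice so that the filters approximate actual derivatives; `frame1` and `frame2` are assumed to be consecutive greyscale frames.

```python
import cv2
import numpy as np

I1 = frame1.astype(np.float32)
I2 = frame2.astype(np.float32)

# Spatial derivatives of the first frame via the 3x3 Sobel filter;
# dividing by 8 scales the kernel to approximate dI/dx and dI/dy.
Ix = cv2.Sobel(I1, cv2.CV_32F, 1, 0, ksize=3) / 8.0
Iy = cv2.Sobel(I1, cv2.CV_32F, 0, 1, ksize=3) / 8.0

# First-order (forward) difference on the time axis.
It = I2 - I1
```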