Revision b3baef66a9f9061187daf16d2afb75807ab15953

Tracking Features

Changes from b3baef66a9f9061187daf16d2afb75807ab15953 to current

---
title: Tracking Features
categories: session
---

- aperture problem [page 78]

# Briefing

## Corners

![Universitetsområdet i Ålesund](Images/ntnuaes1.jpg)

![Universitetsområdet i Ålesund (ny vinkel)](Images/ntnuaes2.jpg)

+ What are distinctive points in the image?
+ Distinctive points can (to some extent) be matched in two different images.

[More Images](https://www.flickr.com/photos/ntnu-trondheim/collections/72157632165205007/)

## Corner Correspondence

+ Two images of the same scene $I_1,I_2: \Omega\subset\mathbb{R}^2\to\mathbb{R}_+ ; \mathbf{x}\mapsto I_1(\mathbf{x}),I_2(\mathbf{x})$
+ Different in general 

*Why are they different?*

## Brightness Constancy Constraint

+  Suppose we photograph empty space except for a single point $p$
    - *Brightness Constancy Constraint*

$$I_1(\mathbf{x}_1) = I_2(\mathbf{x}_2) \sim \mathcal{R}(p)$$

+  Simple dislocation from $\mathbf{x}_1$ to $\mathbf{x}_2$ 
+  Problem: Globally, it is an infinite-dimensional transformation
+  Motion: $h: \mathbf{x}_1\mapsto\mathbf{x}_2$
    + so that $I_1(\mathbf{x}_1) = I_2(h(\mathbf{x}_1)) \quad\forall
      \mathbf{x}_1\in\Omega\cap h^{-1}(\Omega)\subset\mathbb{R}^{2}$
      
**Date** 24 September 2021

**Briefing** [Tracking Features Lecture]()

## Motion Models
# Exercises

+ Affine Motion Model: $h(\mathbf{x}_1) = A\mathbf{x}_1 + \mathbf{d}$
+ Projective Motion Model: $h(\mathbf{x}_1) = H\mathbf{x}_1$ where
  $H\in\mathbb{R}^{3\times3}$ is defined up to a scalar factor.
+ Need to accept changes to the intensity
## Algebraic Formulation

## Aperture Problem
+ Exercise 4.1 (Ma 2004)
    + Let $\mathbf{X}$ be a 3D point and $\mathbf{x}$ its projection 
    + Consider the image motion $h(\mathbf{x}) = \mathbf{x}+\Delta\mathbf{x}$
    + What transformation $(R,T)$ must the scene undergo to realise
      the given $h(\mathbf{x})$? 
    + *Hint*
        + We remember that $(R,T)$ has six degrees of freedom in general.
        + To support this given $h$, some of these six degrees have to be
          fixed.  Which ones?

## Experimental View 

$$\hat h = \arg\min_h\sum_{\tilde{\mathbf{x}}\in W(\mathbf{x})} ||I_1(\tilde{\mathbf{x}})-I_2(h(\tilde{\mathbf{x}}))||^2$$
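
The minimisation over a window can be illustrated with a minimal sketch, assuming a purely translational $h$ and integer displacements. The function name `ssd_translation` and the parameters `half` (half the window width) and `search` (the search radius) are hypothetical names chosen for this illustration. This is a brute-force search, not an efficient tracker.

```python
import numpy as np

def ssd_translation(I1, I2, x, y, half=3, search=5):
    """Brute-force search for the integer shift (dx, dy) that minimises
    the sum of squared differences between the window W(x) in I1 and
    the shifted window in I2.  `half` and `search` are illustrative."""
    size = 2 * half + 1
    window1 = I1[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = y + dy - half, x + dx - half
            if y0 < 0 or x0 < 0:
                continue  # shifted window starts outside the image
            window2 = I2[y0:y0 + size, x0:x0 + size].astype(float)
            if window2.shape != window1.shape:
                continue  # shifted window ends outside the image
            cost = np.sum((window1 - window2) ** 2)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost

# Synthetic test image: random texture shifted 3 pixels right and 2 down.
rng = np.random.default_rng(0)
I1 = rng.random((40, 40))
I2 = np.roll(I1, shift=(2, 3), axis=(0, 1))
shift, cost = ssd_translation(I1, I2, x=20, y=20)
print(shift, cost)
```

On a constant image every candidate shift gives zero cost, so the minimiser is not unique. This is the blank wall case.
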
Consider the image of a square, say, with corners at 
$$(0,0); (10,0); (0,10); (10,10),$$
and visualise the results of different motions.
You can programme this in Python, or do the calculations by hand.
Given that it is 2D, hand calculation is not impractical.
Python is faster when you know how, but you need to look up a 2D API
instead of reusing the 3D functions we have used for previous motion
exercises.

+ The window, or aperture, $W(\mathbf{x})$
+ A window on a blank wall cannot distinguish between points
For each of the following models, apply the transformation and plot the result.

+ Choose $h$ from a family of functions, parameterised by $\alpha$
    + translational: $\alpha=\Delta\mathbf{x}$
    + affine: $\alpha=\{A,\mathbf{d}\}$
1.  Translation $\Delta x$, e.g. $\Delta x=(6,4)$
1.  Affine: $x \mapsto Ax + d$ for an invertible matrix $A$ and a translation vector $d$.
    E.g.
    $$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
    \quad d = (4,10)$$
2.  Projective: $x \mapsto Hx$ where $x$ is in homogeneous co-ordinates and $H$ is defined up
    to a scalar factor. E.g.
    $$H = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
    Note that the resulting point is given in homogeneous co-ordinates,
    and we need to scale it so that the third co-ordinate is 1, i.e.
    $(x,y,z)\mapsto(x/z,y/z)$.
3.  If you find a way to do this in Python, you should try multiple transformations to see 
    how they behave in practice.
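
One possible way to do the calculations in Python is a minimal sketch with NumPy, using the example values given above. The function names `translate`, `affine` and `projective` are hypothetical helpers written for this illustration, not part of any library.

```python
import numpy as np

# Corners of the square, one point per row, ordered around the boundary.
square = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=float)

def translate(points, delta):
    """Translation: x -> x + delta."""
    return points + np.asarray(delta, dtype=float)

def affine(points, A, d):
    """Affine motion: x -> A x + d, applied to each row of `points`."""
    return points @ np.asarray(A, dtype=float).T + np.asarray(d, dtype=float)

def projective(points, H):
    """Projective motion: x -> H x in homogeneous co-ordinates,
    followed by scaling so that the third co-ordinate is 1."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]

translated = translate(square, (6, 4))
sheared = affine(square, [[1, 1], [0, 1]], (4, 10))
warped = projective(square, [[1, 1, 0], [0, 1, 1], [1, 1, 1]])

for name, pts in [("translation", translated),
                  ("affine", sheared),
                  ("projective", warped)]:
    print(name)
    print(pts)
```

Each result is an array of corner co-ordinates in the same order as `square`, so the outline can be drawn by joining consecutive rows, for instance with matplotlib.
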

## Feature Tracking
## If you have time 

$$I_1(\textbf{x})= I_2(h(\textbf{x}))= I_2(\textbf{x}+\Delta\textbf{x})$$
+ Exercises 4.2, 4.3 (Ma 2004)
+ Exercises 4.4 (Ma 2004)
+ Alternatively, you may start implementing the feature tracker in Python/OpenCV
  or another language.

+ Consider infinitesimally small $\Delta\textbf x$
# Debrief

## Infinitesimal Model
## Exercise 4.1 (Ma 2004)

+ Model on a time axis - two images taken infinitesimally close in time
    + ... under motion
**Note** We discussed the solution in the debrief session,
and this discussion is far more instructive than the algebraic
solution outlined below.  Yet, this sketch may add some other 
insights.


$$I(\mathbf{x}(t),t) = I(\mathbf{x}(t)+\mathbf{u}\,dt,t+dt)$$

$$\nabla I(\mathbf{x}(t),t)^\mathrm{T}\mathbf{u} + I_t(\mathbf{x}(t),t) = 0$$

$$\nabla I(\mathbf{x},t) = \begin{bmatrix} I_x(\mathbf{x},t)\\ I_y(\mathbf{x},t) \end{bmatrix}
= \begin{bmatrix}\frac{\partial I}{\partial x}(\mathbf{x},t)\\ \frac{\partial I}{\partial y}(\mathbf{x},t) \end{bmatrix}
\in\mathbb{R}^2$$

$$I_t(\mathbf{x},t) = \frac{\partial I}{\partial t}(\mathbf{x},t)\in \mathbb{R}$$
   
*Brightness Constancy Constraint* for the simplest possible continuous model

+ Two applications
    - optical flow: fix a position $\mathbf x$ and consider particles passing through
    - feature tracking: fix a particle $\mathbf{x}(t)$ and track it through space 
**Sketch for a solution** 
Write 
$$\mathbf{x}=\Pi\mathbf{X},$$
where $\Pi$ is the projection matrix, including the camera parameters.

## Solving for $\textbf{u}$
After the transformation, we have
$$\mathbf{x}+\Delta\mathbf{x}=\Pi(R\mathbf{X}+T)=\Pi R\mathbf{X}+\Pi T$$

$$\nabla I^\mathrm{T}\mathbf{u} + I_t = 0$$
Inserting for $\mathbf{x}$ on the left hand side, we have
$$\Pi\mathbf{X}+\Delta\mathbf{x}=\Pi R\mathbf{X}+\Pi T$$

+ There are infinitely many solutions, due to the *aperture problem*
+ We can solve for the component in the direction of the gradient though
To satisfy this for every $\mathbf{X}$, we require that
$$\Pi = \Pi R$$
and
$$\Delta\mathbf{x}=\Pi T$$

$$\frac{\nabla I^\mathrm{T}\mathbf{u}}{||\nabla I||} = - \frac{I_t}{||\nabla I||} $$
Recall that $\Pi$ is given as
$$\Pi=\frac1Z\cdot
\begin{bmatrix}
f & 0 & 0 \\
0 & f & 0 \\
0 & 0 & 1 
\end{bmatrix}
$$
Note that the projection matrix depends on the $Z$-coordinate of the
3D point.

+ Left hand side is the scalar projection of $\mathbf u$ onto $\nabla I$.
+ Multiplying by $\nabla I/||\nabla I||$, we get the vector projection:
Since $\Pi=\Pi R$, we must have $R=I$, i.e. there is no rotation.

$$\mathbf u_n = \frac{\nabla I^\mathrm{T}\mathbf{u}}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} = - \frac{I_t}{||\nabla I||}\cdot\frac{\nabla I}{||\nabla I||} $$
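
The component of the motion along the gradient, $-I_t\nabla I/||\nabla I||^2$, can be computed per pixel with a minimal sketch. It assumes a unit time step, a forward difference for $I_t$, and `np.gradient` for the spatial derivatives. The function name `normal_flow` and the threshold `eps` are hypothetical choices for this illustration.

```python
import numpy as np

def normal_flow(I_prev, I_next, eps=1e-12):
    """Per-pixel component of the motion along the image gradient.
    Assumes a unit time step between the frames (an assumption made
    here).  Pixels with (almost) zero gradient get zero flow."""
    I_prev = np.asarray(I_prev, dtype=float)
    I_next = np.asarray(I_next, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    I_y, I_x = np.gradient(I_prev)
    I_t = I_next - I_prev                # forward difference in time
    grad_sq = I_x ** 2 + I_y ** 2        # squared gradient magnitude
    valid = grad_sq > eps
    scale = np.zeros_like(grad_sq)
    scale[valid] = -I_t[valid] / grad_sq[valid]
    return scale * I_x, scale * I_y

# A horizontal brightness ramp I = x, moved by u = (2, 5) between frames.
ys, xs = np.mgrid[0:8, 0:8]
I_prev = xs.astype(float)
I_next = xs - 2.0                        # I_next(x, y) = I_prev(x - 2, y - 5)
u_x, u_y = normal_flow(I_prev, I_next)
print(u_x[0, 0], u_y[0, 0])
```

In this example the gradient points along the $x$-axis, so only the horizontal part of the motion is recovered. The vertical motion of 5 pixels leaves no trace in the images.
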
Let us write $\Delta\mathbf{x} = (\Delta x, \Delta y,0)$
and $T^T=(x_t,y_t,z_t)$, to get
$$\Delta x = \frac{fx_t}{Z}\quad \Delta y = \frac{fy_t}{Z}$$
Furthermore, 
$$0=\frac{fz_t}{Z},$$
and hence $z_t=0$.

We conclude that the transformation $(R,T)$ has to be a translation 
parallel to the image plane.
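
The conclusion can be checked numerically with a minimal sketch, assuming the projection $\Pi$ given above. The focal length, the 3D point and the translation are arbitrary values chosen for this illustration, and `project` is a hypothetical helper.

```python
import numpy as np

def project(X, f):
    """Pinhole projection x = Pi X with Pi = (1/Z) diag(f, f, 1)."""
    X = np.asarray(X, dtype=float)
    return np.array([f * X[0], f * X[1], X[2]]) / X[2]

f = 2.0                                  # illustrative focal length
X = np.array([1.0, -0.5, 4.0])           # illustrative 3D point (X, Y, Z)
T = np.array([0.3, 0.2, 0.0])            # translation with z_t = 0, and R = I

delta = project(X + T, f) - project(X, f)  # observed image motion
expected = f * T / X[2]                    # (f x_t / Z, f y_t / Z, 0)
print(delta, expected)
```

The image displacement depends on the depth $Z$, so points at different depths move by different amounts under the same translation.
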