
---
title: Synthetic Experiment
categories: session
---

# Briefing

Sorry for the change of programme.
I found, at the last minute, that doing the eight-point algorithm with
uncalibrated views would generate too many problems.
We should therefore do a little more with the calibrated algorithm.


## Reflection

(This is the same exercise as given yesterday)

Consider the material covered in the sessions on 21-22 October.

Do *you* know,

1. how to calculate 3D co-ordinates of an object point $p$,
   given image points $x_1$ and $x_2$ in two separate camera frames?
   What do you require to do so?
2. how to recover the relative pose of two cameras, given matching image
   points in the two frames?
   How many such image points do you require?
3. If the answer is no to either 1 or 2,
   - what knowledge do you have to build on?
   - what do you need to find out?

# Exercise

Today's exercise is a repeat of the [Eight-point algorithm](),
but now working off a synthetic data set.

Note.  In this exercise, you can choose how much you want to
compute by hand and how much you want to implement in Python.

Note that Steps 1-3 set up a synthetic data set and are not
part of the solution to the reconstruction problem.
Instead they illustrate the model for image capture.
Steps 4-7 solve the reconstruction problem.
Because you have a synthetic data set, Steps 1-3 actually provide
the ground truth against which you can compare the solutions
found in Steps 6-7.


## Step 1.

Select eight points in general position in 3D.  For instance
$$(-20,0,25),
(20,0,25),
(0,20,25),
(-10,10,50),
(10,10,50),
(0,-10,50),
(0,0,60),
(-7,7,70)
$$
or generate some at random if you prefer.

**Note**  
The exercise has been updated to
move the points further from the camera (increasing their
$z$-coordinates).
It seems that at least one point in the first version fell behind
the image plane in the rotated camera frame.
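
If you work in Python, this step is just a matter of entering the
points above as an array.  A minimal sketch with numpy (the array name
`X_world` is my own choice):

```python
import numpy as np

# The eight object points from Step 1, one per row.
X_world = np.array([
    [-20,   0, 25],
    [ 20,   0, 25],
    [  0,  20, 25],
    [-10,  10, 50],
    [ 10,  10, 50],
    [  0, -10, 50],
    [  0,   0, 60],
    [ -7,   7, 70],
], dtype=float)
```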

## Step 2.

Consider a camera posed at the origin of the world frame,
so that the selected points have co-ordinates given directly in the camera frame.
Assume the image plane is $z=1$.

+ Find the image point for each world point, using an ideal camera projection.
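
A sketch of this step, assuming the `X_world` array from Step 1: the
ideal projection simply divides each point by its $z$-coordinate.

```python
# Ideal pinhole projection onto the plane z = 1: divide each point by
# its z-coordinate, giving homogeneous image points (x, y, 1).
x1 = X_world / X_world[:, 2:3]    # shape (8, 3)
```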

## Step 3.

Consider a second camera which is translated ten units along
the $x$-axis and rotated $30^\circ$ around the $y$-axis.
Again the image plane is $z=1$.

+ Find the transformation (relative pose) $(R,T)$.
+ Find (3D) co-ordinates in the second camera frame for each world point.
+ Find the image points in the second camera frame for each world point.
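
One possible sketch in numpy.  Note that the signs in the rotation and
in $(R,T)$ depend on your convention for the pose, so check the result
against your own derivation:

```python
theta = np.deg2rad(30)
c, s = np.cos(theta), np.sin(theta)

# Pose of the second camera in the world frame: a rotation about the
# y-axis, with the camera centre ten units along the x-axis.
R_cam = np.array([[ c, 0, s],
                  [ 0, 1, 0],
                  [-s, 0, c]])
t_cam = np.array([10.0, 0.0, 0.0])

# Relative pose (R, T) taking first-camera coordinates to
# second-camera coordinates, X2 = R @ X1 + T.
R = R_cam.T
T = -R_cam.T @ t_cam

X2 = X_world @ R.T + T        # each row is R @ X1 + T
x2 = X2 / X2[:, 2:3]          # image points in the second frame
```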

## Step 4.

Imagine now that we do not know the world points.
We are going to try to reconstruct them using the 
[Eight-point algorithm]().

1.  Construct the matrix $\chi$ with the Kronecker products of
    the eight image pairs.
2.  Compute the singular value decomposition
    $\chi=U_\chi\Sigma_\chi V_\chi^T$
    (probably using `numpy.linalg.svd`).
    - Inspect the three components.  What do you see?
    - The middle component should be a diagonal matrix containing the
      singular values.  How are they ordered?
3.  According to Ma (2004:121) the serialised essential matrix $E^s$
    is the ninth column of $V_\chi$,
    - i.e. the right singular vector corresponding to the smallest
      singular value.
4.  Unserialise $E^s$ to get the preliminary matrix $E$.
    - You can do this with the numpy `reshape` function, but test it.
      You may have to transpose the resulting matrix.
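
A sketch of these four steps, assuming the image-point arrays `x1` and
`x2` from Steps 2-3 and Ma's column-wise stacking convention for $E^s$:

```python
# Each row of chi is the Kronecker product of a matching image pair,
# so that the epipolar constraints read chi @ Es = 0.
chi = np.stack([np.kron(a, b) for a, b in zip(x1, x2)])   # shape (8, 9)

U_chi, S_chi, Vt_chi = np.linalg.svd(chi)

# numpy returns V^T, with singular values in decreasing order, so the
# ninth column of V_chi is the last row of Vt_chi.
Es = Vt_chi[-1]

# numpy's reshape is row-major while Es is stacked column by column,
# hence the transpose.
E_prelim = Es.reshape(3, 3).T
```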
  
## Step 5.  Projection onto the Essential Space

If $E=U\Sigma V^T$ is our preliminary matrix as found above,
we approximate the essential
matrix by replacing $\Sigma$ with $\mathsf{diag}(1,1,0)$
(cf. Ma (2004:121)).
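
A sketch of the projection, continuing from the `E_prelim` found in
Step 4:

```python
U, S, Vt = np.linalg.svd(E_prelim)

# Replace the singular values by (1, 1, 0) to project onto the
# (normalised) essential space, cf. Ma (2004:121).
E = U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```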

## Step 6.  Rotation and Translation

Recover the translation and rotation from the essential matrix,
relying on Ma (2004:121).

+ What do the two solutions represent?
+ Do they match the actual transformation?

**Remember** there are four solutions for $(R,T)$, three of
which place object points behind one or both cameras.
Remember to check both $E$ and $-E$, in addition to the two
solutions for $(R,T)$ from one choice of $E$.
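
A sketch of enumerating all four candidates, using the $U$ and $V^T$
from Step 5 and Ma's formulas $R = U R_Z(\pm\pi/2)^T V^T$ and
$\hat T = U R_Z(\pm\pi/2)\Sigma U^T$ (the helper `rz` is my own):

```python
def rz(angle):
    """Rotation by `angle` about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]])

Sigma = np.diag([1.0, 1.0, 0.0])
candidates = []
for sign in (+1, -1):                 # the two twisted-pair rotations
    Rz = rz(sign * np.pi / 2)
    R_hat = U @ Rz.T @ Vt
    T_hat = U @ Rz @ Sigma @ U.T      # skew-symmetric matrix of T
    T_rec = np.array([T_hat[2, 1], T_hat[0, 2], T_hat[1, 0]])
    for t_sign in (+1, -1):           # -E flips the sign of T
        candidates.append((R_hat, t_sign * T_rec))
```

Only one of the four candidates places all the reconstructed points in
front of both cameras.  The recovered $T$ has unit norm, so rescale it
by $||T||=10$ before comparing with Step 3.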

**Note**
If the transformation you find seems off, typically with errors
of sign, there are at least two plausible explanations.

1.  You have not checked all four solutions (see above).
2.  Some object point from Step 1 actually ends up behind
    the image plane in the second camera frame.  If this happens,
    you should choose different object points.

## Step 7.  Reconstruction in 3D

Consider one pair of image points together with the transformation 
$(R,T)$.

+ Calculate the $z$-coordinate $\lambda$
  of the 3D point as we did in [Relative Pose]().
  You know that the distance between the camera origins is $||T||=10$.
+ Invert the projection to recover the complete coordinates of the
  3D point.
+ Does the recovered point match the original point?
+ If you have time, repeat with a second and possibly a third point.
  You probably do not want to do all eight, though.
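
A sketch for one point pair in numpy, where `candidates` is the list
from Step 6; picking index 0 below is only a placeholder for whichever
candidate passed the depth test:

```python
R_best, T_best = candidates[0]      # placeholder: use the valid candidate
T_best = 10 * T_best / np.linalg.norm(T_best)   # restore the true scale

# The depths lam1, lam2 satisfy lam2 * x2 = lam1 * R @ x1 + T,
# i.e. the linear system [R @ x1, -x2] @ (lam1, lam2) = -T.
A = np.column_stack([R_best @ x1[0], -x2[0]])
lam, *_ = np.linalg.lstsq(A, -T_best, rcond=None)

# Invert the projection to recover the 3D point in the first frame.
X_rec = lam[0] * x1[0]
print(X_rec, X_world[0])            # should agree up to numerical error
```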

# Debrief