Eight-point algorithm (Lecture)

Eight-Point Algorithm
Optimal pose in the presence of noise
- Optimal Pose

Eight-Point Algorithm

Input: At least eight pairs of points \((\mathbf{x}_1,\mathbf{x}_2)\) (from stereo view).
Output: The essential matrix \(E\) (or rather an estimate)
Main Principle: Epipolar Constraint \(\mathbf{x}_2^TE\mathbf{x}_1 = 0\)

Outline

The epipolar constraint is a linear equation, but it is written in an unusual form, with the unknown \(E\) in the middle.
- We know \(\mathbf{x}_i\), and want to solve for \(E\).
- Note if \(E\) is a solution, then \(c\cdot E\) is a solution for any \(c\in\mathbb{R}\)
Up to nine unknowns in the \(3\times3\) matrix \(E\), but one is the indeterminable \(c\).
- Scaling up all distances by the same factor gives the same solution
Step 1. Rewrite the constraint in a more regular format.
Step 2. Formulate a system of linear equations using eight pairs of points.
- Eight pairs of points give eight equations

Kronecker product

Serialisation of a matrix: \((\cdot)^s\) maps a \(3\times 3\) matrix into a 9-dimensional vector.
- You can scan row by row or column by column; and seriously, Matlab does it one way and Python (numpy) the other.
- Numpy method: .flatten()
- You need to be consistent, and it is worth running a test to see what your system does.
Kronecker product: \(\bigotimes\)
- \(\vec{a}\bigotimes \vec{b} = (\vec{a}\vec{b}^\mathrm{T})^s\)
Rewrite the epipolar constraint: \[(\mathbf{x}_1\bigotimes\mathbf{x}_2)^TE^s = 0\]
- Note: one equation in nine unknowns,

Preparing for the eight-point algorithm

Each point of pairs gives rise to a nine-dimensional vector: \[\mathbf{a} = \mathbf{x}_1\bigotimes\mathbf{x}_2\]
- view it as a column
The epipolar constraint is \(\mathbf{a}^\mathrm{T}\cdot E^s = 0\).
This gives a matrix equation \(\chi\cdot E^s=0\) where \[\chi = [\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n]^T\]
We can solve \(\chi E^s = 0\) for \(E^s\).
- With eight points, we have unique solutions up to a scalar factor.
With additional points, we can minimise the squared errors
- \(\min_E ||\chi E^s||^2\)
- i.e. eigenvector of \(\chi^T\chi\) of smallest eigenvalue
- (this is a general approach to solve linear systems of equations with noise)
The scale ambiguity is intrinsic.
- The distance \(||T||\) cannot be determined from the images

Projection onto the essential space

The solution is not necessarily a valid essential matrix, but we can project onto the space of such matrices and correct the sign to get positive determinant.

Write \(F\) for the solution of \(\chi E^s=0\). This is called the fundamental matrix
Write \(F = U\Sigma_FV^T\) for the
singular value decomposition
- \(\Sigma_F\) is a diagonal matrix
- \(U\) and \(V\) are orthogonal matrices (assuming they are real)
- canonical decomposition with the entries on \(\Sigma_F\) in descending order
- use an API to compute
Use \(E=U\Sigma V^T\) for the essential matrix, where
- \(\Sigma=\mathsf{diag}(1,1,0)\)

Recover the transform from the essential matrix

\[ \begin{cases} (\hat T_1,R_1) &= (UR_Z(+\frac\pi2)\Sigma U^T, UR_Z(+\frac\pi2)V^T) \\ (\hat T_2,R_2) &= (UR_Z(-\frac\pi2)\Sigma U^T, UR_Z(-\frac\pi2)V^T) \end{cases} \] where \[R_Z(+\frac\pi2) = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \] is a rotation by \(\pi/2\) radians around the \(z\)-axis.

Four solutions
- Two solutions \(\pm F\) to the equation \(\chi E^s=0\)
- Two poses from each candidate matrix
- Only one of the four solutions is in front of both cammeras

Notes

Number of points.

\((R,T)\) only has five degrees of freedom when \(T\) is determined up to a scalar factor.
thus five points suffices in theory
\(\mathsf{det}(E)=0\) removes one degree of freedom
linear algorithm exists for six points

Requirements

General position We have to assume linear independence in the equation \(\chi E^s=0\)

Sufficient parallax If \(T=0\), then \(E=0\), hence the algorithm requires \[T\neq0\] The algorithm may work even if \(T=0\), due to noise, but the resulting estimate for \(T\) (direction) would be meaningless.

Variants

Infinitesimal viewpoint change, see Section 5.4
Multiple motion, see Chapter 7

Scale ambiguity (Ma (2004) Section 5.2.2)

Depth scales \(\lambda_1,\lambda_2\) (for each object point)
Motion scale \(\gamma=||T||\)

\[\lambda_2^j\mathbf{x}_2^j = \lambda_1^jR\mathbf{x}_2^j + \gamma T\]

Recall that the scale ambiguity is intrinsic
\(\gamma\) cannot be determined from the images

Optimal pose in the presence of noise

Reading Ma (2004) Section 5.2.3

Noise is everywhere - Two problems:
- error in \((R,T)\)
- no consistent 3D reconstruction from noisy image points
The epipolar constraint assumes ideal co-ordinates
- but applied to noisy measurements
Two approaches
1. Maximum likelihood
2. Minimal sum of squared errors
We only discuss option 2.

Optimal Pose

Consider an observed point \(\tilde{\textbf{x}}^j=(\tilde x^j_1, \tilde x^j_2, \tilde x^j_3)\) and an unknown true point \(\textbf{x}^j=( x^j_1, x^j_2, x^j_3)\). The corresponding 3D point is \(\textbf{X}^j\).

Observations \(\tilde x_i^j\)
Independent variables \(x_i^j,R,T\)
Sum of squared errors \[\phi(\mathrm{X}^j,R,T) = \sum_j ||w_1^j||^2 + ||w_2^j||^2 = \sum_j|| \tilde x_1^j-x_1^j||^2 + ||\tilde x_2^j-x_2^j||^2 = \sum_j|| \tilde x_1^j-\pi_1(\textbf{X}^j)||^2 + ||\tilde x_2^j-\pi_2(\textbf{X}^j)||^2\]

Task \[\min_{\textbf{X}^j}\phi(\mathrm{X}^j,R,T) = \sum_j\sum_{i\in\{1,2\}} ||w_i^j||^2\] under the conditions that \[ x_2^j\hat TR x_1^j = 0\] and \[x_1^{jT}e_3 = 1\quad\text{for } i=1,2, j=1,2,\ldots,n\]

Lagrangian optimisation

Goal find image point pairs which minimise the reprojection error for that particular object point. (Structure Triangulation)