Relative Pose

Reading: Ma 2004, Chapter 5

Main Features

  1. Triangulation allows us to calculate the depth of a point in 3D using the relative pose, i.e. the co-ordinate transform \((R,T)\) between the two camera frames.
  2. The Essential Matrix describes the relationship between the two camera frames.
  3. In the next session we will
    1. Use the eight point algorithm to find the essential matrix
    2. Decompose the Essential Matrix to get the co-ordinate transformation \((R,T)\) between the camera frames.

The Epipolar Constraint

Epipolar constraint due to Hamidur Rahman
Epipolar constraint due to Arne Nordmann (norro) - CC BY-SA 3.0
  • Two cameras, two co-ordinate frames
  • Relative Pose: transformation \((R,T)\) from Camera Frame 1 to Camera Frame 2.
  • Consider a point \(p\) in 3D. It has
    • co-ordinates \(\mathbf{X}_i\) in Camera Frame \(i\)
  • The projection of \(\mathbf{X}_i\) in homogeneous co-ordinates is called \(\mathbf{x}_i\)
    • \(\mathbf{X}_i = \lambda_i\mathbf{x}_i\)
    • Note that \(\mathbf{x}_i\) is a vector of 3D co-ordinates when the image plane is normalised to \(z=1\)
  • Combining this with the relation \(\mathbf{X}_2=R\mathbf{X}_1+T\), we get \[\lambda_2\mathbf{x}_2 = R\lambda_1\mathbf{x}_1 + T\]
  • The epipolar plane \(P\) is spanned by \(T\) and \(p\)
  • Multiply by \(\hat T\) from the left to get a vector orthogonal to \(P\) \[\lambda_2\hat T\mathbf{x}_2 = \hat TR\lambda_1\mathbf{x}_1 + \hat TT\]
  • The last term is zero because \(T\times T=0\) \[\lambda_2\hat T\mathbf{x}_2 = \hat TR\lambda_1\mathbf{x}_1\]
  • Now \(\mathbf{x}_2\) is perpendicular to \(\hat T\mathbf{x}_2=T\times\mathbf{x}_2\), so multiplying by \(\mathbf{x}_2^T\) from the left gives \[0=\mathbf{x}_2^T\lambda_2\hat T\mathbf{x}_2 = \mathbf{x}_2^T\hat TR\lambda_1\mathbf{x}_1\]
  • Since \(\lambda_1\) is a scalar, we can simplify \[0= \mathbf{x}_2^T\hat TR\mathbf{x}_1\] This is the epipolar constraint
  • \(E=\hat TR\) is called the essential matrix
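The derivation above can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming a made-up relative pose and 3D point; the helper names `hat` and `rot_z` are illustrative and not taken from the textbook.

```python
import numpy as np

def hat(t):
    """Skew-symmetric matrix with hat(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def rot_z(theta):
    """Rotation by theta radians around the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical relative pose and 3D point (assumed values for illustration).
R = rot_z(0.3)
T = np.array([1.0, 0.2, 0.1])
X1 = np.array([0.5, -0.4, 4.0])   # co-ordinates in Camera Frame 1
X2 = R @ X1 + T                   # the same point in Camera Frame 2

# Projections onto the normalised image planes z = 1, so X_i = lambda_i x_i.
x1 = X1 / X1[2]
x2 = X2 / X2[2]

E = hat(T) @ R                    # essential matrix
residual = x2 @ E @ x1            # epipolar constraint, zero up to rounding
print(residual)
```

The depths \(\lambda_1,\lambda_2\) drop out of the constraint, which is why only the image points and \(E\) appear in the last line.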

Epipolar Entities

  • For each point \(p\), we have
    • an epipolar plane \(\langle o_1,o_2,p\rangle\)
    • an epipolar line \(\ell_i\), the intersection of the epipolar plane and image plane \(i\)
  • The epipole \(\mathbf{e}_i\) is the projection of the other camera's origin onto image plane \(i\).
  • Note that the epipoles are on the line \(\langle o_1,o_2\rangle\), and hence in the epipolar plane

A plane in 3D has two descriptions, as a span or as a null space. In particular, the epipolar plane is

  1. \(P = \langle x_1,T\rangle = \{ ax_1+bT | a,b\in\mathbb{R}\}\)
    • note that this does not depend on any co-ordinate frame
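  2. Null space of \((Ex_1)^T\), where \(Ex_1=T\times Rx_1\), i.e. \(P = \{ x\in\mathbb{R}^3 | x^TEx_1 = 0 \}\)
    • Note that \(R x_1\) is the direction of the ray from \(o_1\) through the image point \(x_1\), expressed in Camera Frame 2. The vector from \(o_2\) to that image point is \(Rx_1+T\), which gives the same normal because \(T\times T=0\).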

When the textbook defines the epipolar line as the null space of \(\hat TR\mathrm{x}_1\), this is a simplification. The null space is the epipolar plane, and the epipolar line is only those points that also fall in the image plane. Two equations are needed to define the epipolar line in 3D:

  1. \(x^TEx_1=x^T(T\times Rx_1)=0\)
  2. \(x^T[0,0,1]^T=1\) (i.e. \(z=1\))

When we say that \(\ell_2\sim Ex_1\), it is an abuse of terminology. Strictly, \(Ex_1\) is the co-image of \(\ell_2\). The actual line (set of points) consists of those image points that are orthogonal to \(Ex_1\). This is not a problem as long as we are aware of the abuse.

Note that we work with 3D vectors all the way. We have not defined an origin in the image plane, and trying to see why \(x_2\) and \(Ex_1\) should be orthogonal as 2D vectors is not helpful.
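To make the distinction between the epipolar plane and the epipolar line concrete, here is a small sketch assuming the same made-up pose as a running example. The names `plane_point` and `to_image` are hypothetical helpers. Any combination of \(Rx_1\) and \(T\) is orthogonal to \(Ex_1\), but only after scaling to \(z=1\) does it become a point on the line.

```python
import numpy as np

def hat(t):
    """Skew-symmetric matrix with hat(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Assumed relative pose and image point in image 1 (illustrative values).
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
T = np.array([1.0, 0.2, 0.1])
x1 = np.array([0.125, -0.1, 1.0])

E = hat(T) @ R
normal = E @ x1    # normal of the epipolar plane, co-image of the line l2

def plane_point(a, b):
    """Point a*R x1 + b*T of the epipolar plane, in Camera Frame 2."""
    return a * (R @ x1) + b * T

def to_image(x):
    """Scale a vector with x[2] != 0 onto the image plane z = 1."""
    return x / x[2]

p = plane_point(2.0, -0.7)   # in the plane, but generally not in the image plane
q = to_image(p)              # satisfies both equations, so it is on the line

print(normal @ p, normal @ q, q[2])
```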

Some properties

Proposition 5.3(1)

\[\mathbf{e}_2^TE = E\mathbf{e}_1=0\]

  • This is because
    1. \(\mathbf{e}_2\sim T\) and \(\mathbf{e}_1\sim R^TT\)
    2. \(E=\hat TR\)
    3. \(\hat TT = T\times T=0\), and hence also \(T^T\hat T=-(\hat TT)^T=0\)

Proposition 5.3(2)

Proposition 5.3(3)

  • Both the image point and the epipole lie on the epipolar line
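These properties are easy to verify numerically. The sketch below assumes a made-up pose and point, and reads the epipoles off as the null vectors of \(E\) via `numpy.linalg.svd`. The line in image 1 is taken as \(E^Tx_2\), which follows from transposing the epipolar constraint.

```python
import numpy as np

def hat(t):
    """Skew-symmetric matrix with hat(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Assumed relative pose (illustrative values).
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
T = np.array([1.0, 0.2, 0.1])
E = hat(T) @ R

# Singular values come sorted in descending order, so index 2 belongs to
# the zero singular value of E.
U, sv, Vt = np.linalg.svd(E)
e2 = U[:, 2]     # left null vector:  e2^T E = 0
e1 = Vt[2]       # right null vector: E e1 = 0

# A corresponding image pair for a made-up 3D point.
X1 = np.array([0.5, -0.4, 4.0])
X2 = R @ X1 + T
x1, x2 = X1 / X1[2], X2 / X2[2]

l2 = E @ x1      # co-image of the epipolar line in image 2
l1 = E.T @ x2    # co-image of the epipolar line in image 1

print(np.allclose(e2 @ E, 0), np.allclose(E @ e1, 0))
print(np.allclose([l1 @ x1, l1 @ e1, l2 @ x2, l2 @ e2], 0))
```

The epipoles are only defined up to scale, so `e2` is parallel to `T` rather than equal to it.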

Decomposition of the Essential Matrix

  • Theorem 5.5

\[E = U\mathsf{diag}\{\sigma,\sigma,0\}V^T,\] where \(U,V\in\mathsf{SO}(3)\)

  • Tricky proof. Do not spend too much time on this.

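\[ \begin{cases} (\hat T_1,R_1) &= (UR_Z(+\frac\pi2)\Sigma U^T, UR_Z^T(+\frac\pi2)V^T) \\ (\hat T_2,R_2) &= (UR_Z(-\frac\pi2)\Sigma U^T, UR_Z^T(-\frac\pi2)V^T) \end{cases} \] where \(\Sigma=\mathsf{diag}\{\sigma,\sigma,0\}\) and \[R_Z(+\frac\pi2) = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \] is a rotation by \(\pi/2\) radians around the \(z\)-axis. The transpose in \(R_i\) matters: \(R_Z\) commutes with \(\Sigma\), so \(\hat T_iR_i = UR_Z\Sigma R_Z^TV^T = U\Sigma V^T = E\).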

  • Note that there are two solutions from one \(U\Sigma V^T\) decomposition.
  • Are there more solutions?
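The two solutions can be computed directly from an SVD. This is a minimal sketch, not a reference implementation: the function names are hypothetical, the rotation factor uses the transpose \(R_Z^T(\pm\frac\pi2)\) so that \(\hat TR\) reproduces \(U\Sigma V^T\), and the sign flips that force \(U,V\) into \(\mathsf{SO}(3)\) are an assumption about how to post-process the output of `numpy.linalg.svd`.

```python
import numpy as np

def hat(t):
    """Skew-symmetric matrix with hat(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def vee(S):
    """Inverse of hat for a skew-symmetric matrix S."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

def rot_z(theta):
    """Rotation by theta radians around the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def decompose_essential(E):
    """Return the two poses (T, R) obtained from one SVD of E."""
    U, sv, Vt = np.linalg.svd(E)
    # Flipping the last singular vector only touches the zero singular
    # value, so E = U Sigma V^T is unchanged while det becomes +1.
    if np.linalg.det(U) < 0:
        U[:, 2] *= -1
    if np.linalg.det(Vt) < 0:
        Vt[2, :] *= -1
    sigma = (sv[0] + sv[1]) / 2          # the two values agree for exact E
    Sigma = np.diag([sigma, sigma, 0.0])
    poses = []
    for sign in (+1, -1):
        Rz = rot_z(sign * np.pi / 2)
        T_hat = U @ Rz @ Sigma @ U.T
        R = U @ Rz.T @ Vt
        poses.append((vee(T_hat), R))
    return poses

# Usage with a made-up pose: both candidates reproduce E.
R_true = rot_z(0.3)
T_true = np.array([1.0, 0.2, 0.1])
E = hat(T_true) @ R_true
poses = decompose_essential(E)
for t, Rk in poses:
    print(t, np.allclose(hat(t) @ Rk, E))
```

Applying the same function to \(-E\) gives two further candidates. Since \(E\) is only known up to sign when it is estimated from the epipolar constraint, this is one way to approach the question of whether there are more solutions.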

Two Relative Poses

There exist exactly two relative poses \((R,T)\) with \(R\in\mathsf{SO}(3)\) and \(T\in\mathbb{R}^3\) corresponding to a nonzero essential matrix \(E\in\mathcal{E}\) (Theorem 5.7).

Repetition

  • Rotation by angle \(\theta\) around the vector \(\omega\) is given by \(R=e^{\hat\omega\theta}\) assuming \(\omega\) has unit length.

Rodrigues’ formula (2.16)

\[e^{\hat\omega} = I + \frac{\hat\omega}{||\omega||}\sin(||\omega||) + \frac{\hat\omega^2}{||\omega||^2}(1-\cos(||\omega||))\]

See Angular Motion for a more comprehensive summary.
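Rodrigues' formula translates almost literally into code. The sketch below assumes the rotation is given as a single vector \(\omega\) whose norm is the angle; the function name `rodrigues` and the small-angle cut-off are illustrative choices.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix with hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rodrigues(w):
    """Rotation matrix exp(hat(w)); the rotation angle is the norm of w."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        # Assumed cut-off: treat a vanishing angle as the identity.
        return np.eye(3)
    W = hat(w)
    return (np.eye(3)
            + W / theta * np.sin(theta)
            + W @ W / theta**2 * (1.0 - np.cos(theta)))

# A quarter turn around the z-axis.
Rz = rodrigues(np.array([0.0, 0.0, np.pi / 2]))
print(np.round(Rz, 12))
```

For a general axis, the result should satisfy `R @ w == w` and `R.T @ R == I` up to rounding, which is a convenient sanity check.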

Theorem 5.7

  • Demo read the proof (debrief?)

Lemma 5.6

If \(\hat T\neq0\) and \(\hat TR\) are both skew-symmetric for \(R\in\mathrm{SO}(3)\), then either \(R=I\) or \(R\) is a rotation by angle \(\pi\) around \(T\).

  • Demo read the proof (debrief?)

  • Skew-symmetry gives \((\hat TR)^T=-\hat TR\)
  • We also have \((\hat TR)^T=R^T\hat T^T=-R^T\hat T\)
  • Hence \(\hat TR = R^T\hat T\),
  • and since \(R^T=R^{-1}\), we have \[R\hat TR=\hat T\]
  • Write \(R=e^{\hat\omega\theta}\) for some \(\omega\) of unit length and some \(\theta\), to get \[e^{\hat\omega\theta}\hat Te^{\hat\omega\theta}=\hat T\]
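  • Multiply by \(\omega\) from the right \[e^{\hat\omega\theta}\hat Te^{\hat\omega\theta}\omega=\hat T\omega\] Since \(e^{\hat\omega\theta}\omega=\omega\), the left-hand side reduces to \(e^{\hat\omega\theta}\hat T\omega\), so the vector \(\hat T\omega\) is left unchanged by the rotation.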

Note that \(\omega\) is stationary under rotation by \(R\), and hence it is an eigenvector associated with eigenvalue 1. Unless \(R=I\), it is the only such eigenvector up to scale. Since \(\hat T\omega=T\times\omega\) is orthogonal to \(\omega\), it can only be a multiple of \(\omega\) if it is zero. Hence \(\hat T\omega=T\times\omega=0\). This is only possible if \(T\sim\omega\), and since \(\omega\) has unit length, we get \[\omega = \pm\frac{T}{||T||}\]

We now know that \(R\) has to be a rotation around \(T\), and therefore \(R\) and \(\hat T\) commute. This can be checked with Rodrigues’ formula (Theorem 2.9), which writes \(R\) as a polynomial in \(\hat\omega\sim\hat T\).

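Hence \(R^2\hat T = \hat T\). This looks like two half-turn rotations that bring us back to the start. If \(\hat T\) had been a vector or a matrix of full rank, we would have been done. However, the skew-symmetric \(\hat T\) has rank two, so a little more work is needed. The columns of \(\hat T\) span the plane orthogonal to \(T\), so \(R^2\), a rotation by \(2\theta\) around \(T\), fixes every vector in that plane. This forces \(2\theta\) to be a multiple of \(2\pi\), i.e. \(\theta=0\) or \(\theta=\pi\).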
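The statement of Lemma 5.6 can be illustrated numerically. This sketch assumes a made-up \(T\) and builds the rotation by \(\pi\) around a unit vector \(u\) as \(2uu^T-I\), which is what Rodrigues' formula gives for \(\theta=\pi\). The helper `is_skew` is hypothetical.

```python
import numpy as np

def hat(t):
    """Skew-symmetric matrix with hat(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def half_turn(axis):
    """Rotation by pi around the given axis: 2 u u^T - I with u of unit length."""
    u = axis / np.linalg.norm(axis)
    return 2.0 * np.outer(u, u) - np.eye(3)

def is_skew(M):
    """True if M^T == -M up to rounding."""
    return np.allclose(M.T, -M)

T = np.array([1.0, 0.2, 0.1])          # assumed translation
R = half_turn(T)                       # rotation by pi around T

print(is_skew(hat(T) @ R))             # hat(T) R is again skew-symmetric
print(np.allclose(hat(T) @ R, -hat(T)))
print(np.allclose(R @ hat(T), hat(T) @ R))   # R and hat(T) commute

# A half turn around an axis not parallel to T breaks the skew-symmetry.
R_other = half_turn(np.array([0.0, 0.0, 1.0]))
print(is_skew(hat(T) @ R_other))
```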

Triangulation

Question How do we find \(\lambda\) when \((R,T)\) is known?

  1. We can find the angle between image point \(x\) and epipole \(e\) \[\cos\theta = \frac{x\cdot e}{||x||\cdot||e||}\]
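  2. We can reconstruct \(\lambda\) using the sine law in the triangle \(\langle o_1,o_2,p\rangle\) \[ \frac{\sin\theta_{\mathrm{II}}}{\lambda_2}= \frac{\sin\theta_{\mathrm{I}}}{\lambda_1} = \frac{\sin\theta_{0}}{||T||}\] Here \(\theta_{\mathrm{I}}\) and \(\theta_{\mathrm{II}}\) are the angles opposite the sides \(o_1p\) and \(o_2p\), \(\theta_0\) is the angle at \(p\), and \(\lambda_i\) stands for the side length \(||\mathbf{X}_i||=\lambda_i||\mathbf{x}_i||\), which equals \(\lambda_i\) only when \(\mathbf{x}_i\) is scaled to unit length.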
  3. It is also possible to use the cosine law
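The sine-law approach can be sketched as follows, working in Camera Frame 2 where \(o_2=0\) and \(o_1=T\). The function names and the test pose are assumptions for illustration. The code pairs each side of the triangle \(\langle o_1,o_2,p\rangle\) with its opposite angle and divides by \(||\mathbf{x}_i||\) to turn distances into the depths \(\lambda_i\).

```python
import numpy as np

def angle(a, b):
    """Angle in radians between the vectors a and b."""
    c = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(c, -1.0, 1.0))

def triangulate_sine_law(x1, x2, R, T):
    """Depths (lambda1, lambda2) from two projections and a known pose."""
    at_o1 = angle(R @ x1, -T)        # between the ray to p and the baseline
    at_o2 = angle(x2, T)
    at_p = np.pi - at_o1 - at_o2     # angles of a triangle sum to pi
    scale = np.linalg.norm(T) / np.sin(at_p)
    dist1 = scale * np.sin(at_o2)    # |o1 p| is opposite the angle at o2
    dist2 = scale * np.sin(at_o1)    # |o2 p| is opposite the angle at o1
    return dist1 / np.linalg.norm(x1), dist2 / np.linalg.norm(x2)

# Made-up pose and point to check the reconstruction against.
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
T = np.array([1.0, 0.2, 0.1])
X1 = np.array([0.5, -0.4, 4.0])
X2 = R @ X1 + T
x1, x2 = X1 / X1[2], X2 / X2[2]

lam1, lam2 = triangulate_sine_law(x1, x2, R, T)
print(lam1, X1[2])
print(lam2, X2[2])
```

The division by \(\sin\theta_0\) becomes ill-conditioned when the two rays are nearly parallel, i.e. when the point is far away compared to the baseline \(||T||\).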