Lecture - Real World Reconstruction

Reading Ma 2004:Ch 11


  1. Feature Detection
  2. Feature Correspondence
  3. Projective Reconstruction
  4. Euclidean Reconstruction

Feature Detection

  • Harris Detector
  • Tiling
    • divide the image into, say, \(10\times10\) tiles
    • select features per tile
  • Separation
    • one feature may cause several pixels to be marked as a corner
    • separation should be larger than the window size used in detection
  • Sorting by strength
    • sort first, and then select from top of the list
    • enforce separation from previously selected features
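
The selection rules above (sorting by strength, separation, and a per-tile quota) can be sketched as a greedy loop. This is a minimal illustration, not the textbook's algorithm: the function name `select_features`, the `(x, y, strength)` tuple layout and the parameters `min_sep`, `tile_size` and `max_per_tile` are all assumptions made for the example.

```python
def select_features(candidates, min_sep, tile_size=None, max_per_tile=None):
    """Greedy feature selection (illustrative sketch).

    candidates: list of (x, y, strength) tuples, e.g. Harris responses.
    min_sep: minimum Euclidean distance between two selected features.
    tile_size, max_per_tile: optional per-tile quota (both hypothetical).
    """
    selected = []
    per_tile = {}
    # Sort first, strongest candidate at the top of the list.
    for x, y, s in sorted(candidates, key=lambda c: c[2], reverse=True):
        # Enforce separation from previously selected features.
        too_close = any(
            (x - sx) ** 2 + (y - sy) ** 2 < min_sep ** 2
            for sx, sy, _ in selected
        )
        if too_close:
            continue
        # Optionally cap the number of features taken from each tile.
        if tile_size is not None and max_per_tile is not None:
            tile = (int(x // tile_size), int(y // tile_size))
            if per_tile.get(tile, 0) >= max_per_tile:
                continue
            per_tile[tile] = per_tile.get(tile, 0) + 1
        selected.append((x, y, s))
    return selected
```

Because the list is processed in order of decreasing strength, a cluster of pixels marked by the same corner collapses to its strongest member, and the tile quota spreads the remaining features across the image.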

Feature Correspondence

  • Small Baseline (motion video)
    • feature tracking - calculate motion
  • Moderate Baseline (snapshots)
  • Wide Baseline -> use SIFT or similar methods
    • the textbook is outdated on this point

Basic Tracker

  • Recall the use of the gradient
    • Temporal derivative \(I_t\) approximated by difference \(I^2-I^1\)
  • Displacement over 2–3 pixels \(\to\) first-order differences do not suffice
  • Therefore, we use a multiscale approach
    • Successively smooth and downsample
    • Tracking at a coarser scale works for larger displacements (each coarse pixel spans several original pixels)
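
As a reminder of how the gradient is used, the first-order (brightness constancy) model can be written out as follows. The displacement symbols \(u, v\) and the window \(W\) are notation assumed here for illustration; the textbook's notation may differ.

\[ I_x u + I_y v + I_t = 0, \qquad I_t \approx I^2 - I^1 \]

Summing over a window \(W\) around the feature gives the least-squares system

\[ \begin{bmatrix} \sum_W I_x^2 & \sum_W I_x I_y \\ \sum_W I_x I_y & \sum_W I_y^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = - \begin{bmatrix} \sum_W I_x I_t \\ \sum_W I_y I_t \end{bmatrix} \]

The linearisation only holds for small \((u, v)\), which is why larger displacements call for the multiscale approach.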

Multiscale iterative feature tracking

  • Track in the coarsest scale first.
  • Shift the image according to the displacement.
  • Repeat the tracking at the next finer scale, and so on for every scale.
  • Add the displacements together, correcting for the downsampling factor.
  • Two to four scales typically suffice, but this may depend on the original resolution and frame rate
    • the textbook is old, and modern image resolutions may require more scales
  • Refinement
    • Iteration in the finest scale
    • Use a warped/interpolated version of the next frame
    • Successively improve the estimate
    • Subpixel accuracy
  • Algorithm 11.2
  • Caveat: drift, i.e. tracking errors propagate from frame to frame.
    • Compensate by feature matching
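
The step of adding the displacements together while correcting for the downsampling factor can be sketched as below. This is an illustration only and not Algorithm 11.2: the function name `accumulate_displacement` is hypothetical, and it assumes that each scale reports the residual displacement in its own pixel units after the image has been shifted by the estimate from the coarser scales.

```python
def accumulate_displacement(per_scale, factor=2.0):
    """Combine per-scale residual displacements into one estimate.

    per_scale: list of (dx, dy) residuals, coarsest scale first, each
        measured in pixels of its own scale (assumed convention).
    factor: downsampling factor between consecutive scales.
    Returns the total displacement in pixels of the finest scale.
    """
    dx, dy = 0.0, 0.0
    for rx, ry in per_scale:
        # Moving to the next finer scale stretches the estimate so far
        # by the downsampling factor; the new residual is then added.
        dx = factor * dx + rx
        dy = factor * dy + ry
    return dx, dy


# Three scales: 1 pixel at the coarsest scale counts as 4 pixels at the finest.
total = accumulate_displacement([(1.0, 0.0), (-0.5, 0.5), (0.25, -0.25)])
print(total)  # (3.25, 0.75)
```

Each residual stays small in its own scale, which is what keeps the first-order tracker valid, while the accumulated total can be several pixels at full resolution.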

Projective Reconstruction


  1. Intrinsic calibration
  2. Extrinsic calibration
  3. Non-linear calibration

Note: The calibration tutorial focused on non-linear calibration. This is separate from the rest of the system, and unrelated to all the other calibrations and transformations discussed in the module.

Projective Reconstruction (Alg 11.6)

If we have the intrinsic camera matrix, we can do a Euclidean reconstruction straight away.

If not, the known algorithms only provide a projective reconstruction.

  1. Eight-point algorithm to find \(F\)
  2. Recover \([R,T]\) from \(F\).
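
Step 1 can be sketched with NumPy as follows. This is a bare-bones linear eight-point estimate under stated assumptions, not a reproduction of Algorithm 11.6: it uses the convention \(x_2^T F x_1 = 0\), it omits the coordinate normalisation that is normally added when working with raw pixel coordinates, and the function name `eight_point` is chosen for illustration.

```python
import numpy as np


def eight_point(x1, x2):
    """Linear estimate of F from N >= 8 correspondences (sketch).

    x1, x2: (N, 2) arrays of matching points in image 1 and image 2.
    Returns a rank-2 matrix F, scaled to unit Frobenius norm, such that
    [x2, 1] @ F @ [x1, 1] is approximately zero for every pair.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = x1.shape[0]
    if n < 8:
        raise ValueError("need at least 8 correspondences")
    h1 = np.column_stack([x1, np.ones(n)])  # homogeneous coordinates
    h2 = np.column_stack([x2, np.ones(n)])
    # Row i holds the outer product of h2[i] and h1[i], flattened row-major,
    # so that A[i] @ F.ravel() equals h2[i] @ F @ h1[i].
    A = np.einsum("ni,nj->nij", h2, h1).reshape(n, 9)
    # The least-squares solution is the right singular vector belonging
    # to the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F.
    u, s, vt = np.linalg.svd(F)
    s[2] = 0.0
    F = u @ np.diag(s) @ vt
    return F / np.linalg.norm(F)
```

With noisy correspondences the estimate is usually wrapped in a robust scheme such as RANSAC; that is left out here to keep the sketch short.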

Euclidean Reconstruction

Instead of doing a complete, stratified reconstruction, it is worth using the last week of the semester to try out the OpenCV API, assuming that the cameras are available for calibration.