Project Tracker
Briefing Multiscale Detection
Exercise
- Implement a prototype able to track one simple object in a video scene.
- Your program should be able to tell (print out) the position of the object; i.e. it is not sufficient to visualise detected features as many tutorials do. Imagine you want to use the information to control a robot to pick up the object, for instance.
- You have full freedom to do this as you please, but bear in mind that we only ask for a prototype.
- The goal is to understand a handful of constituent techniques, not to make a full solution for production.
- It is a good idea to show your prototype in the exam, but we will be asking you how it works and/or how you would make it work in practice. Again the goal is to understand it, not to make it as such.
- There is a session of self study to give you time to complete the project.
- There is a separate page with some Project Tips to consult when you need it.
Sample approach
Step 1. Dataset.
- You can set up your own scene, recording your own video, with a characteristic, brightly coloured object moving through the scene. E.g. a bright, red ball rolling on the floor.
- Start with two consecutive frames from the video.
Step 2. Feature Detector.
- Start with the feature detector. Make sure that it works. You may use a library function.
- Can you use the feature detector to detect your particular object in a still image?
- Visualise the detected object, by drawing a frame around it in the image.
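One possible way to approach Step 2 is sketched below, using only numpy so that every step is visible. It is a minimal sketch, not a reference solution: it assumes a greyscale image in which the object is the only thing producing strong corners, and the helper names (`box_sum`, `harris_response`, `locate_object`, `draw_frame`), the window radius, `k` and the threshold are all illustrative choices. A library Harris detector can replace `harris_response`.

```python
import numpy as np

def box_sum(a, radius):
    """Sum a over a (2*radius+1)^2 window around each pixel (zero padded)."""
    padded = np.pad(a, radius)
    h, w = a.shape
    out = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(gray, radius=2, k=0.05):
    """Harris response R = det(M) - k * trace(M)^2 for every pixel."""
    iy, ix = np.gradient(gray.astype(float))  # central differences
    sxx = box_sum(ix * ix, radius)
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

def locate_object(gray, rel_threshold=0.1):
    """Return (centre_x, centre_y, box) spanned by the strong corners, or None."""
    response = harris_response(gray)
    if response.max() <= 0:
        return None
    ys, xs = np.nonzero(response > rel_threshold * response.max())
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2, box

def draw_frame(gray, box, value=128):
    """Return a copy of the image with a one-pixel rectangle drawn on the box."""
    x0, y0, x1, y1 = box
    out = gray.copy()
    out[y0, x0:x1 + 1] = value
    out[y1, x0:x1 + 1] = value
    out[y0:y1 + 1, x0] = value
    out[y0:y1 + 1, x1] = value
    return out

# Synthetic still image: a bright rectangle on a dark background.
scene = np.zeros((60, 80), dtype=np.uint8)
scene[20:30, 40:52] = 255

cx, cy, box = locate_object(scene)
print(f"object at x={cx:.1f}, y={cy:.1f}; box (x0, y0, x1, y1) = {box}")
framed = draw_frame(scene, box)
```

Note that the position is printed as numbers, as the exercise requires; the drawn frame is only for visual checking. In a real scene the background will also produce corners, so some extra criterion (colour, for instance) is likely needed to select the corners belonging to the object.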
Step 3. Tracker
Introduce tracking only when you have a working prototype for still images.
- Find the corners (use the Harris detector)
- Calculate the spatial derivatives \(I_x,I_y\) using the Sobel filter.
- Calculate the temporal derivative \(I_t\) using the first-order approximation, i.e. the difference between the next and the current frame.
- Calculate the element-wise products \(I_xI_t\), \(I_yI_t\), \(I_x^2\), \(I_xI_y\), \(I_y^2\), which we will use later.
- For each corner \(\mathbf{x}\),
- calculate the \(G\) matrix and \(b\) vector
- solve \(\mathbf{u} = -G^{-1}b\). Note that this is a matrix product.
- use the numpy.linalg library to invert \(G\).
- Plot the features \(\mathbf{x}\) and the vectors \(\mathbf{u}\) in the image.
- for plotting the vectors, this is the top of my google hits and looks useful
- Note that the vector \(\mathbf{u}\) tells us the speed and direction of the point, and thus gives crucial information for analysing the behaviour of objects in a scene. Many applications will use this information in its own right, and not just use it to recover the same point in the next frame.
- Calculate the feature points from the next frame and plot them in the current frame (together with the vectors and points above).
- Do the new positions fit with the previous positions and motion vectors?
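The calculation in the list above can be sketched as follows. This is a minimal sketch under stated assumptions, not a finished tracker: the function names `filter3x3` and `lucas_kanade` and the window radius are illustrative, the points are assumed to be given as integer \((x, y)\) pixel coordinates at least `radius` pixels from the border, and the motion is assumed to be small (well under a pixel or two). The Sobel kernel is divided by 8 so that \(I_x, I_y\) are in intensity units per pixel; with an unnormalised Sobel filter the vectors \(\mathbf{u}\) would come out too short by that factor.

```python
import numpy as np

# Sobel kernels, normalised so the output approximates the derivative per pixel.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float) / 8.0
SOBEL_Y = SOBEL_X.T

def filter3x3(img, kernel):
    """Correlate img with a 3x3 kernel, replicating the border pixels."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def lucas_kanade(frame0, frame1, points, radius=3):
    """Estimate the motion vector u = (u_x, u_y) at each (x, y) point."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    ix = filter3x3(f0, SOBEL_X)   # spatial derivatives
    iy = filter3x3(f0, SOBEL_Y)
    it = f1 - f0                  # temporal derivative, first order
    ixx, ixy, iyy = ix * ix, ix * iy, iy * iy
    ixt, iyt = ix * it, iy * it
    flows = []
    for x, y in points:
        win = (slice(y - radius, y + radius + 1),
               slice(x - radius, x + radius + 1))
        G = np.array([[ixx[win].sum(), ixy[win].sum()],
                      [ixy[win].sum(), iyy[win].sum()]])
        b = np.array([ixt[win].sum(), iyt[win].sum()])
        flows.append(-np.linalg.inv(G) @ b)   # u = -G^{-1} b
    return np.array(flows)

# Synthetic test: a smooth blob that moves by (0.5, 0.3) pixels.
ys, xs = np.mgrid[0:50, 0:60].astype(float)

def blob(cx, cy, sigma=8.0):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

frame0 = blob(30.0, 25.0)
frame1 = blob(30.5, 25.3)
points = [(30, 25), (31, 26)]
flow = lucas_kanade(frame0, frame1, points, radius=5)
for (x, y), (ux, uy) in zip(points, flow):
    print(f"point ({x}, {y}): u = ({ux:.2f}, {uy:.2f}), "
          f"predicted next position ({x + ux:.2f}, {y + uy:.2f})")
```

`np.linalg.inv` fails with an error if \(G\) is singular, which happens in flat regions, and gives unreliable results along plain edges where \(G\) is nearly singular. This is one reason for evaluating \(\mathbf{u}\) only at corners.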
Step 4. Colour Detection (optional)
Step 5. Multiscale Tracking (optional)
Step 6. Continuous Tracking (optional)
If everything works fine from one frame to the next, you can try to repeat the operation to track feature points throughout a video.
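Repeating the operation could look like the sketch below. The function `track` and the callback `estimate_flow` are hypothetical names; `estimate_flow(prev, nxt, points)` stands for whatever two-frame routine you already have, and is assumed to return one motion vector per point. Here it is replaced by a dummy that always reports the same motion, just to show the bookkeeping.

```python
import numpy as np

def track(frames, points, estimate_flow):
    """Follow points through all frames.

    Returns an array of shape (n_frames, n_points, 2) with the (x, y)
    position of every point in every frame.
    """
    pts = np.asarray(points, dtype=float)
    trajectory = [pts.copy()]
    for prev, nxt in zip(frames[:-1], frames[1:]):
        # The two-frame routine is given integer pixel positions.
        u = estimate_flow(prev, nxt, np.rint(pts).astype(int))
        pts = pts + u
        trajectory.append(pts.copy())
    return np.array(trajectory)

# Dummy stand-in for the real two-frame estimator (assumption for the demo).
def constant_flow(prev, nxt, pts):
    return np.tile([1.0, 0.5], (len(pts), 1))

frames = [np.zeros((10, 10)) for _ in range(4)]
path = track(frames, [(2, 3), (5, 5)], constant_flow)
for t, positions in enumerate(path):
    print(f"frame {t}: {positions.tolist()}")
```

Since each new position is built on the previous estimate, errors accumulate from frame to frame. Keeping the positions as floats, as done here, avoids adding a rounding error at every step on top of that.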
Step 7. Feature Descriptors (optional)
Feature Descriptors (e.g. SIFT) allow us to match features between frames or images, even when tracking is not possible. Some applications use feature descriptors instead of tracking, because tracking is not feasible. Other applications may want to use continuous tracking, but because tracking occasionally fails, we may need feature descriptors to recover from errors.
- You should probably first test feature descriptors on frames that differ widely, e.g. with movements of ten pixels or so.
- If this works, you can add feature descriptors to your tracking system to validate that the correct object is being tracked, e.g. with a check every second.
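The matching idea can be tried out without SIFT, as in the sketch below. Note the swap: this uses a normalised pixel patch as a stand-in descriptor, which is much simpler than SIFT and is not invariant to scale or rotation. The function names, the patch radius and the similarity threshold are illustrative assumptions, and the points are assumed to lie at least `radius` pixels from the border.

```python
import numpy as np

def patch_descriptor(img, x, y, radius=4):
    """Zero-mean, unit-length vector of the pixels around (x, y)."""
    patch = img[y - radius:y + radius + 1,
                x - radius:x + radius + 1].astype(float).ravel()
    patch = patch - patch.mean()
    norm = np.linalg.norm(patch)
    return patch / norm if norm > 0 else patch

def describe(img, points, radius=4):
    """One descriptor per (x, y) point, stacked as rows."""
    return np.array([patch_descriptor(img, x, y, radius) for x, y in points])

def match(desc_a, desc_b, min_similarity=0.8):
    """Pair each descriptor in A with its most similar descriptor in B."""
    similarity = desc_a @ desc_b.T          # normalised correlation
    best = similarity.argmax(axis=1)
    return [(i, int(j)) for i, j in enumerate(best)
            if similarity[i, j] >= min_similarity]

# Synthetic test: random texture moved ten pixels to the right.
rng = np.random.default_rng(0)
frame_a = rng.random((64, 64))
frame_b = np.roll(frame_a, 10, axis=1)

points_a = [(15, 20), (30, 40), (40, 12)]
points_b = [(50, 12), (25, 20), (40, 40)]   # same features, listed in another order

matches = match(describe(frame_a, points_a), describe(frame_b, points_b))
for i, j in matches:
    print(points_a[i], "->", points_b[j])
```

For the validation check, the descriptor stored when the object was first detected can be compared with the descriptor at the currently tracked position; a similarity below the threshold suggests that the tracker has drifted onto something else.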
Debrief
- How did you fare?
- Exam planning.
- Feature Tracking Algorithm
- Partial Python Demo TBC
- uses test images frame27.jpeg and frame28.jpeg
- You may also want to try the Contours API of OpenCV’s