This assignment is divided into two parts: the first is segmentation of a pianist's hands, and the second is tracking objects across two video datasets.
- Part 1: Segmentation of hands. In this part, you are given a sequence of video frames in which a person is playing the piano with both hands. Try to develop an algorithm to identify the pianist's hands. Portions of the hands are sometimes in deep shadow, which creates a challenging imaging situation.
- Part 2: Tracking. The goal of this part of the programming assignment is for you to learn more about the practical issues that arise when designing a tracking system. You are asked to track moving objects in video sequences, i.e., identifying the same object from frame to frame.
Method and Implementation
- For Part 1: To segment the hands, we compute the difference between two consecutive frames, because the hands are the only moving objects in the images. I first convert both frames to grayscale and use absdiff() to compute their per-pixel difference.
- I then use blur() to smooth the difference image and threshold it to obtain a binary image. After that, I use connectedComponentsWithStats() to label the connected regions. Excluding the background, I assume the largest connected region corresponds to the pianist's hands. The same function also returns each region's centroid, which I use to draw the bounding boxes: the upper box is the right hand and the lower box is the left hand.
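The pipeline above can be sketched as follows. This is a minimal, illustrative version that replaces the OpenCV calls named in the text (absdiff, blur, threshold, connectedComponentsWithStats) with NumPy stand-ins, so the function name, the 3x3 box blur, and the threshold value are assumptions, not the report's actual parameters:

```python
import numpy as np
from collections import deque

def frame_diff_segment(prev_gray, curr_gray, thresh=30):
    """Segment the moving region (the hands) by frame differencing.

    NumPy stand-ins for the OpenCV calls named in the text; returns the
    centroid (row, col) of the largest connected motion region, or None.
    """
    # absdiff: per-pixel absolute difference of the grayscale frames
    diff = np.abs(prev_gray.astype(np.int16)
                  - curr_gray.astype(np.int16)).astype(np.uint8)
    # blur: crude 3x3 box filter on the interior to suppress pixel noise
    blurred = diff.astype(np.float32)
    blurred[1:-1, 1:-1] = sum(
        diff[1 + dy:diff.shape[0] - 1 + dy,
             1 + dx:diff.shape[1] - 1 + dx].astype(np.float32)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    # threshold: binary motion mask
    mask = blurred > thresh
    # connected components via BFS flood fill (4-connectivity)
    labels = np.zeros(mask.shape, dtype=np.int32)
    sizes, centroids, next_label = [], [], 1
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        q, pts = deque([(y, x)]), []
        labels[y, x] = next_label
        while q:
            cy, cx = q.popleft()
            pts.append((cy, cx))
            for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                           (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = next_label
                    q.append((ny, nx))
        sizes.append(len(pts))
        centroids.append(tuple(np.mean(pts, axis=0)))
        next_label += 1
    if not sizes:
        return None  # no motion detected between the two frames
    # assume the largest component is the hands, as in the text
    return centroids[int(np.argmax(sizes))]
```

In the real implementation, OpenCV's connectedComponentsWithStats() returns areas and centroids directly, so the flood fill above is only a stand-in for that call.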
- For Part 2: In the bat dataset, we are given the detection centroids in each frame, and we process two frames at a time.
- For every bat in the current frame, we compute the distance to each prediction from the previous frame; predictions are produced by an alpha-beta filter. We then use greedy matching on the smallest distances to associate current-frame objects with previous-frame objects.
- We set a tolerance value: if the smallest distance exceeds the tolerance, the match fails and the current-frame object is treated as a new object. Any previous-frame object left unmatched is considered to have left the scene.
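The matching step above can be sketched as follows. The function name, the tolerance value, and the index-based return format are illustrative assumptions; only the greedy smallest-distance-first matching with a tolerance cutoff comes from the text:

```python
import numpy as np

def greedy_match(predictions, detections, tol=50.0):
    """Greedily associate predicted positions with current-frame detections.

    predictions: list of (x, y) predicted from the previous frame.
    detections:  list of (x, y) centroids detected in the current frame.
    Returns (matches, new_objects, left_objects), where matches maps a
    prediction index to its detection index.
    """
    pairs = []  # (distance, pred_idx, det_idx) for every candidate pair
    for i, p in enumerate(predictions):
        for j, d in enumerate(detections):
            dist = float(np.hypot(p[0] - d[0], p[1] - d[1]))
            if dist <= tol:  # pairs beyond the tolerance can never match
                pairs.append((dist, i, j))
    pairs.sort()  # greedy: always take the smallest remaining distance
    matches, used_p, used_d = {}, set(), set()
    for dist, i, j in pairs:
        if i in used_p or j in used_d:
            continue
        matches[i] = j
        used_p.add(i)
        used_d.add(j)
    # unmatched detections start new tracks; unmatched predictions end tracks
    new_objects = [j for j in range(len(detections)) if j not in used_d]
    left_objects = [i for i in range(len(predictions)) if i not in used_p]
    return matches, new_objects, left_objects
```

For example, with predictions [(0, 0), (100, 100)] and detections [(3, 4), (200, 200)], the first pair matches (distance 5), while the second detection is too far from any prediction and becomes a new object, and the second prediction is declared to have left the scene.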
I apply the template below to detect hand shapes.
- Success: when new objects enter the scene.
- Failure: when objects move too fast, or when objects touch and then separate.
Bat track video
Results are shown above.
How do you decide to begin new tracks and terminate old tracks as the objects enter and leave the field of view?
We operate on the objects in the current frame and try to match them with the predictions from the previous frame. If an object in the current frame has no match, it is treated as a new object and a new track begins.
If an object in the previous frame has no match, it is considered to have left the field of view and its track is terminated.
What happens with your algorithm when objects touch and occlude each other, and how could you handle this so you do not break track?
When two objects touch, our algorithm considers one of them to have left the field of view and treats the two as a single larger object. To avoid breaking the track, we could keep both tracks alive during the occlusion and re-associate them, for example using the velocity predictions, once the objects separate.
What happens when there are spurious detections that do not connect with other measurements in subsequent frames?
Spurious detections occur, for example, when an object's velocity is too high; they break the track, so the same object is treated as different objects in subsequent frames.
What are the advantages and drawbacks of different kinematic models: Do you need to model the velocity of the objects, or is it sufficient to just consider the distances between the objects in subsequent frames?
Having a velocity model helps us predict the position of each object, which is important during tracking: with greedy matching on distance alone, the two closest objects in consecutive frames may actually be different objects, and the measurements are noisy. Considering only the distances between objects in subsequent frames is therefore not sufficient.
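One update step of the alpha-beta filter mentioned above can be sketched for a single coordinate as follows (apply it to x and y independently). The gains alpha and beta here are illustrative, not the values used in the report:

```python
def alpha_beta_step(x, v, z, dt=1.0, alpha=0.85, beta=0.005):
    """One alpha-beta filter update for a 1-D position track.

    x, v: current position and velocity estimates; z: new measurement.
    Returns (predicted_x, updated_x, updated_v). The predicted_x value
    is what the matcher compares against the detections.
    """
    x_pred = x + v * dt           # predict with the constant-velocity model
    r = z - x_pred                # residual: measurement minus prediction
    x_new = x_pred + alpha * r    # correct the position estimate
    v_new = v + (beta / dt) * r   # correct the velocity estimate
    return x_pred, x_new, v_new
```

When the measurement agrees with the constant-velocity prediction, the residual is zero and the estimates are unchanged; noisy measurements only nudge the velocity by the small beta gain, which is what smooths the track.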
In conclusion, frame differencing may not be a good way to segment and track objects, since it depends on the brightness and variation of each frame. Other methods could improve the algorithm under different circumstances. We find segmentation and tracking to be interesting topics, and we can improve our algorithm by addressing these challenging situations.
Credits and Bibliography
https://docs.opencv.org/3.3.1/d3/dc0/group__imgproc__shape.html#ga107a78bf7cd25dec05fb4dfc5c9e765f, accessed 10/25/2018
https://docs.opencv.org/2.4/modules/core/doc/operations_on_arrays.html, accessed 10/25/2018
http://answers.opencv.org/question/99614/how-to-address-a-specific-centroid-obtained-from-the-function-connectedcomponentswithstats/, accessed 10/25/2018
https://en.wikipedia.org/wiki/Alpha_beta_filter, accessed 10/20/2018
lab7 and lab8 solutions
Worked with Kaikang Zhu