Assignment Title

CS 585 HW 4
Kaihong Wang
Teammate names: Qitong Wang, Yuankai He
Date Oct 31, 2018

Problem Definition

Part 1: Segmentation
For the first part, we are asked to localize both hands of a pianist in a sequence of video frames. The motion in the video consists mainly of the movement of the hands, the movement of the rest of the pianist's body, and slight changes in the background, so the hands are not the only moving part of the frame. Two things make the task difficult. First, portions of the hands are sometimes in deep shadow, which makes them hard to detect with a skin detection algorithm. Second, the pianist's hands sometimes overlap each other, which makes segmenting the two hands harder.

Part 2: Tracking
The goal of this part of the programming assignment is for us to learn more about the practical issues that arise when designing a tracking system. We are asked to track moving objects in video sequences, i.e., to identify the same object from frame to frame:

We may consider two frames at a time (or, more ambitiously, use multiple hypothesis tracking (MHT) with more than two frames).
We may use a greedy, suboptimal bipartite matching algorithm (or, more ambitiously, implement an optimal data association algorithm).
To estimate the state of each tracked object, we may use an alpha-beta filter (or, more ambitiously, a Kalman filter).

(1) The bat dataset shows bats in flight, where the bats appear bright against a dark sky. Both grayscale and false-color images from this thermal image sequence are provided. We do not need to use our own segmentation/detection. We may use any of our code from the previous assignments and any library functions we wish. We display the results of our tracking algorithm on top of the original images, use different colors to show that track identity is maintained, and draw lines to show the history of the flight trajectories.
(2) For the cell dataset, the goal is to track cells, including the division of cells.

Method and Implementation

Part 1: Segmentation of hands
We first apply frame differencing to adjacent frames to narrow down the set of candidate pixels. We then apply a strict skin detection rule to these candidates, which yields several small patches of pixels on the brightly lit side of the hands, and also on the hair and pants of the pianist. Because the rule is strict, we assume that the hands of the pianist contain the largest number of selected pixels. We therefore pick the two largest patches inside a restricted region of the picture; this region can be fixed by inspecting the original frames, because the position of the camera and the background never change. Finally, we use a generative algorithm that iteratively searches the 8 neighbors of the pixels in these two patches with a looser skin detection rule, in order to recover skin pixels that lie in shadow.
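The pipeline above can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the code used for the results: the function names (motion_mask, skin_mask, grow_region) and every threshold value are illustrative, and the selection of the two largest patches inside the restricted region is omitted. It shows frame differencing, a strict and a loose RGB skin rule, and 8-neighbor growth from the strict seeds into loosely skin-colored pixels.

```python
from collections import deque

import numpy as np


def motion_mask(prev_gray, curr_gray, thresh=15):
    """Pixels whose intensity changed by more than `thresh` between two frames."""
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    return diff > thresh


def skin_mask(rgb, strict=True):
    """Illustrative RGB skin rule; the thresholds are assumptions, not the report's."""
    r = rgb[..., 0].astype(np.int16)
    g = rgb[..., 1].astype(np.int16)
    b = rgb[..., 2].astype(np.int16)
    if strict:
        # Strict rule: only brightly lit, clearly reddish pixels pass.
        return (r > 95) & (g > 40) & (b > 20) & (r - g > 15) & (r > b)
    # Loose rule: also admits darker, shadowed skin-like pixels.
    return (r > 40) & (r > g) & (r > b)


def grow_region(seed_mask, candidate_mask):
    """Grow `seed_mask` into 8-connected neighbors allowed by `candidate_mask`."""
    h, w = seed_mask.shape
    grown = seed_mask.copy()
    queue = deque(zip(*np.nonzero(seed_mask)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                    if candidate_mask[ny, nx] and not grown[ny, nx]:
                        grown[ny, nx] = True
                        queue.append((ny, nx))
    return grown
```

A typical call would build the seeds as motion_mask(prev, curr) & skin_mask(frame, strict=True) and pass skin_mask(frame, strict=False) as the candidate mask, so that shadowed skin is only accepted when it is connected to a confidently detected patch.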

Part 2: Tracking
(1) Bat Tracking:
The steps for tracking bats are:
Step 1: Read all the bat locations from the location text files, and sort the files by time sequence.
Step 2: Store all the bat locations in a 2D NumPy array.
Step 3: Get the bat locations of the next frame and predict each bat's location using an alpha-beta filter.
Step 4: Draw the prediction results frame by frame, use the prediction for each bat to find its corresponding location in the next frame, and update the locations of all bats.
Step 5: Check whether new bats have appeared in the later frame: if some bats in the next frame do not appear in the current frame, add them.
Step 6: If a bat's movement distance is larger than a threshold, treat the bat as having disappeared and delete it from the array that stores the locations of all bats.
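Steps 3 to 6 can be illustrated with the sketch below. It is a simplified pure-Python version under stated assumptions rather than our actual implementation: the Track class, the step function, the gains alpha and beta, and the gate distance are all hypothetical names and values, and the association is a greedy nearest-neighbor match on predicted positions.

```python
import math


class Track:
    """One tracked object with position (x, y) and velocity (vx, vy)."""

    def __init__(self, track_id, x, y):
        self.id = track_id
        self.x, self.y = x, y
        self.vx, self.vy = 0.0, 0.0
        self.history = [(x, y)]  # used to draw the trajectory

    def predict(self, dt=1.0):
        return self.x + self.vx * dt, self.y + self.vy * dt

    def update(self, zx, zy, alpha=0.85, beta=0.5, dt=1.0):
        """Alpha-beta update with the measurement (zx, zy)."""
        px, py = self.predict(dt)
        rx, ry = zx - px, zy - py  # residual between measurement and prediction
        self.x, self.y = px + alpha * rx, py + alpha * ry
        self.vx += beta * rx / dt
        self.vy += beta * ry / dt
        self.history.append((self.x, self.y))


def step(tracks, detections, next_id, gate=30.0):
    """Advance all tracks by one frame; returns (tracks, next_id)."""
    predictions = [t.predict() for t in tracks]
    # All (distance, track index, detection index) pairs, nearest first.
    pairs = sorted(
        (math.hypot(dx - px, dy - py), ti, di)
        for ti, (px, py) in enumerate(predictions)
        for di, (dx, dy) in enumerate(detections)
    )
    used_tracks, used_dets = set(), set()
    for dist, ti, di in pairs:
        if dist > gate:
            break  # every remaining pair is even farther apart
        if ti in used_tracks or di in used_dets:
            continue
        tracks[ti].update(*detections[di])
        used_tracks.add(ti)
        used_dets.add(di)
    # Step 6: a track with no detection inside the gate is deleted.
    survivors = [t for ti, t in enumerate(tracks) if ti in used_tracks]
    # Step 5: every unmatched detection starts a new track.
    for di, (x, y) in enumerate(detections):
        if di not in used_dets:
            survivors.append(Track(next_id, x, y))
            next_id += 1
    return survivors, next_id
```

Calling step once per frame with that frame's detections keeps the list of live tracks up to date, and each track's history list holds the points needed to draw its trajectory in its own color.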

(2) Cell Tracking:
The white halo around a cell is not part of the cell, but I used it to identify the border of the cell. With proper thresholding, this gives a binary image in which the contours of the cells can be found. The steps are:
1. Dilate the image so that the contours are connected.
2. Apply simple thresholding.
3. Find the contours in the first image, 0001.jpg.
4. Keep the contours whose area is larger than 3000, and store each separate contour as a different cell.
5. Cycle through the "Normalized" folder; dilate, threshold, and find the contours in each image.
6. Keep only the larger contours and compare their centroids with those of the stored cells. If a centroid is within a certain distance of a stored cell, update that stored cell's contour.
7. For each image, draw the contours from the stored cell list.
8. If a cell's contour has not been updated, do not draw it.
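The bookkeeping in steps 4 and 6 to 8 can be sketched as follows. This is a pure-Python illustration under stated assumptions, not the code that produced the results: contours are assumed to be already extracted as lists of (x, y) points, the helper names (polygon_area, centroid, update_cells) and the max_dist value are hypothetical, and the centroid is approximated by the mean of the contour points. Only the area threshold of 3000 comes from the steps above.

```python
import math


def polygon_area(contour):
    """Shoelace area of a closed contour given as [(x, y), ...]."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(contour, contour[1:] + contour[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def centroid(contour):
    """Mean of the contour points (a simple stand-in for a moment-based centroid)."""
    n = len(contour)
    return sum(p[0] for p in contour) / n, sum(p[1] for p in contour) / n


def update_cells(cells, contours, min_area=3000, max_dist=50.0):
    """Update the stored cells in place from one frame's contours.

    `cells` is a list of dicts with the keys "contour" and "updated".
    Returns the cells that were updated in this frame, i.e. the ones to draw.
    """
    for cell in cells:
        cell["updated"] = False
    for contour in contours:
        if polygon_area(contour) <= min_area:
            continue  # too small to be a cell
        cx, cy = centroid(contour)
        best, best_dist = None, max_dist
        for cell in cells:
            if cell["updated"]:
                continue  # already matched in this frame
            sx, sy = centroid(cell["contour"])
            dist = math.hypot(cx - sx, cy - sy)
            if dist <= best_dist:
                best, best_dist = cell, dist
        if best is None:
            # No stored cell is close enough, so this contour starts a new cell.
            cells.append({"contour": contour, "updated": True})
        else:
            best["contour"] = contour
            best["updated"] = True
    return [cell for cell in cells if cell["updated"]]
```

Starting from an empty cells list and calling update_cells once per image reproduces the behavior described above: large unmatched contours become new cells, matched cells have their contours replaced, and cells that were not updated are simply left out of the drawing.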

Experiments

Part 1: Segmentation of hands

In this system, we used Python and OpenCV to process the images and implement the localization algorithm.
We display several intermediate steps to verify the result of our algorithm.

After using differencing

After using skin detection

After using generative algorithm

Result

We visually inspect the output to check whether the system runs smoothly and whether the algorithm localizes the position of the hands successfully.

Part 2: Tracking
(1) Bat Tracking:
In the bat tracking experiment, we implemented the tracker in Python 3. For each bat, we draw its trajectory in a different color and show all trajectories on top of the frame.
(2) Cell Tracking:

Results

Part 1: Segmentation of hands

Results of localization of hands in frames are shown below:

Frame	Result
Frame 0
Frame 4
Frame 8
Frame 12
Frame 16
Frame 18

From the results above, we can observe that the algorithm recognizes both hands in all of the frames shown and localizes the approximate position of both hands with relatively high precision.
By combining skin detection, frame differencing, and the generative algorithm, our algorithm achieves better performance, because irrelevant features in the picture are eliminated and the crucial information is extracted efficiently.
Part 2: Tracking
(1) Bat Tracking:

(2) Cell Tracking:

The results are pleasing with some minor errors.

Discussion

Segmentation of hands:

Our algorithm uses frame differencing to eliminate most of the impossible pixels in the picture, which improves the running time of the later searching phase.
Our algorithm uses a stricter skin detection rule followed by a looser one to find both hands of the pianist and to eliminate irrelevant pixels in the picture, so that we can localize the hands precisely even where they are covered in shadow.
However, our algorithm might be somewhat slow in a real-time scenario, because the generative algorithm spends most of the running time iteratively searching the surrounding eligible pixels to generate a precise outline of the hands using RGB and spatial information.

Bat Tracking

The tracking result is shown in the linked YouTube video.
In most cases, the alpha-beta filter produces correct predictions for each bat and tracks every new bat accurately. However, there is a small probability that the alpha-beta filter produces a wrong prediction when two bats cross at the same time, which is a very challenging situation.
Recall the last two steps of bat tracking:
Step 5: Check whether new bats have appeared in the later frame: if some bats in the next frame do not appear in the current frame, add them.
Step 6: If a bat's movement distance is larger than a threshold, treat the bat as having disappeared and delete it from the array that stores the locations of all bats.
When several bats occlude one another but cross at different times, we found that the tracking system still predicts each bat approximately correctly and independently.
When the tracking system produces a wrong prediction and the distance between the correct location and the wrong location is larger than the threshold, the system deletes the bat according to Step 6. Fortunately, according to Step 5, the tracking system then picks up the deleted bat (which is still in the frame) again and continues to predict its subsequent locations.
In our tracking system, the alpha-beta filter takes the velocity of every bat into account, so it does not rely on location alone to make predictions, which noticeably improves the prediction results.
In future work, we can try other filters (such as a Kalman filter) to improve the accuracy of the predictions.

Cell Tracking:
1. Identify a challenging situation where your tracking succeeds, and a situation where your tracking fails. A: It is successful at cell division and when cells exit, but it is unsuccessful when cells reenter. It also fails when it detects a long leg of a cell: since the contours are not connected, it treats the long leg as a new cell.
2. How do you decide to begin new tracks and terminate old tracks as the objects enter and leave the field of view? A: I begin a new track when there is a large enough contour that is not recorded in the stored cell list. I end a track when the contour of a cell is no longer being updated.
3. What happens with your algorithm when objects touch and occlude each other, and how could you handle this so you do not break track? A: The algorithm treats them as one cell. But when they break apart, they are separated into two cells again based on their stored centroids and the new centroids, each carrying its original name.
4. What happens when there are spurious detections that do not connect with other measurements in subsequent frames? A: Nothing special happens: the earlier spurious cell is no longer tracked, and the cell count simply continues.
5. What are the advantages and drawbacks of different kinematic models: Do you need to model the velocity of the objects, or is it sufficient to just consider the distances between the objects in subsequent frames? A: Modeling velocity would not be helpful in this case, because the cells change shape in every frame. But considering only the distances between the objects in subsequent frames is not enough either, because the cells could have split.

Conclusions

In this experiment, we built a computer vision system that can successfully localize the hands of a pianist in a series of frames.
We used an alpha-beta filter to track bats, and centroid matching of contours to track cells.
For cell tracking, we think the hardest parts of the problem are the segmentation and the tracking of cell division. I tried adaptive thresholding, but the results were very unsatisfying.

Credits and Bibliography

(1) Kaihong Wang mainly took charge of the segmentation of the pianist's hands.
(2) Qitong Wang mainly took charge of tracking bats.
(3) Yuankai He mainly took charge of tracking cells.
(4) Qitong, Kaihong, and Yuankai often discussed this assignment together and exchanged opinions on segmentation and tracking.
