We used BFS instead of DFS to traverse connected pixels. This avoids the stack overflow issue and also lets us record the traversed pixels easily through the queue.
If the area of a component is lower than a threshold we set, the component is discarded.
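The BFS labeling and area filter described above can be sketched roughly as follows. This is a minimal illustration, not our exact code: the function name connected_components, the min_area parameter, and the choice of 4-connectivity are assumptions made for the example.

```python
from collections import deque


def connected_components(grid, min_area=1):
    """Find 4-connected components of foreground pixels (value 1) with BFS.

    Returns a list of components, each a list of (row, col) pixels.
    Components with fewer than `min_area` pixels are discarded.
    The 4-connectivity and the parameter names are illustrative assumptions.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Start a new BFS; the queue replaces the recursion stack of DFS.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))  # record every traversed pixel
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Area filter: keep only components that reach the threshold.
            if len(pixels) >= min_area:
                components.append(pixels)
    return components
```

Because the queue lives on the heap, the traversal depth is not limited by the call stack, and the list of popped pixels directly gives the area of the component.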
We implemented the Moore-Neighbor Tracing algorithm to find the border of a connected component. The upper-left point is found first, in the "right" direction, and the pointer then moves clockwise around the immediately neighboring pixels. The algorithm ends when the pointer returns to the starting point in the "right" direction.
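A minimal sketch of Moore-Neighbor tracing on a binary grid is given below. It is an illustration under assumptions, not our exact implementation: the names CLOCKWISE and trace_border are hypothetical, pixels outside the image are treated as background, and the stopping rule used here is "back at the start pixel and about to repeat the first move", which is one common way to make the termination robust.

```python
# Neighbor offsets (row, col) in clockwise order on screen, starting from west.
CLOCKWISE = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
             (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_border(grid):
    """Trace the outer border of the first component found in raster order.

    Returns the border pixels as a list of (row, col), in clockwise order,
    without repeating the start pixel at the end.
    """
    rows, cols = len(grid), len(grid[0])

    def is_fg(r, c):
        # Pixels outside the image count as background (an assumption).
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == 1

    # Upper-left foreground pixel: scan rows top to bottom, left to right.
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if grid[r][c] == 1), None)
    if start is None:
        return []

    border = [start]
    current = start
    back = 0        # index of the backtrack pixel: west of the start pixel
    second = None   # first pixel reached from the start, used to stop
    while True:
        step = None
        # Walk clockwise around `current`, beginning just after the backtrack.
        for i in range(1, 8):
            d = (back + i) % 8
            cand = (current[0] + CLOCKWISE[d][0], current[1] + CLOCKWISE[d][1])
            if is_fg(*cand):
                step = (d, cand)
                break
        if step is None:
            break  # isolated pixel: nothing to trace
        d, nxt = step
        if current == start:
            if second is None:
                second = nxt
            elif nxt == second:
                border.pop()  # drop the repeated start pixel and finish
                break
        # The pixel examined just before `nxt` is background; it becomes the
        # new backtrack, expressed as a direction relative to `nxt`.
        pd = (d - 1) % 8
        offset = (CLOCKWISE[pd][0] - CLOCKWISE[d][0],
                  CLOCKWISE[pd][1] - CLOCKWISE[d][1])
        back = CLOCKWISE.index(offset)
        current = nxt
        border.append(current)
    return border
```

For a filled 3x3 block, for instance, this returns the eight outer pixels and skips the center.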
We used erosion with a cross kernel to find the remaining part of the image.
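As a rough illustration of erosion with a cross kernel, the sketch below erodes a binary grid with a 3x3 cross. The kernel size, the name erode_cross, and the border handling (pixels outside the image count as background) are assumptions for the example; library implementations may treat the border differently.

```python
# 3x3 cross structuring element: the center plus its four direct neighbors.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def erode_cross(grid):
    """Binary erosion with a 3x3 cross kernel.

    A pixel stays foreground only if it and its four direct neighbors are
    all foreground. Pixels outside the image are treated as background,
    which is an assumption of this sketch.
    """
    rows, cols = len(grid), len(grid[0])

    def is_fg(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == 1

    return [[int(all(is_fg(r + dy, c + dx) for dy, dx in CROSS))
             for c in range(cols)]
            for r in range(rows)]
```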
First, we used background detection with createBackgroundSubtractorMOG2 to try to remove the background.
Then we used skin detection. The color space was transformed to YCrCb, and absolute thresholding turned the image into a binary picture.
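The YCrCb thresholding step can be sketched as below. The conversion uses the standard 8-bit RGB to YCrCb formulas; the threshold ranges for Cr and Cb are commonly quoted skin ranges and are assumptions here, not the values we tuned. The function names are also hypothetical.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb) as floats."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128
    return y, cr, cb


def skin_mask(image, cr_range=(133, 173), cb_range=(77, 127)):
    """Return a binary mask: 1 where the pixel falls inside both ranges.

    `image` is a list of rows of (R, G, B) tuples. The default ranges are
    illustrative assumptions and would need tuning for a real video.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            _, cr, cb = rgb_to_ycrcb(r, g, b)
            is_skin = (cr_range[0] <= cr <= cr_range[1]
                       and cb_range[0] <= cb <= cb_range[1])
            mask_row.append(int(is_skin))
        mask.append(mask_row)
    return mask
```

Thresholding only Cr and Cb ignores the luma channel Y, which makes the mask less sensitive to brightness than thresholding RGB directly.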
After some opening and closing to reduce noise, and counting the size of the connected components, the only remaining obstacle was the pianist's hair. To solve this issue, we calculated the moments of each connected component and picked the one that "looks like" a hand.
This was not the best solution, however. We tried averaging on the image to reduce the size of the piano, but it was too dark to get a good result.
2.2 Adaptive threshold: using adaptive thresholding can help us distinguish two bats that are near each other. But if two bats overlap, the result is still not good.
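To illustrate the idea, a mean-based adaptive threshold can be sketched as follows: each pixel is compared with the mean of its own neighborhood rather than with one global value. The window radius, the offset, and the function name are assumptions for this example, and the window is simply clipped at the image border.

```python
def adaptive_threshold(image, radius=1, offset=0):
    """Mean-based adaptive threshold on a grayscale grid.

    A pixel becomes 1 if its value is greater than the mean of the
    (2 * radius + 1) square window around it minus `offset`, else 0.
    The window is clipped at the image border (an assumption).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[y][x]
                      for y in range(max(0, r - radius), min(rows, r + radius + 1))
                      for x in range(max(0, c - radius), min(cols, c + radius + 1))]
            local_mean = sum(window) / len(window)
            out[r][c] = 1 if image[r][c] > local_mean - offset else 0
    return out
```

Because the threshold follows the local brightness, two nearby objects can each stand out against the gap between them, while overlapping objects still merge into one region.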
We computed the difference between the first frame and every other frame, and accumulated the sum at every pixel over all frames (assuming 1 for "black" and 0 for "white") to find the background and the pedestrians' locations in frame 1. Applying this mask with an xor operation, we obtained clean moving figures in every frame.
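One possible reading of this step is sketched below. Pixels that differ from frame 1 in almost every other frame are assumed to belong to the figures standing in frame 1; xor-ing that mask with each difference image then removes the frame 1 figures and keeps the moving ones. The function names, the thresholds diff_thresh and vote_ratio, and the convention that 1 marks a changed pixel are all assumptions of the example.

```python
def binary_diff(frame_a, frame_b, diff_thresh=30):
    """Return 1 where two grayscale frames differ by more than diff_thresh."""
    return [[int(abs(a - b) > diff_thresh) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def moving_masks(frames, diff_thresh=30, vote_ratio=0.8):
    """Estimate the frame 1 figure mask and the cleaned per-frame masks.

    Returns (first_mask, cleaned), where cleaned[k] belongs to frames[k + 1].
    """
    first = frames[0]
    rows, cols = len(first), len(first[0])
    diffs = [binary_diff(first, f, diff_thresh) for f in frames[1:]]
    if not diffs:
        return [[0] * cols for _ in range(rows)], []
    # Accumulate, per pixel, how many difference images are set there.
    votes = [[sum(d[r][c] for d in diffs) for c in range(cols)]
             for r in range(rows)]
    needed = vote_ratio * len(diffs)
    first_mask = [[int(v >= needed) for v in row] for row in votes]
    # xor removes what was at frame 1 and keeps what moved elsewhere.
    cleaned = [[[d[r][c] ^ first_mask[r][c] for c in range(cols)]
                for r in range(rows)]
               for d in diffs]
    return first_mask, cleaned
```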
Then we tried different kernels to deal with persons close to each other. By calculating the width and height of every connected component, we filtered out those that were too small or too "fat".
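The size and shape filter can be sketched as follows, assuming each component is given as a list of (row, col) pixels. The names and the limits min_width, min_height, and max_aspect are hypothetical placeholders, not the values we used.

```python
def bounding_size(pixels):
    """Return (width, height) of the bounding box of a list of (row, col)."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return max(cols) - min(cols) + 1, max(rows) - min(rows) + 1


def keep_person_like(components, min_width=2, min_height=4, max_aspect=1.0):
    """Drop components that are too small or too wide for their height.

    `max_aspect` is the largest allowed width / height ratio; a standing
    person is assumed to be taller than wide. All limits are illustrative.
    """
    kept = []
    for pixels in components:
        width, height = bounding_size(pixels)
        if width < min_width or height < min_height:
            continue  # too small
        if width / height > max_aspect:
            continue  # too "fat"
        kept.append(pixels)
    return kept
```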
The results were very promising, yet there may still be problems near the light, as it obstructs the person. Furthermore, there may still be some issues in balancing between separating two persons and merging different parts of one person.
We had some good results on the dataset. However, there are still many aspects that can be improved, such as removing the piano and removing the obstacles.