Assignment Title

CS 585 HW 3
Yicheng Li
Zilin Zhang
Fangya Xu
Feb 2020

Problem 1

Problem Definition

  1. Implement a connected component labeling algorithm and apply it to the data below. Show your results by displaying the detected components/objects with different colors. The recursive connected component labeling algorithm sometimes does not work properly if the image regions are large. See below for an alternate variant of this algorithm. If you have trouble implementing connected component labeling, you may use an OpenCV library function, but you will be assessed a small 5 point penalty.
  2. If your output contains a large number of components, apply a technique to reduce the number of components, e.g., filtering by component size, or erosion.
  3. Implement the boundary following algorithm and apply it for the relevant regions/objects.
  4. For each relevant region/object, compute the area, orientation, and circularity (Emin/Emax). Also, identify and count the boundary pixels of each region, and compute compactness, the ratio of the area to the perimeter.
  5. Implement a skeleton finding algorithm. Apply it to the relevant regions/objects.

Method and Implementation

  1. Connected component labeling
    1. Flood filling algorithm: Starting from the top-left pixel, we scan the figure until we meet the first non-zero pixel. We then recursively visit its neighbours, and their neighbours, until the whole region is labeled, and resume scanning for the next unlabeled non-zero pixel.
    2. Opening: To remove noise, we apply opening so that only large components remain. Erosion followed by dilation removes small components and keeps large ones.

  2. Draw boundary
    1. Boundary following algorithm: Starting from the top-left pixel, we scan the figure until we find the first non-zero pixel. We then search its 8-neighbours clockwise to find the next boundary point.

  3. Object area, orientation, circularity: To compute area, orientation, circularity, and compactness, we use OpenCV functions to obtain the moments and apply the standard formulas to them.

  4. Skeleton: We repeatedly erode the image, removing pixels only when they satisfy our rules, until the figure's skeleton is found.
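The moment-based measurements in the list above can be sketched in plain Python as follows. This is only an illustration, not our submitted code: the function name region_properties is hypothetical, the region is assumed to be a small binary grid (list of rows), the second moments are computed directly instead of through OpenCV, and boundary pixels are assumed to be foreground pixels with at least one 4-neighbour in the background.

```python
import math

def region_properties(mask):
    """Area, orientation, circularity (Emin/Emax) and compactness of one binary region."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("empty region")
    # Centroid of the region
    xc = sum(x for x, _ in pixels) / area
    yc = sum(y for _, y in pixels) / area
    # Second central moments: E(t) = a*sin(t)^2 - b*sin(t)*cos(t) + c*cos(t)^2
    a = sum((x - xc) ** 2 for x, _ in pixels)
    b = 2 * sum((x - xc) * (y - yc) for x, y in pixels)
    c = sum((y - yc) ** 2 for _, y in pixels)
    # Axis of least second moment, measured from the x axis (y points down in images)
    theta = 0.5 * math.atan2(b, a - c)
    root = math.hypot(b, a - c)
    e_min = (a + c) / 2 - root / 2
    e_max = (a + c) / 2 + root / 2
    # Assumption: a degenerate single-pixel region is treated as perfectly circular
    circularity = e_min / e_max if e_max > 0 else 1.0
    # Boundary pixels: foreground pixels touching background (or the image edge)
    h, w = len(mask), len(mask[0])
    def is_background(x, y):
        return not (0 <= x < w and 0 <= y < h) or not mask[y][x]
    perimeter = sum(
        1 for x, y in pixels
        if any(is_background(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    )
    return {
        "area": area,
        "orientation": theta,
        "circularity": circularity,
        "perimeter": perimeter,
        "compactness": area / perimeter,
    }
```

A thin bar gives a circularity near 0 and a filled square gives 1, which matches the intuition that Emin/Emax measures how round a region is.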


When finding the components of the figure, we tried both flood filling and sequential labelling. We found little difference in running time between the two but a large difference in memory use, so we chose the latter.
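For reference, a minimal sketch of sequential (two-pass) labelling is shown below. It is an assumption-laden illustration rather than our exact implementation: the name label_components is hypothetical, 8-connectivity is assumed, and label equivalences are resolved with a small union-find table.

```python
def label_components(image):
    """Two-pass connected component labelling with 8-connectivity (assumed)."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # union-find table; index 0 is reserved for background

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # First pass: assign provisional labels and record equivalences
    for y in range(h):
        for x in range(w):
            if not image[y][x]:
                continue
            neighbours = []
            for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx]:
                    neighbours.append(labels[ny][nx])
            if not neighbours:
                parent.append(len(parent))  # new provisional label
                labels[y][x] = len(parent) - 1
            else:
                smallest = min(neighbours)
                labels[y][x] = smallest
                for n in neighbours:
                    union(smallest, n)

    # Second pass: replace each provisional label by a compact final label
    remap = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

Unlike the recursive flood fill, this version visits pixels in raster order and never recurses, so the depth of the call stack does not grow with the size of a region.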

When drawing boundaries, we initially found it hard to apply boundary following to several regions in the same figure. We solved this by labelling the regions first and then running boundary following on each labelled region.
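The idea of tracing one labelled region at a time can be sketched as below. This is a Moore-neighbour tracing sketch under our own assumptions, not necessarily the exact code we submitted: trace_boundary and OFFSETS are hypothetical names, the input is a label image such as the one produced by a labelling step, and tracing stops when the first move out of the start pixel is about to be repeated.

```python
# (dx, dy) offsets of the 8 neighbours, clockwise on screen, starting from west
OFFSETS = [(-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1)]

def trace_boundary(labels, target):
    """Return the outer boundary pixels (x, y) of the region with label `target`."""
    h, w = len(labels), len(labels[0])

    def inside(x, y):
        return 0 <= x < w and 0 <= y < h and labels[y][x] == target

    # Start pixel: first pixel of the region in raster order
    start = next(((x, y) for y in range(h) for x in range(w) if labels[y][x] == target), None)
    if start is None:
        return []
    boundary = [start]
    current = start
    back = 0  # direction of the last known background pixel (west of the start)
    first_move = None
    while True:
        # Search the 8 neighbours clockwise, starting just after the backtrack pixel
        for k in range(1, 9):
            d = (back + k) % 8
            nx, ny = current[0] + OFFSETS[d][0], current[1] + OFFSETS[d][1]
            if inside(nx, ny):
                break
        else:
            return boundary  # isolated pixel: no neighbour belongs to the region
        move = (current, (nx, ny))
        if first_move is None:
            first_move = move
        elif move == first_move:
            boundary.pop()  # drop the repeated start pixel
            return boundary
        # The neighbour examined just before the hit is background; backtrack to it
        px = current[0] + OFFSETS[(d - 1) % 8][0]
        py = current[1] + OFFSETS[(d - 1) % 8][1]
        back = OFFSETS.index((px - nx, py - ny))
        current = (nx, ny)
        boundary.append(current)
```

Because the function only follows pixels equal to `target`, calling it once per label traces every region separately, which is the fix described above.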






Discussion and Conclusion

Discuss your method and results:

Our labeling algorithm works well on our figures, and we also find the boundary and skeleton of each region. Finally, we calculate the area, circularity, and compactness of each object. When we first ran our algorithms on figures containing multiple objects, they could not detect more than one region. Labelling the regions before applying the other algorithms fixed this, and all regions are now detected.

Problem 2

Piano dataset

Problem Definition

For this problem, the input is a series of images of a pianist playing the piano. We need to identify the pianist's two hands and mark them on the images. We also need to output the marked images as a video.

Method and Implementation

For the first dataset, we need to identify the hands of the pianist and output the results as a video. To achieve this, we first identified the area of interest, which excludes the piano and the background. We computed the average frame over all images and then subtracted it from each frame. Since the piano and the background are stationary, the subtraction left only the moving parts of the pianist. We applied the absolute thresholding function from OpenCV to obtain the desired mask and intersected the mask with the original frames using the bitwise_and function from OpenCV to eliminate the background. After finding the region of interest, we applied skin detection to the frames to further separate the hands from the rest.
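The average-frame subtraction can be illustrated with a small pure-Python sketch. It assumes grayscale frames stored as lists of rows; the function names and the threshold value used in the example are hypothetical, and the real pipeline used OpenCV functions on color frames instead.

```python
def average_frame(frames):
    """Pixel-wise mean of a list of equally sized grayscale frames."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(w)] for y in range(h)]

def foreground_mask(frame, background, threshold):
    """1 where the frame differs from the background by more than `threshold`."""
    return [
        [1 if abs(v - b) > threshold else 0 for v, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

def apply_mask(frame, mask):
    """Keep masked pixels and zero the rest (the role bitwise_and plays in OpenCV)."""
    return [[v if m else 0 for v, m in zip(frow, mrow)] for frow, mrow in zip(frame, mask)]

# Tiny example: a bright pixel appears in one of four otherwise static frames
frames = [[[10, 10, 10]], [[10, 250, 10]], [[10, 10, 10]], [[10, 10, 10]]]
background = average_frame(frames)
mask = foreground_mask(frames[1], background, threshold=100)
moving_part = apply_mask(frames[1], mask)
```

Note that the moving object also contributes to the average, so the threshold has to be high enough to ignore the faint trace it leaves in the other frames.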

After skin detection, we applied morphological operations to remove the surrounding noise and make the later component labeling easier. We used the opening function from OpenCV.
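To show what opening does, here is a minimal pure-Python sketch of erosion followed by dilation. It is an illustration under stated assumptions, not the OpenCV call we used: the structuring element is assumed to be a square of side 2k+1, pixels outside the image count as background, and the function names are hypothetical.

```python
def erode(img, k=1):
    """Keep a pixel only if every pixel in its (2k+1)x(2k+1) window is foreground."""
    h, w = len(img), len(img[0])
    return [
        [
            1 if all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in range(-k, k + 1) for dx in range(-k, k + 1)
            ) else 0
            for x in range(w)
        ]
        for y in range(h)
    ]

def dilate(img, k=1):
    """Set a pixel if any pixel in its (2k+1)x(2k+1) window is foreground."""
    h, w = len(img), len(img[0])
    return [
        [
            1 if any(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy in range(-k, k + 1) for dx in range(-k, k + 1)
            ) else 0
            for x in range(w)
        ]
        for y in range(h)
    ]

def opening(img, k=1):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(img, k), k)
```

On a mask containing one isolated pixel and one 3x3 block, opening with k=1 removes the isolated pixel and restores the block unchanged.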

Then we labeled the connected components of the images using the connectedComponentsWithStats function from OpenCV. We filtered the resulting components by size and centroid, leaving just the two blobs that represent the hands. We stored the coordinates of these two components and drew rectangles at these coordinates on the original frames using the rectangle function from OpenCV.


To get the best result, we tried different upper and lower bounds for skin detection. We also tried different bounds for filtering out the components that represent the hands.


Some screenshots of the output video:

Discussion and Conclusion

For the piano dataset, we found that because the colors in the images are quite similar, skin detection alone cannot separate the person from the rest. For example, the white keys of the piano are easily identified as skin. More morphological operations and thresholding are therefore needed in this case. Specifically, we used opening because it removes the small parts next to the main component. Blurring and thresholding also help because they remove the parts that are less likely to be skin. From this project we learned that morphological operations are very helpful for complicated images like these, where colors are similar and difficult to separate. Background subtraction is also helpful in many computer vision problems because it removes the stationary parts of the images.

Bat dataset

Problem Definition

Use characteristics of the connected components to determine if the bats have their wings spread or folded in the gray scale sequence of bats in flight.

Method and Implementation

1. Apply histogram equalization preprocessing on images.

2. Set a threshold (T=210) to segment the bats from the image and convert it to a binary image.

3. Use opening to eliminate small noise.

4. Find the contour of each component in the image.

5. Compute the circularity of each contour to determine whether the bat has its wings spread or folded.
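The first two steps above can be sketched in pure Python as follows. This is a simplified illustration and not the OpenCV pipeline itself: the function names are hypothetical, the equalization uses the textbook cumulative-histogram mapping (OpenCV's rounding may differ slightly), and we assume the bats are brighter than the background, so pixels above T become foreground.

```python
def equalize(gray):
    """Histogram equalization of an 8-bit grayscale image (list of rows)."""
    flat = [v for row in gray for v in row]
    hist = [0] * 256
    for v in flat:
        hist[v] += 1
    # Cumulative distribution of gray levels
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:
        return [row[:] for row in gray]  # constant image: nothing to stretch
    # Map each gray level so that the output levels spread over 0..255
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * 255)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]

def threshold(gray, t=210):
    """Binary image: 1 where the pixel is brighter than t (assumed foreground)."""
    return [[1 if v > t else 0 for v in row] for row in gray]

# Tiny example with three gray levels
image = [[50, 50, 100, 200]]
equalized = equalize(image)
binary = threshold(equalized, t=210)
```

In the example the brightest level is stretched to 255 and is the only one that survives the threshold, while the darkest level is mapped to 0.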


We experimented several times and found that 0.4 worked best as the circularity threshold for deciding whether a bat has its wings spread or folded.



Background subtraction is again used in this problem and proved to be a very useful technique when dealing with a stationary background. We chose to use the grayscale images for this problem. If we had more time, we would probably try the false-color dataset, because it would be more interesting with the hearts of the bats identified.


People dataset

Problem Definition

Develop an algorithm to count the number of people in each frame.

Method and Implementation

1. We compute the pixel difference between frames, threshold it, and keep the pixels whose difference exceeds the threshold. This gives a binary image showing the people who are moving.

2. We get the connected components in each image.

3. We use dilation to expand the connected components.

4. We enclose each connected component in a rectangular box and count the boxes.
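The steps above can be sketched in pure Python as shown below. This is only an illustration: the function names are hypothetical, the frames being compared are assumed to be consecutive, the kernel dimensions are passed as (height, width) parameters chosen for the example, and components are found with an iterative flood fill using 8-connectivity.

```python
def motion_mask(prev, curr, threshold):
    """1 where two frames differ by more than `threshold`."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev, curr)
    ]

def dilate_rect(mask, kh, kw):
    """Dilate with a kh x kw rectangle anchored at its centre (kh//2, kw//2)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - kh // 2), min(h, y - kh // 2 + kh)):
                    for xx in range(max(0, x - kw // 2), min(w, x - kw // 2 + kw)):
                        out[yy][xx] = 1
    return out

def bounding_boxes(mask):
    """One (x_min, y_min, x_max, y_max) box per 8-connected component."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack = [(x, y)]  # explicit stack instead of recursion
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while stack:
                cx, cy = stack.pop()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

# Tiny example: two fragments of one person plus a second person
prev = [[0] * 9 for _ in range(5)]
curr = [[0] * 9 for _ in range(5)]
for col in (1, 3, 7):
    curr[2][col] = 255
raw = motion_mask(prev, curr, threshold=50)
merged = dilate_rect(raw, kh=1, kw=3)
people = bounding_boxes(merged)
```

Before dilation the example mask has three fragments; after dilation the two nearby fragments merge, so the count of boxes drops to two. This is why the kernel size matters so much for the final count.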


We experimented several times and found that a kernel size of 8×5 worked best.


Discussion and Conclusion

For this problem, instead of computing the average frame and subtracting it from all images, we compare the frames with one another and keep only the parts that change in order to eliminate the background. This turned out to be just as useful. We also used several morphological operations to find the appropriate components. We used dilation to connect the separate parts. To find the right kernel for dilation, we tried different sizes and chose the one with the best result.