My team will design and implement algorithms that can analyze and recognize random shapes.
We are given a shape dataset (in the "shapes_train2018" folder) that consists of 501 synthesized color images containing randomly sized and colored shapes.
The shapes are categorized into three types: square, circle and equilateral triangle. For each image, ground-truth annotations are provided in the "annotations" folder, where you can find a binary mask for each shape that occurs in the image, with the file name indicating the shape type.
1. For each shape image, determine its background color and label every shape blob according to its color.
2. For each shape blob, implement a border following algorithm to find its outermost contour (i.e. the border pixels) and compare it with OpenCV's "findContours" function.
3. For each shape blob, classify its border pixels (in other words, segment its outermost contour) into three types: borders against the background, borders against another shape blob and borders that are a border of the whole image.
4. (Optional) We may also segment the border according to its convexity, i.e. find all convex segments and concave segments. This may help us analyze the shape type.
5. For each shape blob, come up with an algorithm that can recognize its shape type (square, circle, or triangle).
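To make task 3 concrete, here is a minimal sketch of classifying the border pixels of one blob. It assumes a label image already exists in which 0 marks the background and every blob has its own positive integer id. The function name `classify_border_pixels`, the label encoding, the 4-connectivity and the priority order between the three kinds are all assumptions made for illustration; they are not part of the assignment text.

```python
import numpy as np

def classify_border_pixels(labels, blob_id):
    """Return {(row, col): kind} for the border pixels of one blob.

    kind is 'image' (the pixel touches the image edge), 'blob' (a 4-neighbor
    belongs to another blob) or 'background' (a 4-neighbor is background).
    Assumption: labels is a 2-D integer array where 0 means background.
    """
    h, w = labels.shape
    result = {}
    for r, c in zip(*np.nonzero(labels == blob_id)):
        kinds = set()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < h and 0 <= nc < w):
                kinds.add("image")
            elif labels[nr, nc] == 0:
                kinds.add("background")
            elif labels[nr, nc] != blob_id:
                kinds.add("blob")
        # Assumed priority when a pixel qualifies for several kinds.
        for kind in ("image", "blob", "background"):
            if kind in kinds:
                result[(int(r), int(c))] = kind
                break
    return result
```

Interior pixels have no differing neighbor and therefore do not appear in the returned dictionary.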
We give our own contour-drawing algorithm.
For the shape recognition part, use the annotations to evaluate the performance of your algorithm quantitatively. Calculate the precision ( =TP/(TP+FP)) of your algorithm. Optionally, find another measurement that can show its performance.
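The precision formula above can be computed per class with a few lines of Python. This is a minimal sketch: the function names and the use of plain string labels are assumptions, and recall is included only as one possible "other measurement".

```python
from collections import Counter

def per_class_precision(true_labels, predicted_labels):
    """Precision = TP / (TP + FP) for every class that was predicted at least once."""
    tp, fp = Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[pred] += 1
        else:
            fp[pred] += 1
    return {c: tp[c] / (tp[c] + fp[c]) for c in set(tp) | set(fp)}

def per_class_recall(true_labels, predicted_labels):
    """Recall = TP / (TP + FN) for every class that occurs in the ground truth."""
    tp, total = Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == pred:
            tp[truth] += 1
    return {c: tp[c] / total[c] for c in total}

truth = ["square", "circle", "triangle", "circle"]
pred = ["square", "circle", "circle", "circle"]
precision = per_class_precision(truth, pred)
recall = per_class_recall(truth, pred)
```

A class that is never predicted has no defined precision, so it is simply absent from the precision dictionary.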
Method and Implementation
Our analysis of the pictures:
(1) Different blobs have different colors, and the background color also differs from the blob colors.
(2) Within one image, the background or a blob can have slightly different RGB values from pixel to pixel even where the color looks the same.
(3) The contour color between a blob and the background can differ from both the blob color and the background color.
(4) The total number of images is quite large (about 500 images).
So we chose a CNN to build a system that can predict a blob's shape, because of the large number of images. However, we need to preprocess the images first in order to improve the CNN's prediction accuracy. For (2), we decided to set a threshold equal to 5: for two pixels, if the differences of their R, G and B values are all less than five, the two colors are considered equal.
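The color-equality rule can be written as a small helper. This is a sketch; the name `colors_equal` is hypothetical, and the conversion to `int` is there so that the subtraction also behaves correctly for unsigned 8-bit pixel values.

```python
def colors_equal(p, q, threshold=5):
    """True if every RGB channel of the two pixels differs by less than threshold."""
    return all(abs(int(a) - int(b)) < threshold for a, b in zip(p, q))

same = colors_equal((100, 100, 100), (103, 98, 104))       # all differences below 5
different = colors_equal((100, 100, 100), (100, 100, 105))  # one difference equals 5
```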
First, we chose Python 3 as the programming language.
An example of an original image:
We separate this task into 5 steps:
1. Use our own contour-drawing algorithm to draw the contours of every blob. The contours are painted black while everything else is white. We scan the image, and if the difference between two pixels is larger than the threshold of 5 (R, G and B differences all greater than 5), we draw a small circle on that pixel. Together, all of the small circles form the contours.
After step 1, we got the contour image:
Because of these image features, we found that directly using the findContours function in OpenCV cannot extract the contours successfully. (More specific details are shown in the "Result" part.)
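Step 1 can be sketched in NumPy as follows. This is not our exact code: it marks single pixels instead of drawing small circles, it compares each pixel only with its right and lower neighbor, and it treats two colors as different when at least one channel differs by the threshold or more (the complement of the equality rule). All three choices, and the name `draw_contours`, are assumptions; for the stricter reading in which all three channels must differ, replace `.any(axis=2)` with `.all(axis=2)`.

```python
import numpy as np

def draw_contours(image, threshold=5):
    """Return a canvas that is 0 (black) on contour pixels and 255 (white) elsewhere.

    image: H x W x 3 uint8 array.
    """
    img = image.astype(np.int16)  # avoid uint8 wrap-around when subtracting
    canvas = np.full(img.shape[:2], 255, dtype=np.uint8)
    # Compare every pixel with its right neighbor and with its lower neighbor.
    diff_right = (np.abs(img[:, 1:] - img[:, :-1]) >= threshold).any(axis=2)
    diff_down = (np.abs(img[1:, :] - img[:-1, :]) >= threshold).any(axis=2)
    canvas[:, :-1][diff_right] = 0
    canvas[:-1, :][diff_down] = 0
    return canvas

# A gray image with one dark 2 x 2 blob in the middle.
demo = np.full((6, 6, 3), 200, dtype=np.uint8)
demo[2:4, 2:4] = (10, 20, 30)
contour_canvas = draw_contours(demo)
```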
2. Flood the background with black (RGB(0, 0, 0)) using the floodFill function in OpenCV in Python 3.
Before step 2, we draw a border of width 2 around the contour image using RGB(0, 0, 0) to guarantee that the background is filled successfully.
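The border drawing and the background fill can be illustrated without OpenCV. The sketch below is a plain NumPy and breadth-first-search stand-in for the floodFill call, not the OpenCV function itself. The helper names, the 4-connectivity and the seed position are assumptions; in particular, the seed must be a background pixel inside the border, and which pixel that is depends on the image.

```python
from collections import deque
import numpy as np

def add_border(canvas, width=2, value=0):
    """Overwrite the outermost `width` pixels on every side with `value`."""
    out = canvas.copy()
    out[:width, :] = value
    out[-width:, :] = value
    out[:, :width] = value
    out[:, -width:] = value
    return out

def flood_fill(canvas, seed, new_value=0):
    """Fill the 4-connected region that contains `seed` (row, col) with new_value."""
    out = canvas.copy()
    h, w = out.shape
    old_value = out[seed]
    if old_value == new_value:
        return out
    queue = deque([seed])
    out[seed] = new_value
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and out[nr, nc] == old_value:
                out[nr, nc] = new_value
                queue.append((nr, nc))
    return out

# White canvas with one closed black contour whose interior is a single white pixel.
canvas = np.full((10, 10), 255, dtype=np.uint8)
canvas[4:7, 4:7] = 0
canvas[5, 5] = 255
bordered = add_border(canvas)
filled = flood_fill(bordered, seed=(2, 2))
```

After the fill, only the pixel enclosed by the contour is still white, which is what lets the later steps treat the remaining white regions as blobs.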
After step 2, we got this image:
3. Use the lab2_2 code to assign a different label to each blob.
Before doing that:
We slightly blur the blob images.
(1) Then we use the erode function to try to drop all noise points.
(2) We enlarge the image 4 times before using the dilate function, because we found that enlarging the image before dilation preserves the shape of the blobs as much as possible.
(3) We dilate the image, strengthening the blob shapes.
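The erode, enlarge and dilate sequence can be sketched on a boolean mask with NumPy alone. This is a simplified stand-in for the OpenCV erode and dilate functions: it assumes a 3 x 3 square structuring element, reads "4 times" as a factor of 4 along each axis with nearest-neighbor upscaling, and leaves out the blur. The helper names are hypothetical.

```python
import numpy as np

def _neighborhood_stack(mask, pad_value):
    """Stack the 3 x 3 neighborhood of every pixel along a new first axis."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return np.stack([padded[dr:dr + h, dc:dc + w]
                     for dr in range(3) for dc in range(3)])

def erode(mask):
    """A pixel stays foreground only if its whole 3 x 3 neighborhood is foreground."""
    return _neighborhood_stack(mask, True).all(axis=0)

def dilate(mask):
    """A pixel becomes foreground if any pixel in its 3 x 3 neighborhood is foreground."""
    return _neighborhood_stack(mask, False).any(axis=0)

def enlarge(mask, factor=4):
    """Nearest-neighbor upscaling by an integer factor along both axes."""
    return np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)

def clean_blob_mask(mask, factor=4):
    """Erode to drop isolated noise, enlarge, then dilate to strengthen the blob."""
    return dilate(enlarge(erode(mask), factor))

# A 4 x 4 blob plus one isolated noise pixel in the corner.
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
mask[0, 7] = True
cleaned = clean_blob_mask(mask)
```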
After step 3, we got this image:
4. We pick the different colors in the image from step 3 and separate every blob by detecting these colors. (We change the image to 128 * 128 size first.)
(The reason why we do not directly do the 4th step first is that a blob and the background can each have slightly different RGB values even where they look the same.
After step 4, we got the two separated blob images:
So we need to drop that kind of image (we may as well call them useless images), and then use all useful blob images to train and test the CNN.
And the contour color of a blob can be different from both the blob and the background, so we decided to process all these images using steps 1 to 3.)
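Step 4 can be sketched as follows, assuming the image from step 3 is an H x W x 3 array in which the background is black and every blob has one uniform color. The function names are hypothetical, and nearest-neighbor resizing is an assumption chosen here because it never blends two label colors into a new one.

```python
import numpy as np

def resize_nearest(image, size=128):
    """Nearest-neighbor resize of an H x W x 3 image to size x size."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

def split_blobs_by_color(labeled, size=128, background=(0, 0, 0)):
    """Return one size x size uint8 mask (255 = blob) per distinct non-background color."""
    small = resize_nearest(labeled, size)
    blobs = {}
    for color in np.unique(small.reshape(-1, 3), axis=0):
        key = tuple(int(v) for v in color)
        if key == tuple(background):
            continue
        mask = (small == color).all(axis=2)
        blobs[key] = mask.astype(np.uint8) * 255
    return blobs

# Two uniformly colored blobs on a black background.
labeled = np.zeros((256, 256, 3), dtype=np.uint8)
labeled[20:100, 20:100] = (255, 0, 0)
labeled[150:200, 150:240] = (0, 255, 0)
blobs = split_blobs_by_color(labeled)
```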
5. We then have all the separated blobs of each image, and we train a CNN using TensorFlow to classify the blobs.
Before step 5, we need to manually label all the blobs. More specific details of the labels are in the "Experiment" part.
Initially, we found that the accuracy of the CNN was around 30% to 50%, which is quite bad. So we decided to add more layers to the network to improve the prediction accuracy. We also set the kernel size to 3*3 to make the training more precise.
The final structure of the CNN we designed is:
1. We train our CNN with 1800 blob images produced by step 1 to step 4. We then use the remaining 114 blob images as test data for the CNN.
2. The CNN is trained for 10000 iterations; after more than 3000 iterations the CNN's predictions converge.
3. The accuracy measure we used for our CNN: total number of correct predictions / total number of predictions.
4. We take the time the CNN needs to produce predictions for the test images as our running time.
5. When training and testing our CNN, we define the labels of the blob shapes:
which means that we let the CNN perform a three-class classification task.
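The train/test split and the accuracy measure from the list above can be sketched as follows. The function names, the shuffling and the fixed seed are assumptions; how the 114 test images were actually selected is not specified here.

```python
import random

def split_train_test(samples, n_train=1800, seed=0):
    """Shuffle the samples reproducibly and split them into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def accuracy(true_labels, predicted_labels):
    """Number of correct predictions divided by the total number of predictions."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# 1914 placeholder sample ids stand in for the blob images.
train, test = split_train_test(range(1914))
demo_accuracy = accuracy([0, 1, 2, 1], [0, 1, 1, 1])
```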
The specific processes and results of getting blobs are shown in the "Method and Implementation" part.
Comparison of the findContours function with our own method
| Num of img | OpenCV findContours | Our own method |
You may notice that the contours drawn by our own method have no outer boundary. This is not a problem: afterwards, we draw a border of width 2 around the contour image using RGB(0, 0, 0) to guarantee that the background is filled successfully.
Here we show more results of our own contour-drawing algorithm:
| Num of img | Source Image | Result Image |
The accuracy of our CNN on the training blob images finally reaches 100%.
The accuracy of our CNN on the test blob images finally reaches approximately 98%.
The precision of the 3 classes is:
The training time of the CNN is 4 hours.
The strengths of our algorithm:
(1) High prediction accuracy and precision.
(2) The structure of the CNN is quite simple.
(3) The training and convergence of the CNN are quite fast.
(4) The pretreatment algorithms are quite fast. They can finish all of the pretreatment jobs in less than 10 minutes.
(5) The pretreatment algorithms can quite precisely pick out all the blobs in an image.
We expected to get the numbers of squares, circles and triangles and to know their locations. After the pretreatment work we successfully pick out each blob and know its location; using the CNN, we successfully get the numbers of squares, circles and triangles.
So, our algorithm is quite successful.
- In the future, we can gather all our code into one large blob-detection system to which we give the original color images containing different blobs. Moreover, we can let our algorithms detect more kinds of blobs (such as diamonds, ovals, etc.).
We provide some useful algorithms for detecting blobs in images using ML and OpenCV techniques.
Credits and Bibliography
(1) Qitong Wang took charge of all of the image pretreatment work (step 1 to step 4 in the "Method and Implementation" part).
(2) Kaihong Wang took charge of building and training the CNN.
(3) Yuankai He took charge of adding labels for all image data and programming the implementation code.
(4) Qitong, Kaihong and Yuankai often discussed this assignment together and exchanged opinions on data preprocessing and neural network training.