Assignment 2 Part 2

CS 585 HW 2
Zezhou Sun
Date: 09/20/2017


Problem Definition

Use a template matching algorithm to detect different types of hand shapes. The detection should run in real time.


Method and Implementation

First, use the skin detection method from lab 2 to find the skin regions in the raw image, then set all remaining background pixels to black. Apply the same method to the hand shape templates. The resulting images contain only skin-colored pixels; all other pixels are black. Note that the new images (including the templates) are still RGB images.
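The background removal step can be sketched in plain Python as follows. The skin rule in `is_skin` is an assumption (a commonly used RGB threshold rule); the rule actually used in lab 2 is not given in this report. The function names are hypothetical, and the image is represented as nested lists of (r, g, b) tuples only to keep the sketch dependency-free.

```python
def is_skin(r, g, b):
    """Assumed RGB skin rule; the lab 2 rule itself is not stated in the report."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def keep_skin_only(image):
    """Return a copy of image with every non-skin pixel set to black.

    image is a list of rows; each pixel is an (r, g, b) tuple of 0-255 ints.
    The output has the same shape and is still a three-channel image.
    """
    black = (0, 0, 0)
    return [[px if is_skin(*px) else black for px in row] for row in image]


# Tiny 2 x 2 example image (made-up pixel values).
frame = [[(200, 120, 90), (30, 60, 200)],
         [(255, 255, 255), (180, 100, 80)]]
masked = keep_skin_only(frame)
```

The same function would be applied to each template so that the raw image and the templates are preprocessed identically.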
Second, resize the hand shape templates. We only detect hand shapes at a certain distance from the camera, so a single template size is sufficient.
Then apply the template matching algorithm to the raw image read from the camera, using the background-removed templates. Each hand shape has its own template, so template matching runs 4 times and produces four result matrices. Because we use the correlation coefficient method, the values in each result matrix range from -1 to 1. The larger the value, the better the template matches that sub-image.
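As a rough illustration of what the correlation coefficient score measures, the following Python sketch computes it for a single-channel image stored as nested lists. It assumes the normalized variant of the method (consistent with values between -1 and 1) and is not the OpenCV function the program uses. The name `ccoeff_normed` is hypothetical, and a pure-Python loop like this is far too slow for real-time use.

```python
import math


def ccoeff_normed(image, template):
    """Return the normalized correlation coefficient map of template over image.

    Both arguments are 2-D lists of numbers (one channel). Entry [y][x] of
    the result scores the patch whose top-left corner is at (x, y).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    n = th * tw

    # Subtract the template mean once; it does not depend on the position.
    t_mean = sum(sum(row) for row in template) / n
    t_zero = [[v - t_mean for v in row] for row in template]
    t_norm = sum(v * v for row in t_zero for v in row)

    result = []
    for y in range(ih - th + 1):
        out_row = []
        for x in range(iw - tw + 1):
            patch = [image[y + dy][x:x + tw] for dy in range(th)]
            p_mean = sum(sum(row) for row in patch) / n
            num = 0.0
            p_norm = 0.0
            for dy in range(th):
                for dx in range(tw):
                    p = patch[dy][dx] - p_mean
                    num += p * t_zero[dy][dx]
                    p_norm += p * p
            denom = math.sqrt(p_norm * t_norm)
            # A flat patch or flat template has no defined correlation; use 0.
            out_row.append(num / denom if denom > 0 else 0.0)
        result.append(out_row)
    return result


# Made-up example: the template appears exactly at position (1, 1).
image = [[0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 3, 4, 0],
         [0, 0, 0, 0]]
template = [[1, 2],
            [3, 4]]
scores = ccoeff_normed(image, template)
```

Because both the patch and the template have their means subtracted, the score responds to the pattern of intensities rather than to overall brightness.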
The next step is to decide which type of hand shape the raw image contains. Take the highest value in each result matrix (the four result matrices are not normalized), then find the largest of these four values. If the largest value is greater than 0.5 (a manually set threshold), find the location of that maximum in the result matrix and the corresponding location in the raw image, and draw a rectangle of that template's size on the raw image. If the largest value is less than 0.5, the program reports that it cannot find a known hand shape in the raw image.
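The selection step described above can be sketched as follows. The function and variable names (`best_location`, `classify`, `template_sizes`) are hypothetical, and the sketch assumes that each position in a result matrix corresponds to the top-left corner of the template in the raw image, as is the case for OpenCV's template matching output. A score exactly equal to the threshold is treated here as no detection, which the text leaves unspecified.

```python
THRESHOLD = 0.5  # manually chosen threshold from the text


def best_location(result):
    """Return (max_value, (x, y)) for a 2-D result matrix."""
    best_val, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(result):
        for x, val in enumerate(row):
            if val > best_val:
                best_val, best_xy = val, (x, y)
    return best_val, best_xy


def classify(results, template_sizes, threshold=THRESHOLD):
    """Pick the best-matching shape, or return None if no score passes.

    results: dict mapping shape name -> result matrix
    template_sizes: dict mapping shape name -> (width, height)
    Returns (shape name, (left, top, right, bottom)) for the rectangle.
    """
    best = None
    for name, result in results.items():
        val, xy = best_location(result)
        if best is None or val > best[1]:
            best = (name, val, xy)
    if best is None:
        return None
    name, val, (x, y) = best
    if val <= threshold:
        return None  # no known hand shape found
    w, h = template_sizes[name]
    return name, (x, y, x + w, y + h)


# Made-up result matrices for two shapes.
results = {"closed": [[0.1, 0.2], [0.3, 0.9]],
           "open": [[0.4, 0.1]]}
sizes = {"closed": (40, 50), "open": (60, 60)}
detection = classify(results, sizes)
```

Comparing the raw maxima across templates works only because every score is on the same -1 to 1 scale.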


Experiments

Run the program and make different hand shapes in front of the camera.
For each test, hold the hand shape for at least 1 second, then either change it or keep it. Repeat the test 00 times and record the results. If any frame in a test is not recognized correctly, count that test as a false detection.
Build a confusion matrix from the recorded results. The matrix is 5 x 5: each column is the true hand shape of the test and each row is the detected hand shape, with one extra column and one extra row for the case where no hand shape exists. The detection rate is the number of frames the program can process in one second. Use the true positive rate and the false positive rate to describe the accuracy for each hand shape. The true positive rate is the number of correctly detected tests of a shape divided by the total number of tests of that shape. The false positive rate is the sum of the other cells in that shape's row (every cell except the correctly detected one) divided by the total number of tests of all other hand shapes. For hand shape i, which is at row i and column i of matrix M, TP = M(i, i)/sumofCol(i), FP = \sum_{j != i} M(i, j)/\sum_{j != i} sumofCol(j).
Running time is the reciprocal of the detection rate.
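The TP and FP formulas above can be checked with a short Python sketch. The 3 x 3 matrix below is made up for illustration and is not the experimental data; the function name `roc_point` is hypothetical.

```python
def roc_point(matrix, i):
    """Return (TP rate, FP rate) for class i.

    matrix[r][c] counts tests whose detected class is r and true class is c,
    so rows are detected shapes and columns are true shapes.
    """
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    tp = matrix[i][i] / col_sums[i]
    # Off-diagonal cells in row i: other shapes wrongly detected as shape i.
    false_hits = sum(matrix[i][j] for j in range(n) if j != i)
    other_total = sum(col_sums[j] for j in range(n) if j != i)
    fp = false_hits / other_total
    return tp, fp


# Hypothetical confusion matrix with 10 tests per true class.
M = [[8, 1, 2],
     [1, 9, 0],
     [1, 0, 8]]
tp, fp = roc_point(M, 0)  # tp = 8/10, fp = (1 + 2)/(10 + 10)
```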

Special requirements, instructions and explanations:
1. The hand shape in the captured raw image should have the same size as the templates.
2. The room lighting should be cold light. Warm light will affect the detection result.
3. A face visible to the camera may affect the result, especially when there is no hand shape in front of the camera.
4. The program uses the template matching function in OpenCV. A self-implemented one is too slow to use.
5. The program uses the correlation coefficient method for template matching.
6. If the room is so dark that skin color cannot be detected, the detection result will be affected.


Results

                    True Shape 1    True Shape 2    True Shape 3    True Shape 4    True Not Exist
Detected Shape 1    12              2               1               1               5
Detected Shape 2    0               10              0               0               0
Detected Shape 3    2               1               13              2               2
Detected Shape 4    1               2               1               12              1
Detected Not Exist  0               0               0               1               7

Shape 1 ROC coord = (0.80, 0.15)
Shape 2 ROC coord = (0.67, 0.00)
Shape 3 ROC coord = (0.87, 0.12)
Shape 4 ROC coord = (0.80, 0.08)
Not Exist ROC coord = (0.47, 0.02)

Frames per second = 2 (i7-6550U, 2.2 GHz)
Estimated running time = 0.5 s per cycle
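One way a frame rate and per-cycle running time like these could be measured is sketched below. This is an assumption about the measurement procedure, which the report does not describe; `measure_fps` and `process_frame` are hypothetical names, and the sleeping stand-in only simulates a detection cycle.

```python
import time


def measure_fps(process_frame, frames):
    """Return (frames per second, seconds per frame) for process_frame."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0 or elapsed <= 0:
        raise ValueError("need at least one frame and a measurable duration")
    fps = count / elapsed
    # Running time per cycle is the reciprocal of the detection rate.
    return fps, 1.0 / fps


# Stand-in for one detection cycle: sleep 10 ms per "frame".
fps, seconds_per_frame = measure_fps(lambda frame: time.sleep(0.01), range(5))
```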

Results

Trial Real-time Image
Shape 1 - Hand Closed
Shape 2 - Hand Opened
Shape 3 - Thumb Down
Shape 4 - Thumb Up
Report Terminal


Discussion

This method is generally successful but still has some limitations. I would like the program to find my hand at an arbitrary distance from the camera and to locate it more precisely, even under poor conditions (warm lighting and many skin-colored objects in the background). The speed of the method should also be improved; the current version is too slow.


Conclusions

This method is generally successful, but it fails to detect rotated hands and requires the hand to be at a certain distance from the camera.
Combined with other techniques, it could be implemented better.


Credits and Bibliography

  • http://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
  • Lecture 2, template matching
  • Lab 2, video computing introduction.