Problem Definition
Use template matching and other basic computer vision techniques to build an OpenCV program that recognizes four different hand shapes in real time.
Method and Implementation
Steps:
- To make the image robust to background noise, used background subtraction: the first frame is taken as the background and subtracted from each subsequent frame
- Converted the image to grayscale, then experimented with different value ranges to find the pixel values that correspond to skin color
- Used thresholding to binarize the image so that only the detected skin pixels remain
- Applied the same set of transformations to the template images
- Scaled each template image through a range of scaling factors and compared it with the source frame
- Reported the template with the highest normalized cross-correlation (NCC) with the scene
The following OpenCV methods were used:
- matchTemplate: to calculate the cross-correlation of two images
- minMaxLoc: to find the maximum value (and its index) of a matrix
- threshold: to binarize an image with a given threshold
- rectangle, putText: to draw a bounding box and label over the detected gesture

Observations:
- Because we search over a range of scaling factors when matching the template, the program could run much faster if we first extracted the area of the palm and scaled the template accordingly.
- In our experiments, accuracy was better when the background was kept static.
- Similar gestures, such as peace and palm, were sometimes confused with each other.
Experiments
These are the four different templates used:
Name | Template |
---|---|
Palm | ![]() |
Gun | ![]() |
L | ![]() |
Peace | ![]() |
Results
Here are several recognition results of all four hand shapes:
Hand Shape Name | Result |
---|---|
Palm | ![]() |
Gun | ![]() |
L | ![]() |
Peace | ![]() |
The following confusion matrix was obtained by changing the hand gestures:
Hand Shape | Palm | Gun | L | Peace |
---|---|---|---|---|
Palm | 8 | 0 | 0 | 2 |
Gun | 0 | 5 | 0 | 0 |
L | 0 | 2 | 5 | 1 |
Peace | 2 | 0 | 1 | 5 |
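
For reference, summary figures can be derived from the confusion matrix with a few lines of Python. The sketch below assumes that each row corresponds to the gesture that was shown and each column to the gesture that was reported; the table does not state this orientation, so the per-row rates should be read with that caveat. The overall accuracy does not depend on the orientation. The helper names are hypothetical.

```python
labels = ["Palm", "Gun", "L", "Peace"]

# Counts copied from the confusion matrix above
# (assumed: rows = gesture shown, columns = gesture reported).
confusion = [
    [8, 0, 0, 2],
    [0, 5, 0, 0],
    [0, 2, 5, 1],
    [2, 0, 1, 5],
]


def overall_accuracy(matrix):
    """Fraction of all trials that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_row_rate(matrix):
    """Diagonal count divided by the row total, one value per row."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


if __name__ == "__main__":
    print(f"overall accuracy: {overall_accuracy(confusion):.3f}")
    for name, rate in zip(labels, per_row_rate(confusion)):
        print(f"{name}: {rate:.3f}")
```

With these counts, 23 of the 31 trials lie on the diagonal, an overall accuracy of about 0.742. The off-diagonal entries in the Palm and Peace rows reflect the confusion between those two gestures.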