Problem Definition
The goal is to build a program with OpenCV that uses template matching and other basic computer vision techniques to recognize four different hand shapes in real time.
Method and Implementation
Steps:
- Background subtraction is applied to make the program less sensitive to background noise.
- Skin color is detected using a range of RGB values obtained through experiments.
- Thresholding binarizes the image so that only the detected skin pixels remain.
- The same set of transformations is applied to each template image.
- Each template image is rescaled over a range of scaling factors, and every scaled copy is compared with the source frame.
- The template with the highest normalized cross-correlation (NCC) with the scene is reported.
The following OpenCV methods were used:
- minMaxLoc: to find the maximum value of a matrix and its location
- threshold: to binarize an image with a given threshold
- rectangle, putText: to draw a bounding box and a label over the detected gesture
- matchTemplate: to calculate the cross-correlation of two images
Observations:
- In our experiments, accuracy was better when the background was plain.
- Similar gestures, such as peace and palm, were sometimes mistaken for each other.
- Rather than looping through different scaling factors when matching the template, we could extract the area of the palm and scale the template accordingly, which should make the program run much faster.
Experiments
These are the four different templates used:
Name | Template |
---|---|
Palm | ![]() |
Gun | ![]() |
L | ![]() |
Peace | ![]() |
Results
Here are several recognition results of all four hand shapes:
Hand Shape Name | Result |
---|---|
Palm | ![]() |
Gun | ![]() |
L | ![]() |
Peace | ![]() |
The following confusion matrix was obtained by repeatedly changing the hand gesture and recording the recognized shape:
Hand Shape | Palm | Gun | L | Peace |
---|---|---|---|---|
Palm | 6 | 0 | 0 | 1 |
Gun | 0 | 6 | 0 | 0 |
L | 0 | 2 | 5 | 2 |
Peace | 2 | 0 | 1 | 5 |
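
As a quick way to summarize the matrix above, the short Python sketch below computes the overall accuracy and a per-row correct rate. The names `overall_accuracy` and `per_row_rate` are hypothetical helpers. The overall figure does not depend on orientation, but the per-row rates assume that each row is the gesture actually shown and each column is the recognized shape, which the table itself does not state.

```python
labels = ["Palm", "Gun", "L", "Peace"]

# Counts copied from the confusion matrix above.
confusion = [
    [6, 0, 0, 1],  # Palm
    [0, 6, 0, 0],  # Gun
    [0, 2, 5, 2],  # L
    [2, 0, 1, 5],  # Peace
]


def overall_accuracy(matrix):
    """Fraction of all trials that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_row_rate(matrix, names):
    """Diagonal count divided by the row total, for each row."""
    return {name: row[i] / sum(row)
            for i, (name, row) in enumerate(zip(names, matrix))}


print(f"overall accuracy: {overall_accuracy(confusion):.3f}")  # 22 of 30 -> 0.733
for name, rate in per_row_rate(confusion, labels).items():
    print(f"{name}: {rate:.3f}")
```

Under that orientation assumption, Gun is never confused with another shape, while L has the lowest rate of the four.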