Hand Gesture Recognition: Counting Fingers in Real Time

CS 585 Final Project
Annan Miao
05 May 2020

Problem Definition

The paper "Counting Fingers in Real Time Using Computer-Vision Techniques" (2004) introduces methods for hand detection and tracking. The goal of this project is to reimplement the paper's approach and try to obtain results equivalent to those it reports. The algorithm identifies the hand and counts the number of raised fingers under a wide range of lighting and background conditions.

Method and Implementation

The algorithm for the Finger Counter interface consists of:
1. Segmentation
We try two methods: the skin detection method from class and the background removal method described in the paper. For skin detection, we use a range of RGB values to represent skin colors. For background removal, we use createBackgroundSubtractorMOG2() and other image processing functions.
2. Feature Extraction
After segmentation we obtain a binary image of a single hand. We then find and draw the contour of the largest connected component in the binary image using the findContours() and drawContours() functions.
Once the contour is drawn, we compute its convex hull and detect the convexity defects of the hand shape using the convexHull() and convexityDefects() functions. As shown below, the red line represents the convex hull and the arrows represent convexity defects. If the angle at a convexity defect is less than 90 degrees, we count the defect as indicating a finger.
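The 90-degree test on convexity defects can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the helper names `defect_angle_deg` and `count_narrow_defects` are hypothetical, and the sketch assumes each defect has already been converted from the index form returned by convexityDefects() into (start, end, far) pixel coordinates. The angle is measured at the far point using the law of cosines.

```python
import math

def defect_angle_deg(start, end, far):
    """Angle in degrees at `far`, between the segments far-start and far-end."""
    a = math.hypot(end[0] - start[0], end[1] - start[1])   # side opposite `far`
    b = math.hypot(start[0] - far[0], start[1] - far[1])
    c = math.hypot(end[0] - far[0], end[1] - far[1])
    if b == 0 or c == 0:
        # Degenerate defect: report a flat angle so it is never counted.
        return 180.0
    cos_angle = (b * b + c * c - a * a) / (2 * b * c)       # law of cosines
    cos_angle = max(-1.0, min(1.0, cos_angle))              # guard against rounding
    return math.degrees(math.acos(cos_angle))

def count_narrow_defects(defects, max_angle_deg=90.0):
    """Count defects whose angle is below the threshold (90 degrees in the report)."""
    return sum(
        1 for start, end, far in defects
        if defect_angle_deg(start, end, far) < max_angle_deg
    )
```

How the number of accepted defects maps to a finger count (for example, defects plus one, since each defect is a valley between two neighboring fingers) is an assumption left to the caller.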

The paper also provides a method that uses polar coordinates to determine the number of fingers, as shown below. This method has not yet been implemented or compared with the convexity-defect method.
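As a starting point for that future work, the sketch below shows one plausible way to use polar coordinates for counting. It is an assumption about how such a method could look, not the paper's procedure: the function names `radial_profile` and `count_protrusions`, the choice of reference point, and the 0.75 threshold fraction are all hypothetical. The contour is expected as (x, y) points in the order they appear along the outline, which is how findContours() returns them.

```python
import math

def radial_profile(contour, center):
    """Convert ordered contour points to (radius, angle) pairs around `center`."""
    cx, cy = center
    return [
        (math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
        for x, y in contour
    ]

def count_protrusions(profile, frac=0.75):
    """Count separate runs of contour points whose radius exceeds
    `frac` times the largest radius; each run is taken as one finger."""
    radii = [r for r, _ in profile]
    if not radii:
        return 0
    threshold = frac * max(radii)
    above = [r > threshold for r in radii]
    # Count rising edges; index i - 1 wraps around at i = 0 so that a run
    # spanning the start of the list is counted once.
    return sum(1 for i in range(len(above)) if above[i] and not above[i - 1])
```

A real hand outline also includes the wrist and forearm, which this sketch does not handle, so the result would need to be compared carefully against the convexity-defect count.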


Below are the outputs of the skin detection and background removal segmentation methods.

Below are the binary images and the contours of different gestures:

We can see that the two methods give comparably good outputs. For background removal, we apply erosion to remove noise. For skin detection, the color test alone gives a reasonable output. The tradeoff is that the skin detection algorithm takes relatively longer to run.
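For reference, an RGB-range skin test of the kind used in the segmentation step can be sketched as follows. The thresholds here are an assumption: they follow a commonly cited rule of thumb for skin under daylight and are not necessarily the values used in class or in this project. The names `is_skin_rgb` and `skin_mask` are hypothetical, and the image is represented as plain nested lists of (r, g, b) tuples to keep the example self-contained.

```python
def is_skin_rgb(r, g, b):
    """Rule-of-thumb skin test on one pixel (assumed thresholds, RGB order)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15   # enough spread between channels
        and abs(r - g) > 15                    # red clearly differs from green
        and r > g and r > b                    # red is the dominant channel
    )

def skin_mask(image_rgb):
    """Map rows of (r, g, b) pixels to rows of 255 (skin) or 0 (not skin)."""
    return [
        [255 if is_skin_rgb(*pixel) else 0 for pixel in row]
        for row in image_rgb
    ]
```

Note that OpenCV loads frames in BGR order, so the channels would need to be reordered before applying a test written for RGB.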


Below is the confusion matrix for the classification of the number of fingers.

Confusion Matrix (columns: true class; rows: predicted class)

                 True Class
Predicted     1     2     3     4     5
    1         8     0     0     0     0
    2         2     9     0     0     0
    3         0     1     8     0     0
    4         0     0     2     8     0
    5         0     0     0     2    10
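The overall accuracy implied by the matrix above can be checked with a short calculation. The sketch assumes that columns are the true class (as the matrix header states) and rows are the predicted class; the function names are illustrative only.

```python
# Confusion matrix copied from the report: rows = predicted, columns = true class.
confusion = [
    [8, 0, 0, 0, 0],
    [2, 9, 0, 0, 0],
    [0, 1, 8, 0, 0],
    [0, 0, 2, 8, 0],
    [0, 0, 0, 2, 10],
]

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix):
    """For each true class (column), the fraction predicted correctly."""
    n = len(matrix)
    recalls = []
    for j in range(n):
        column_total = sum(matrix[i][j] for i in range(n))
        recalls.append(matrix[j][j] / column_total if column_total else 0.0)
    return recalls
```

The diagonal sums to 43 out of 50 samples, which gives the 86% accuracy reported for these tests, and every true class has 10 samples with at least 8 classified correctly.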

The tests were performed in my room under ambient natural light, with an overall accuracy of 86% (43 of 50 gestures classified correctly). Compared with the paper's results, shown below, the reimplemented algorithm achieves comparable performance.


The reimplemented method achieves high accuracy for all classes, correctly recognizing at least 8 of 10 gestures in every class. Some misclassifications occur because the background color (in this case a bright white wall) can be similar to skin color, which degrades the segmentation. With a background clearly distinct from skin color, we expect the system would achieve accuracy close to 100%.
The input images are all real-time images of the user's hand; the algorithm could also be tested on a dataset of hand segmentations.