Gesture recognition

CS 585 HW 2
Yaan Tzi Kan
Vivian Gunawan
Beatrice Tanaga

Problem Definition

Design and implement algorithms that recognize hand shapes (such as making a fist, thumbs up, thumbs down, pointing with an index finger, etc.) or gestures (such as waving with one or both hands, swinging, drawing something in the air, etc.) and create a graphical display that responds to the recognition of the hand shapes or gestures. Your algorithm should detect at least four different hand shapes or gestures. You must use skin-color detection and binary image analysis (e.g. centroids, orientation, etc.) to distinguish hand shapes or gestures.

Method and Implementation


  1. We took pictures of our hands making each gesture.
  2. We used the skin-color detection algorithm given in lab 3 to get binarized images of the gestures. (We found that lowering the red threshold from 90 to 70 gave a cleaner image with fewer artifacts.)
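
As a rough illustration of the thresholding step above, here is a minimal sketch of an RGB skin-color rule in NumPy. Only the red threshold (70 versus 90) comes from our report; the other conditions are an assumed, commonly used RGB skin rule and may differ from the lab 3 algorithm. The function name `skin_mask_rgb` and the demo pixels are hypothetical, and the channel order is assumed to be RGB (OpenCV loads images as BGR, so the channels would need reordering first).

```python
import numpy as np

def skin_mask_rgb(image, red_min=70):
    """Return a binary (0/255) mask of likely skin pixels in an RGB image.

    Only red_min=70 comes from the report; the remaining conditions are an
    assumed, commonly used RGB skin rule, not necessarily the lab 3 one.
    """
    rgb = image.astype(np.int32)  # avoid uint8 wrap-around in subtraction
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    spread = rgb.max(axis=-1) - rgb.min(axis=-1)
    mask = (
        (r > red_min) & (g > 40) & (b > 20)
        & (spread > 15)
        & (np.abs(r - g) > 15)
        & (r > g) & (r > b)
    )
    return mask.astype(np.uint8) * 255

# Hypothetical 1x3 image: a bright skin-like pixel, a dim skin-like pixel,
# and a blue pixel.
demo = np.array([[[200, 120, 90], [80, 50, 30], [20, 40, 200]]], dtype=np.uint8)
mask_70 = skin_mask_rgb(demo, red_min=70)
mask_90 = skin_mask_rgb(demo, red_min=90)
```

In this toy example the dim pixel (red value 80) is kept with the threshold at 70 and dropped at 90, which is the kind of difference the lower threshold makes on darker parts of the hand.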

  1. From our video capture, we first convert each frame to HSV, filter for skin color, and binarize the image. (We found HSV filtering quicker and less computationally intensive than the original skin detection algorithm in RGB.)
  2. We then use horizontal and vertical projections of the binary frame to get a bounding box for the hand.
  3. We crop the frame to the bounding box and run template matching on the crop against all the templates we made.
  4. We take the template with the highest correlation to the crop as the recognized gesture.
  5. We draw the bounding box and the gesture label on the frame.
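
Steps 2 to 4 above can be sketched roughly as follows, assuming the frame has already been binarized. This is not our actual OpenCV code: it uses plain NumPy, a zero-mean normalized cross-correlation as the similarity score, and a nearest-neighbour resize so that the crop and each template have the same size. All function names and the tiny 4x4 "templates" are hypothetical stand-ins for the real binarized hand images.

```python
import numpy as np

def bounding_box(mask):
    """Bounding box (top, bottom, left, right) from row/column projections.

    bottom and right are exclusive. Returns None if the mask is empty.
    """
    rows = np.flatnonzero(mask.sum(axis=1))  # horizontal projection
    cols = np.flatnonzero(mask.sum(axis=0))  # vertical projection
    if rows.size == 0 or cols.size == 0:
        return None
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

def resize_nearest(img, shape):
    """Nearest-neighbour resize so the crop and template share one shape."""
    r = np.arange(shape[0]) * img.shape[0] // shape[0]
    c = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(r, c)]

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two same-sized images."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def classify(mask, templates):
    """Return (label, score, box) for the best-matching template."""
    box = bounding_box(mask)
    if box is None:
        return None, 0.0, None
    top, bottom, left, right = box
    crop = mask[top:bottom, left:right]
    scores = {name: ncc(resize_nearest(crop, t.shape), t)
              for name, t in templates.items()}
    best = max(scores, key=scores.get)  # highest correlation wins
    return best, scores[best], box

# Hypothetical toy templates (not real hand shapes).
templates = {
    "fist": np.array([[1, 1, 1, 1], [1, 0, 0, 1],
                      [1, 0, 0, 1], [1, 1, 1, 1]], dtype=np.uint8) * 255,
    "peace": np.array([[1, 0, 0, 1], [1, 0, 0, 1],
                       [1, 1, 1, 1], [0, 1, 1, 0]], dtype=np.uint8) * 255,
}
frame = np.zeros((8, 8), dtype=np.uint8)
frame[2:6, 3:7] = templates["peace"]  # place one shape in an empty frame
label, score, box = classify(frame, templates)
```

Because the score compares pixels position by position, a rotated hand lines up poorly with an upright template, so rotation would be expected to lower the correlation in a scheme like this.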

In our


Describe your experiments, including the number of tests that you performed, and the relevant parameter values.

Define your evaluation metrics, e.g., detection rates, accuracy, running time.


Confusion Matrix

Predicted/True    Fist    Open Hand    Peace    Okay
Open Hand         1       0            0        1


Trial                   Template Image    Result Image
Gesture 1: fist
Gesture 2: okay
Gesture 3: peace
Gesture 4: open palm


Discuss your method and results: I feel our results were generally okay. However, the algorithm struggled to identify the gesture correctly if the hand was even slightly rotated. Although we did not flip our templates, we had surprisingly good results when using the other hand (left/right). Our skin detection method left a bit more unwanted artifacts than I had hoped for, but given our experience with OpenCV I think we did alright. Given more time, I would have liked to try other methods of fist detection, such as getting the contour of the fist and using that for a more accurate bounding box. I would also have liked to test whether eroding both the template and the pixels in the bounding box would make template matching more accurate, and whether other template matching methods would make a big difference.
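
The erosion idea mentioned above was not something we implemented. As a minimal sketch of what it could look like, the following applies binary erosion with a 3x3 square structuring element in NumPy to a mask. The function name `erode`, the choice of a 3x3 element, and treating pixels outside the image as off are all assumptions made for illustration.

```python
import numpy as np

def erode(mask, iterations=1):
    """Binary erosion with a 3x3 square structuring element.

    A pixel stays on only if it and all 8 neighbours are on; pixels
    outside the image count as off (an assumption of this sketch).
    """
    out = mask > 0
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        eroded = np.ones_like(out)
        for dy in range(3):
            for dx in range(3):
                eroded &= padded[dy:dy + h, dx:dx + w]
        out = eroded
    return out.astype(np.uint8) * 255

# Hypothetical mask: a 5x5 blob plus one isolated noise pixel.
noisy = np.zeros((7, 7), dtype=np.uint8)
noisy[1:6, 1:6] = 255
noisy[0, 6] = 255
cleaned = erode(noisy)
```

Applying the same erosion to both the templates and the crop would remove small specks like the one above, at the cost of shrinking the hand region by one pixel on each side per iteration.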


Template matching is a good means of object/gesture recognition, but it is not simple! There are many ways in which templates can be matched, and they do not all behave the same. I also learnt that there are many ways in which skin color detection can be done, including thresholding in RGB and thresholding in HSV. Template matching can also benefit from better supporting methods, such as better skin color detection and better bounding box creation.

Credits and Bibliography (date of access: 02/12/2020)

Vivian Gunawan, Beatrice Tanaga