Image and Video Computing

CS 585 HW 2
Aaron Jacob Varghese
12th February

Problem Definition

The main focus of this homework is to design and implement algorithms that recognize hand shapes or gestures, and create a graphical display that responds to the recognition of the hand shapes or gestures.

Method and Implementation

The first step is to gather templates for the hand shapes we would like to detect, either by photographing them or by finding binary images of those hand shapes online. The hand shapes are then recognized by template matching in the three steps below; item 4 lists the templates used.

  1. Read Frames

    Read and display the video frames from a webcam. We capture the frames using OpenCV.
  2. Skin Detection

    We threshold each input RGB frame against estimated skin-color values. We then build an elliptical structuring element and apply the morphological operations erosion and dilation to the thresholded frame, followed by a Gaussian blur. This gives us the skin mask, which is used in the later template matching step to detect the hand shape in an input frame. To threshold the input video frames and segment the hand precisely, trackbars were implemented to tune the HSV values to the skin color.
  3. Template Matching

    The stored templates are then matched against our binary frames using the OpenCV matchTemplate function:
    void matchTemplate(InputArray image, InputArray templ, OutputArray result, int method)

    image – Image where the search is running. It must be 8-bit or 32-bit floating-point.
    templ – Searched template. It must be not greater than the source image and have the same data type.
    result – Map of comparison results. It must be single-channel 32-bit floating-point. If image is W × H and templ is w × h, then result is (W-w+1) × (H-h+1).
    method – Parameter specifying the comparison method (see below).

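    The function slides through image, compares the overlapped patches of size w × h against templ using the specified method, and stores the comparison results in result. In the formula below, I denotes the image, T the template, and R the result. The summations run over the template and/or the image patch: x' = 0...w-1, y' = 0...h-1.

    The method we used is CV_TM_CCOEFF_NORMED, which computes

    R(x,y) = \frac{\sum_{x',y'} T'(x',y') \cdot I'(x+x',y+y')}{\sqrt{\sum_{x',y'} T'(x',y')^2 \cdot \sum_{x',y'} I'(x+x',y+y')^2}}

    where the template and the image patch are each shifted to zero mean:

    T'(x',y') = T(x',y') - \frac{1}{w \cdot h} \sum_{x'',y''} T(x'',y'')
    I'(x+x',y+y') = I(x+x',y+y') - \frac{1}{w \cdot h} \sum_{x'',y''} I(x+x'',y+y'')

    R lies in [-1, 1], and values close to 1 indicate a strong match.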
  4. Templates

    The following are the templates used in this assignment:
    1. Open Palm
    2. Yo
    3. Fist
    4. Peace
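
The skin detection step (step 2 above) can be written out without OpenCV to make the arithmetic explicit. The following is a minimal pure-Python sketch, not the code we submitted. It assumes the thresholding is done in HSV space, the bounds SKIN_LOW and SKIN_HIGH are hypothetical placeholders for the values tuned with the trackbars, it uses a 3 × 3 square neighbourhood instead of an elliptical kernel, and it leaves out the Gaussian blur.

```python
import colorsys

# Hypothetical HSV bounds for skin, each component in [0, 1].
# Real values depend on the person and the lighting.
SKIN_LOW = (0.0, 0.2, 0.35)
SKIN_HIGH = (0.14, 0.7, 1.0)


def skin_mask(frame, low=SKIN_LOW, high=SKIN_HIGH):
    """Return a 0/1 mask marking pixels whose HSV value lies within [low, high].

    `frame` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    """
    mask = []
    for row in frame:
        mask_row = []
        for r, g, b in row:
            hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = all(lo <= c <= hi for c, lo, hi in zip(hsv, low, high))
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


def _morph(mask, combine):
    """Apply `combine` (min or max) over each 3x3 neighbourhood, clipped at the border."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = combine(window)
    return out


def erode(mask):
    """A pixel survives only if its whole neighbourhood is set."""
    return _morph(mask, min)


def dilate(mask):
    """A pixel is set if any pixel in its neighbourhood is set."""
    return _morph(mask, max)


def clean_skin_mask(frame):
    """Threshold, then erode and dilate to remove isolated false positives."""
    return dilate(erode(skin_mask(frame)))
```

Eroding before dilating removes small isolated skin-colored specks while restoring the size of larger regions such as the hand.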


Hand Shape: Palm, Fist, Yo, Peace
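
Choosing among the hand shapes amounts to scoring the frame against every template and keeping the best one. Below is a minimal pure-Python sketch of that idea, assuming the normalized correlation coefficient as the score. The functions ccoeff_normed, best_match and classify, the toy 3 × 3 patterns, and the convention of returning 0 for a flat patch are all illustrative assumptions, not part of OpenCV or of our submitted code.

```python
import math


def ccoeff_normed(patch, templ):
    """Normalized correlation coefficient between two equally sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in templ for v in row]
    p_mean = sum(p) / len(p)
    t_mean = sum(t) / len(t)
    num = sum((a - p_mean) * (b - t_mean) for a, b in zip(p, t))
    den = math.sqrt(
        sum((a - p_mean) ** 2 for a in p) * sum((b - t_mean) ** 2 for b in t)
    )
    # Assumed convention for this sketch: a flat patch or template scores 0.
    return num / den if den else 0.0


def best_match(image, templ):
    """Slide templ over image; return (best score, (x, y) of the top-left corner)."""
    img_h, img_w = len(image), len(image[0])
    h, w = len(templ), len(templ[0])
    best_score, best_loc = float("-inf"), None
    for y in range(img_h - h + 1):
        for x in range(img_w - w + 1):
            patch = [row[x:x + w] for row in image[y:y + h]]
            score = ccoeff_normed(patch, templ)
            if score > best_score:
                best_score, best_loc = score, (x, y)
    return best_score, best_loc


def classify(image, templates):
    """Return (label, score, location) for the template with the highest score."""
    results = {label: best_match(image, t) for label, t in templates.items()}
    label = max(results, key=lambda k: results[k][0])
    return (label,) + results[label]


# Toy binary patterns standing in for real hand-shape templates.
templates = {
    "plus": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "bar": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
label, score, location = classify(image, templates)
```

In practice a minimum score would also be needed, so that a frame containing none of the hand shapes is not forced into one of the labels.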


Template matching is simple yet very effective. We implemented the assignment in Python as well as C++, only to realize that Python runs very slowly, especially for a time-consuming algorithm like template matching. The accuracy of skin detection varies with the person and the environment, which is the only drawback of this approach. With template matching we can go further and detect hand gestures by building templates from the motion energy or frame difference of the required gesture.
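
As a rough illustration of the frame difference and motion energy mentioned above, here is a small pure-Python sketch. It assumes grayscale frames stored as lists of rows, a hypothetical threshold of 30 intensity levels, and one common definition of motion energy, namely the union of the difference masks of consecutive frames. The function names are ours.

```python
def frame_difference(prev, curr, threshold=30):
    """Binary motion mask: 1 where the intensity changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_energy(frames, threshold=30):
    """OR together the difference masks of all consecutive frame pairs."""
    h, w = len(frames[0]), len(frames[0][0])
    energy = [[0] * w for _ in range(h)]
    for prev, curr in zip(frames, frames[1:]):
        diff = frame_difference(prev, curr, threshold)
        for y in range(h):
            for x in range(w):
                energy[y][x] |= diff[y][x]
    return energy
```

The resulting binary image could then be matched against a stored gesture template in the same way as a single-frame skin mask.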


We have successfully implemented hand shape recognition for the given templates and gained the confidence to work with more complex gestures and shapes.

Credits and Bibliography

Teammate - Shivam Satwah
OpenCV Template Matching