# Assignment Title

CS 585 HW 2
Yicheng Li
Zilin Zhang
Fangya Xu
Date Feb. 2020

## Problem Definition

Design and implement algorithms that recognize hand shapes (such as making a fist, thumbs up, thumbs down, pointing with an index finger etc.) or gestures (such as waving with one or both hands, swinging, drawing something in the air etc.) and create a graphical display that responds to the recognition of the hand shapes or gestures.

## Method and Implementation

1. Read a new frame from the video stream and detect the skin-colored regions in it.

2. Convert the image to greyscale, then turn it into a binary image.

3. Label the connected regions of the template picture with a seed-filling algorithm, then crop the largest region to use as the template.

4. Scan the original image with a sliding window of the template's size (once every two seconds).

5. Count the pixels that differ between the current window of the original image and the template, and calculate the error rate.

6. Draw a rectangular box in the original image around the window with the smallest error rate, provided that rate is at most 0.2.
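Steps 2 to 6 above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not our submitted code: images are nested lists rather than camera frames, the binarization threshold of 128 and the use of 4-connectivity are assumptions, the error rate is assumed to be the fraction of differing pixels, and all function names (`binarize`, `largest_region`, `error_rate`, `best_match`) are hypothetical. Only the 0.2 limit comes from the method itself.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Step 2: turn a greyscale image (rows of 0-255 ints) into a 0/1 image."""
    # The threshold value is an assumption made for this sketch.
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def largest_region(binary):
    """Step 3: seed-fill every 4-connected region and crop the largest one."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    best = []  # pixels of the largest region found so far
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Grow a new region outwards from this seed pixel.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    if not best:
        return []  # no foreground pixels at all
    top = min(y for y, _ in best)
    left = min(x for _, x in best)
    bottom = max(y for y, _ in best)
    right = max(x for _, x in best)
    # Copy only the pixels of the largest region into its bounding box.
    crop = [[0] * (right - left + 1) for _ in range(bottom - top + 1)]
    for y, x in best:
        crop[y - top][x - left] = 1
    return crop


def error_rate(window, template):
    """Step 5: fraction of pixels that differ between two same-sized images."""
    differing = sum(w != t
                    for w_row, t_row in zip(window, template)
                    for w, t in zip(w_row, t_row))
    return differing / (len(template) * len(template[0]))


def best_match(image, template, max_error=0.2):
    """Steps 4 and 6: slide the template; return (x, y, w, h, error) or None."""
    th, tw = len(template), len(template[0])
    best = None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            window = [row[x:x + tw] for row in image[y:y + th]]
            err = error_rate(window, template)
            # Keep the window with the smallest error that is within the limit.
            if err <= max_error and (best is None or err < best[4]):
                best = (x, y, tw, th, err)
    return best
```

With hypothetical inputs `frame_gray` and `template_gray`, a call such as `best_match(binarize(frame_gray), largest_region(binarize(template_gray)))` would give the box to draw, or `None` when no window stays within the 0.2 limit. The exhaustive scan visits every window position, which is consistent with the high time cost we observed.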

## Results

| Hand Shape | Result Image |
| --- | --- |
| Thumbs up | [image] |
| Yeah | [image] |
| Handshake | [image] |
| OK | [image] |

## Discussion

• What are the strengths and weaknesses of your method?
• Because we use template matching to recognize gestures, an obvious strength is that the error rate of our results is very low. Only in a few situations, such as a low-light environment, does the template matching algorithm perform badly. The main weakness is the time cost. It is so high that we can only search for the gesture once every two seconds; if we ran the search every second, the window would not be able to display the video. We therefore reduced the search frequency.
• How do the graphics respond to different hand shapes and/or gestures?
• Because we use template matching, when the search window covers a different gesture or the background, the number of matching pixels is low. The error rate then exceeds the threshold, so that area is not selected.
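The reduced search frequency described in the first answer could be sketched as a small wrapper that runs the expensive search at most once per interval and reuses the last result for the frames in between. This is a minimal sketch, not our actual code: the class name `ThrottledMatcher`, its parameters, and the injectable `clock` are all hypothetical, and `matcher` stands for whatever function performs the template search.

```python
import time


class ThrottledMatcher:
    """Run an expensive matcher at most once every `interval` seconds."""

    def __init__(self, matcher, interval=2.0, clock=time.monotonic):
        self.matcher = matcher    # hypothetical function: frame -> result
        self.interval = interval  # two seconds, as in our experiments
        self.clock = clock        # injectable so the timing can be tested
        self.last_run = None      # time of the last real search
        self.last_result = None   # result reused for in-between frames

    def __call__(self, frame):
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.interval:
            # Enough time has passed: do a real search on this frame.
            self.last_result = self.matcher(frame)
            self.last_run = now
        return self.last_result
```

Every frame can then be displayed immediately, with the most recent box drawn on top, while the search itself only runs on some of the frames.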

## Conclusions

• First, we found that our algorithm is strongly affected by several factors, some of which puzzled us, for example the lighting conditions and the differences between cameras. This raises the question of how to design algorithms that work across different cameras, since we cannot know the condition of the camera and the environment in advance.
• Second, we found that the choice of threshold is very important. If the threshold is too low, the algorithm cannot recognize gestures because the environment is constantly changing. If the threshold is too high, irrelevant objects in the background, such as a head, may be recognized as a fist.