Assignment 1

CS 585 HW 1
Caroline Ferris
January 28, 2020


Problem Definition

This problem asks you to take an image of your face as input, manipulate it in three ways, and output each processed image. First, you are asked to create a grayscale version of the color image. Second, you are asked to flip the image horizontally. Third, you are asked to manipulate the image in any way that produces an interesting output.


Method and Implementation

The first part of this problem was to create a grayscale image of my face. To do this, I went through the image row by row and, for each pixel, averaged its blue, green, and red values (OpenCV stores pixels in BGR order) to get a gray value. I then set all three channels of that pixel equal to the calculated average. Applying this to every pixel converts the whole image to grayscale.
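For illustration, here is a minimal sketch of this averaging approach (not my exact submission); it assumes the image is loaded as an 8-bit, 3-channel BGR cv::Mat, and the file names are placeholders.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of the averaging approach described above. Assumes "face.jpg" is a
// 3-channel color image; the file names are placeholders.
int main() {
    cv::Mat src = cv::imread("face.jpg");   // loaded as 8-bit BGR
    cv::Mat gray = src.clone();             // same size and type as the source

    for (int r = 0; r < src.rows; r++) {
        for (int c = 0; c < src.cols; c++) {
            cv::Vec3b px = src.at<cv::Vec3b>(r, c);
            // Average the blue, green, and red values of this pixel.
            uchar avg = static_cast<uchar>((px[0] + px[1] + px[2]) / 3);
            // Set all three channels to the average so the pixel appears gray.
            gray.at<cv::Vec3b>(r, c) = cv::Vec3b(avg, avg, avg);
        }
    }

    cv::imwrite("gray_face.jpg", gray);
    return 0;
}
```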
The second part of this problem was to flip the image of my face horizontally. To do this, I again went through the image row by row, setting each pixel in a column on the left side of the output equal to the pixel in the mirrored column on the right side of the original image, and vice versa. In other words, the values of the pixels on the left side end up on the right, and the values on the right end up on the left. Applying this to every pixel flips the whole image horizontally, as shown in the sketch below.
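A minimal sketch of the flip, under the same assumptions; the function name is just illustrative. Writing into a separate output image, rather than editing the source in place, is what avoids the mirroring bug described in the Experiments section.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of the horizontal flip. The output is a separate Mat so the
// right-hand columns are still intact while the left-hand columns are copied.
cv::Mat flipHorizontal(const cv::Mat& src) {
    cv::Mat dst = src.clone();
    for (int r = 0; r < src.rows; r++) {
        for (int c = 0; c < src.cols; c++) {
            // Column c in the output takes its value from the mirrored column.
            dst.at<cv::Vec3b>(r, c) = src.at<cv::Vec3b>(r, src.cols - 1 - c);
        }
    }
    return dst;
}
```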
The third part of this problem was to manipulate the image in another way. After some experimenting, I decided to invert the colors of the image of my face. To do this, I went through the image row by row and set each BGR value in the new image to 255 minus the corresponding value in the original image. Applying this to every pixel inverts the colors of the whole image.
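A minimal sketch of the inversion, under the same assumptions.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of the color inversion: each channel value becomes 255 minus the
// original value. Assumes an 8-bit, 3-channel BGR image.
cv::Mat invertColors(const cv::Mat& src) {
    cv::Mat dst = src.clone();
    for (int r = 0; r < src.rows; r++) {
        for (int c = 0; c < src.cols; c++) {
            cv::Vec3b px = src.at<cv::Vec3b>(r, c);
            dst.at<cv::Vec3b>(r, c) = cv::Vec3b(255 - px[0], 255 - px[1], 255 - px[2]);
        }
    }
    return dst;
}
```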


Experiments

The first part of this problem did not require much experimentation. It took a couple of attempts to determine how to access the pixel values in OpenCV, but once I figured that out, calculating the adjusted gray value for each pixel was easy.
The second part of this problem involved some experimentation before actually writing the code. I first drew two 3x3 grids in which each box represented a pixel of an image, which let me visualize the goal of the problem on a smaller scale. I then went through box by box and noted the original coordinates (for example, row 0, col 0) and the adjusted coordinates (row 0, col 2). When I first tried to implement this in code, the resulting image was two copies of the left half mirrored against each other. I discovered that instead of referencing the pixels of the original image, I was reflecting the pixels of the image as I was editing it, which caused the mirror effect. After a brief change to the code (reading from the original image and writing to a separate output), the result was correct.
The third part of the problem involved some experimentation, not so much to solve the problem as to decide which manipulation I thought was most interesting. The first idea I tried was to split the image into three sections and tint each section a different color: the top third red, the middle third green, and the bottom third blue. Another idea I tried was to exchange the red, green, and blue values within each pixel, which tinted the image a variety of colors (purple, green, pink, light blue) depending on how I swapped the values; a sketch of this swap appears below. The last idea I tried was to invert the colors of the image, which I thought looked the most interesting and required a bit more creativity.
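For illustration, a sketch of the channel-swap idea; the particular ordering shown is just one possible swap, not necessarily the exact one I used.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of the channel-swap experiment: exchanging the B, G, and R values of
// each pixel tints the image different colors depending on the ordering.
// The ordering below is one example, not necessarily the one from the report.
cv::Mat swapChannels(const cv::Mat& src) {
    cv::Mat dst = src.clone();
    for (int r = 0; r < src.rows; r++) {
        for (int c = 0; c < src.cols; c++) {
            cv::Vec3b px = src.at<cv::Vec3b>(r, c);              // px = (B, G, R)
            dst.at<cv::Vec3b>(r, c) = cv::Vec3b(px[2], px[0], px[1]);
        }
    }
    return dst;
}
```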


Results

The result of the first part was a grayscale image of my face.
The result of the second part was a horizontally flipped image of my face.
The result of the third part was an inverted color image of my face.

Results

Part      Source Image      Result Image
Part 1    [source photo]    [grayscale result]
Part 2    [source photo]    [horizontally flipped result]
Part 3    [source photo]    [color-inverted result]


Discussion

I believe that, overall, my methods are successful. Each problem was solved in time linear in the number of pixels (quadratic in the image's side length), since every pixel is visited exactly once. Going through the image pixel by pixel solved all the problems in a straightforward way that was easy to visualize, and the code was easy to adapt from one step of the problem to the next.


Conclusions

I think I found an overall successful way to solve the problems presented in this homework. I conclude that by going through an image pixel by pixel and manipulating the values directly, you can accomplish each of the required transformations.


Assignment 2

CS 585 HW 2
Caroline Ferris
February 12, 2020


Problem Definition

This problem asks you to design and implement an algorithm that recognizes hand shapes or gestures through a webcam. You are also asked to create a graphical display that responds to the shapes or gestures of the hands the algorithm detects or recognizes.


Method and Implementation

There were multiple steps to solving the given problem. The first was to access the computer's camera and grab frames to manipulate, which we discussed in lab. In lab we also discussed how to process those frames to identify skin color and separate the person using the computer from other things in the background. Using this information, I took each frame from the web camera and applied the skin detection function to turn it into a binary image. After this, I tried multiple techniques to get the computer to recognize certain shapes (covered in more detail in the Experiments section). I decided to use contours and the related contour operations that are part of the OpenCV library. Specifically, I found the contour around the largest skin-colored region of the current frame and traced it. I also drew two bounding rectangles: a rotated rectangle reflecting the angle at which the hand is tilted, and an upright rectangle spanning the full extent of the detected skin region, along with points meant to mark the locations of the fingertips. As the hand or fingers move, the bounding rectangles and the contour trace follow them and adjust to the new shapes or gestures.
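The following is a rough, simplified sketch of this pipeline rather than my exact code: the skin-color thresholds are placeholders standing in for the lab's skin detection function, and the fingertip points are left out for brevity.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Placeholder skin detector: stands in for the lab's skin detection function.
// The thresholds are illustrative, not the exact rule from my submission.
cv::Mat skinMask(const cv::Mat& frameBGR) {
    cv::Mat mask(frameBGR.size(), CV_8UC1, cv::Scalar(0));
    for (int r = 0; r < frameBGR.rows; r++) {
        for (int c = 0; c < frameBGR.cols; c++) {
            cv::Vec3b px = frameBGR.at<cv::Vec3b>(r, c);
            int b = px[0], g = px[1], rd = px[2];
            // Very simple skin-color rule (placeholder thresholds).
            if (rd > 95 && g > 40 && b > 20 && rd > g && rd > b)
                mask.at<uchar>(r, c) = 255;
        }
    }
    return mask;
}

int main() {
    cv::VideoCapture cap(0);                 // open the default webcam
    cv::Mat frame;
    while (cap.read(frame)) {
        cv::Mat mask = skinMask(frame);

        // Find the contours of the skin-colored regions in the binary image.
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

        // Keep the largest contour, assumed to be the hand.
        int best = -1;
        double bestArea = 0.0;
        for (int i = 0; i < (int)contours.size(); i++) {
            double area = cv::contourArea(contours[i]);
            if (area > bestArea) { bestArea = area; best = i; }
        }

        if (best >= 0) {
            // Trace the contour, then draw the two bounding rectangles:
            // an upright one spanning the detected skin region, and a rotated
            // one that follows the tilt of the hand.
            cv::drawContours(frame, contours, best, cv::Scalar(0, 255, 0), 2);
            cv::rectangle(frame, cv::boundingRect(contours[best]), cv::Scalar(255, 0, 0), 2);
            cv::RotatedRect rot = cv::minAreaRect(contours[best]);
            cv::Point2f corners[4];
            rot.points(corners);
            for (int i = 0; i < 4; i++)
                cv::line(frame, corners[i], corners[(i + 1) % 4], cv::Scalar(0, 0, 255), 2);
        }

        cv::imshow("hand detection", frame);
        if (cv::waitKey(30) == 27) break;    // press ESC to quit
    }
    return 0;
}
```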


Experiments

This problem required much experimentation. As stated in the homework assignment, there were multiple techniques that could be used to accomplish the goal of hand detection. The first option I tried was template matching. I took pictures of my hand in various poses and attempted to write code that would match those template images with the shapes my hand was making in front of the web camera. I ran into several problems, including the fact that the template cannot rotate, so if your hand does not line up with it, there is no match. I hit a similar problem with the size of the template versus the size of the hand on the screen: as you move your hand closer to or further from the camera, its size in the captured frame changes, but the template stays a fixed size, so a large difference in scale means the images no longer match. Overall, this technique was unsuccessful, and I moved on to the approach that ended up working. Even with the contour-based technique there was still some experimentation involved. For instance, when I first got the program working, the bounding rectangle appeared as I moved my hand, but on the opposite side of the screen, and it only detected movement over one half of the area covered by the camera. After some testing of the rectangle's starting position and dimensions, and of how the images were contoured, these problems were resolved.
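For reference, here is a hedged sketch of the kind of template-matching call I experimented with; the function name, template handling, and confidence threshold are placeholders. A single fixed-size template like this cannot rotate or rescale, which is what broke the matches as the hand tilted or changed distance from the camera.

```cpp
#include <opencv2/opencv.hpp>

// Sketch of the template-matching attempt (placeholder names and threshold).
// Draws a box around the best match if its normalized score is high enough.
void matchHandTemplate(cv::Mat& frameBGR, const cv::Mat& templGray) {
    cv::Mat frameGray, result;
    cv::cvtColor(frameBGR, frameGray, cv::COLOR_BGR2GRAY);

    // Slide the fixed-size template over the frame and score every position.
    cv::matchTemplate(frameGray, templGray, result, cv::TM_CCOEFF_NORMED);

    double maxVal;
    cv::Point maxLoc;
    cv::minMaxLoc(result, nullptr, &maxVal, nullptr, &maxLoc);

    if (maxVal > 0.7) {  // placeholder confidence threshold
        cv::rectangle(frameBGR, maxLoc,
                      maxLoc + cv::Point(templGray.cols, templGray.rows),
                      cv::Scalar(0, 255, 0), 2);
    }
}
```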


Results

I tested the algorithm with a variety of hand positions on both hands and at different angles.

Results

Hand Position            Detected Skin Color Image
One Finger               [image]
Two Fingers              [image]
Three Fingers            [image]
Four Fingers             [image]
Five Fingers             [image]
Shaka                    [image]
Psi                      [image]
Psi with Other Hand      [image]
Psi with Hand Tilted     [image]
Psi with Hand Flipped    [image]

Confusion Matrix

Gesture          One Finger   Two Fingers   Three Fingers   Four Fingers   Five Fingers
One Finger            4             2              0               0              0
Two Fingers           0             5              1               0              0
Three Fingers         0             0              4               2              0
Four Fingers          0             0              0               6              0
Five Fingers          0             0              0               0              6


Discussion

I believe that this was a successful method for solving the given problem. I did notice that as the computation involved in detecting the hand position increased, the frame rate of the video taken from the webcam dropped. This makes sense: with more computation, more time is required before the image presented on the screen can be updated. It makes me wonder whether there is a more efficient way to solve this problem than the one I used, perhaps even a completely different technique than contouring. I think this is an interesting direction to look into in the future.


Conclusions

This was quite a difficult problem to solve, even with the lab discussions giving us a good starting point. When discussing these techniques in class or sketching out the problem on paper, it does not seem incredibly difficult; the implementation, however, is. I did a lot of research, both reading papers and watching videos of demonstrations of this kind of system, to get more familiar with the topic and to get ideas on how to implement the solution both code-wise and visually. I think the key to developing this algorithm further is to continue to use the skin color detection function. I tried various methods involving grayscale conversion, blurring, and thresholding, but none of them worked as well for picking up solely the hand's motion as the skin detection function used in lab. Overall, I think this was a very interesting problem to work on.