Computer Vision Project: OpenCV Facial Recognition Optimized for the Raspberry Pi

Team members: Emilio Garcia

Problem Definition

Facial recognition is a fairly well-established field of study, and OpenCV currently ships a stable library for it. However, that library is rather heavy. I will be experimenting with the tools available in OpenCV to create a more lightweight implementation of these algorithms, one that is suitable for hardware-constrained system-on-a-chip (SoC) computers such as the Raspberry Pi.

Desired Results

From a realistic point of view, an acceptable outcome for me would be a program that takes a simple input, perhaps a few seconds of webcam footage, and processes those frames to generate an estimated match based on an average eigenface. For now, I would be OK with the algorithm being triggered simply by launching the program. In the future, I hope to run a real-time algorithm that watches for faces with a camera and launches recognition when it detects a face in its field of view. Since this version of the software runs on a well-equipped desktop rather than a weaker SoC computer, I expect my algorithm to run in a second or less; it would likely take much longer on an SoC. Lastly, I expect the accuracy of my predictions to be around 90%. That leaves a fairly high margin of error, but in the spirit of realism, I think it is a fair measure of success for this project. I don't expect the program to handle situations where there is more than one detectable face in the camera's field of view.

Results and Reflection

My Results

In the time I was given to work on this project, I was able to build some of the core components required to make this work, and to make them run on their own in most cases. Each component of the facial recognition process was reduced to a set of static functions that a more developed version of this application could easily call autonomously. Below, I give a high-level overview of the functions I have and how they interact with each other.

Database Info

In the current state of development, the application expects a text file named "database.txt" to already exist in its working directory. Without this file, it will not work. The database must be in a CSV format; however, the delimiter does not necessarily need to be a comma. I have built in enough extensibility to let you use whatever delimiter you want. Please note that I did not have time to test this with different delimiters, so if you try one and it doesn't work, let me know so I can patch it. The database table is organized as follows:

User Id    |   User's Name   |   Path to samples   |   Number of samples for user   |   Dirty Bit

The User Id is a unique integer ID that is assigned to each entry for neat bookkeeping and for the FaceRecognizer to use to index each user. The User's Name is a string containing the name the user has chosen to be identified as. The Path to samples is exactly what it sounds like: it's the path from the current working directory of the build to the set of samples for this user. The Number of samples is used for iterating over the sample images. Lastly, the Dirty Bit keeps track of whether or not a row has been modified, so that the system knows to update the FaceRecognizer data.
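
As a rough illustration of the table layout above, one row could be held in memory like this. This is only a sketch: the struct name, the field names, and the helper parseUserRecord are hypothetical, and it assumes the dirty bit is stored as the text "true" or "false", which may not match the actual file.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory form of one database row; names are illustrative.
struct UserRecord {
    int id;                  // User Id
    std::string name;        // User's Name
    std::string samplePath;  // Path to samples
    int sampleCount;         // Number of samples for user
    bool dirty;              // Dirty Bit
};

// Converts one split row (five string fields, in table order) into a record.
// Assumption: the dirty bit is stored as the text "true" or "false".
UserRecord parseUserRecord(const std::vector<std::string>& fields) {
    if (fields.size() != 5) {
        throw std::invalid_argument("expected 5 fields per database row");
    }
    UserRecord record;
    record.id = std::stoi(fields[0]);
    record.name = fields[1];
    record.samplePath = fields[2];
    record.sampleCount = std::stoi(fields[3]);
    record.dirty = (fields[4] == "true");
    return record;
}
```

Keeping the fields typed in one place like this would make it harder to mix up column indexes elsewhere in the code.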

detectFaces(Mat& frame, Mat& equalizedFace, vector<Rect>& positions)

This function reads from the frame matrix, which is passed in by reference, then uses Haar cascades to identify any faces in the image. Since this software is meant to be single-subject, it isolates the largest face, crops it out using my crop function (see below), and writes that to a new matrix. It then converts that image to grayscale, equalizes it, and writes the result to the Mat equalizedFace. It also writes the vector of face positions returned by the Haar cascade to the positions vector, which is likewise passed by reference.
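
The "isolate the largest face" step could look roughly like the sketch below. It is not the project's actual code: Box is a stand-in for cv::Rect so that the example compiles without OpenCV, and largestBoxIndex is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for cv::Rect so the sketch compiles without OpenCV.
struct Box {
    int x;
    int y;
    int width;
    int height;
};

// Returns the index of the box with the largest area, or -1 if the list is empty.
int largestBoxIndex(const std::vector<Box>& boxes) {
    int best = -1;
    long bestArea = -1;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        long area = static_cast<long>(boxes[i].width) * boxes[i].height;
        if (area > bestArea) {
            bestArea = area;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Returning -1 for an empty list gives the caller an explicit "no face found" case to check before cropping.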

readCSV(vector<vector<string>>& file, string path = DATABASE, char separator = ',')

This is a handy little helper function that is critical for reading and writing the database.txt file. It reads in a file from the path parameter, which defaults to my database's path. It expects some sort of CSV format, and you can specify the separator with the separator parameter, which defaults to a comma. Essentially, this function reads the text file and breaks it into a 2D array that lets you easily iterate through the lines and the fields within them: [line][parameter]. The result is stored in the vector passed by reference as the file parameter.
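
The splitting logic described above can be sketched as follows. This is an assumption-laden illustration, not the real readCSV: the hypothetical parseDelimited reads from any input stream instead of a file path and returns the table instead of filling a reference parameter, which makes it easy to try with a non-comma delimiter.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits delimited text into rows[line][field].
// Blank lines are skipped; quoted fields and escaping are not handled.
std::vector<std::vector<std::string>> parseDelimited(std::istream& in,
                                                     char separator = ',') {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;
        }
        std::vector<std::string> fields;
        std::stringstream lineStream(line);
        std::string field;
        while (std::getline(lineStream, field, separator)) {
            fields.push_back(field);
        }
        rows.push_back(fields);
    }
    return rows;
}
```

To read a file, a std::ifstream opened on the database path can be passed as the stream.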

resizeImage(Mat& image, Mat& output, bool preserveAspectRatio, int width = 225, int height = 300)

OpenCV already has a resize function; however, when given a target size it does not preserve the aspect ratio, resulting in wonky-looking images that are likely to cause problems with the face recognizer. Thus, I made this function to preserve the aspect ratio of the images when resizing them, so that all the collected samples are uniform in size, in aspect ratio, and in the parts of the face that they capture. This makes the samples much better for the face recognizer to learn from. I think the parameters are fairly self-explanatory for this one, so I will not document them.
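
One common way to preserve the aspect ratio is to scale both dimensions by the same factor so the image fits inside the target box. The sketch below shows only that size calculation; it may differ from what resizeImage actually does, and Size2D and fitWithin are hypothetical names. The default box of 225 by 300 mirrors the defaults in the signature above.

```cpp
#include <algorithm>
#include <cmath>

// Plain width/height pair so the sketch compiles without OpenCV.
struct Size2D {
    int width;
    int height;
};

// Largest size with the source's aspect ratio that fits in maxWidth x maxHeight.
// Assumes srcWidth and srcHeight are positive.
Size2D fitWithin(int srcWidth, int srcHeight,
                 int maxWidth = 225, int maxHeight = 300) {
    double scale = std::min(static_cast<double>(maxWidth) / srcWidth,
                            static_cast<double>(maxHeight) / srcHeight);
    return {static_cast<int>(std::lround(srcWidth * scale)),
            static_cast<int>(std::lround(srcHeight * scale))};
}
```

The resulting width and height could then be handed to OpenCV's resize as the destination size.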

overwriteDatabase(vector<vector<string>>& file)

This takes a 2D vector representing a database file and overwrites the current database file with it. It expects the format generated by the readCSV function, and in its current form it is hard-coded to write to database.txt with comma delimiters.

learnFaces()

This function is essential to the facial recognition process. It reads through the database file, looking for entries that have a dirty bit marked false. This is how the system knows that the face recognizer has not yet learned this face. In its current state of development, this function works fine as long as there are at least 2 data sets marked as dirty. This is because the algorithm for generating Fisherfaces needs at least 2 data sets (that is, at least 2 different users) in order to perform Linear Discriminant Analysis (LDA); otherwise it crashes. I also now realize that each time a data set is added, the entire collection of data sets must be relearned in order to properly perform LDA over it and add the new face to the recognizer. If I had more time, I would definitely implement these corrections to ensure correct recognition as well as self-sufficient performance.
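
One way to avoid the crash described above would be to check, before training, that the labels cover at least two different users. This is only a sketch of such a guard, not something that exists in the project; hasEnoughClasses is a hypothetical helper, and the labels are assumed to be the User Ids attached to the training images.

```cpp
#include <set>
#include <vector>

// Returns true if the labels contain at least two distinct users,
// which is the minimum LDA needs to separate classes.
bool hasEnoughClasses(const std::vector<int>& labels) {
    std::set<int> distinct(labels.begin(), labels.end());
    return distinct.size() >= 2;
}
```

If the check fails, learnFaces could skip training and leave the rows marked for a later attempt instead of crashing.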

sampleCapture(string fileName, string name)

sampleCapture is currently the primary means of capturing samples of people's faces to learn. In the application's current state, it interacts exclusively with the command-line UI that I built for this project. You give this function a fileName, which serves as a path to the set of samples you are about to capture in a directory called samples. To keep this platform-neutral and to avoid problems with Windows permissions, I did not implement any directory manipulation in this code, so if there is no directory named samples in the working directory you run this code from, it will exit without capturing anything. The name parameter is the name the person will be recognized as in the database. This can be any string.

At the start of this function, it turns on the computer's default camera. Then it enters an event loop where it runs the current frame through the detectFaces function, takes the output, and draws a green bounding box around the detected face to indicate that a positive detection is occurring. As long as the system detects a face, it will let you press the space bar to capture the current processed face as a sample. You must press the escape key to exit the event loop. Upon exiting, it saves all the captured images to the samples directory, named fileName#, where # is a number from 0 up to the number of images captured, for easy referencing. Then it overwrites the database to include the new sample set and runs the learn function.

Naturally, this crashes whenever only one sample set has been added to the system, which is a major design flaw. Surprisingly, this doesn't cause any damage to the samples or the database; it is mostly just inconvenient and stupid. Besides this, when the function returns, it throws a segmentation fault. Due to my beginner-level knowledge of C++, I was unable to resolve this problem, but I was able to identify that a memory leak was occurring on the heap. This issue does not necessarily hinder the application's functionality. However, it does prevent you from running the application from the UI without having to force quit and restart it, which means an IoT device would definitely not be able to run it self-sufficiently in its current state. It will definitely need more work.

facialRecognition(bool DEBUG = true)

Finally, the function that puts it all together! Using the default recording device on the computer, this function attempts to boot up a camera. Then it runs the detectFaces function to pick out the closest (largest) face in the frame. In the current version of my application, I use OpenCV's FaceRecognizer for the recognition step. At the start of this project, I intended to use Eigenfaces for facial recognition; however, I was promptly persuaded to use Fisherfaces instead by the IEEE article linked above about Eigenfaces vs. Fisherfaces. Fisherfaces are more detailed, can be generated from smaller sample sets, and are less sensitive to lighting differences than Eigenfaces. This seemed to suit my goals of low disk consumption and optimization for low-power hardware. Anyway, to get back on track, this function takes the cropped and equalized face produced by detectFaces and runs it through the FaceRecognizer. The FaceRecognizer then outputs an integer corresponding to one of the entrants' User Ids, along with a confidence for its prediction. Using the database, the console shows who the FaceRecognizer guessed it sees and with what confidence. Currently, I have been entirely unable to get it to recognize any faces at all! Besides that, it also seg faults when it closes. I have been able to pinpoint this to a heap corruption as well, but I haven't been able to resolve it.
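
The last step, turning the predicted integer back into a name using the database, could be sketched like this. It assumes the table has already been read into the [line][parameter] form produced by readCSV, with the User Id in column 0 and the User's Name in column 1; nameForLabel and the "unknown" fallback are hypothetical.

```cpp
#include <string>
#include <vector>

// Looks up the name for a predicted User Id in rows[line][field].
// Ids are compared as text, so "03" in the file would not match label 3.
std::string nameForLabel(const std::vector<std::vector<std::string>>& rows,
                         int label) {
    const std::string wanted = std::to_string(label);
    for (const auto& row : rows) {
        if (row.size() >= 2 && row[0] == wanted) {
            return row[1];
        }
    }
    return "unknown";
}
```

Returning a fallback string rather than indexing blindly keeps a bad prediction from turning into an out-of-range access.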