CS585 Final Project


Title: Monitoring population psychology through face data capturing and analysis
Team members:
- Jun Li   (U73344054, junli@bu.edu)
- Chengyuan E   (U40616733, ecyecy@bu.edu)
Date: 4/20/2020


Motivation

As society has moved from tribalism to globalization, and as computer vision techniques and computing power have advanced, we now have the capability to understand and analyze the psychological states of a human population when people are outside their homes. This capability shows great potential in many fields, such as the study of human psychological dynamics, public healthcare, evolutionary theory, game theory, and business.

We personally do not want this to become a tool of control for the upper classes and the institutions that control the world. In our view, the cabal, or so-called NWO, has already achieved such technology or gone well beyond it, so for this CS585 course we are only building this tool as our project and taking it no further. To restate, this project is about understanding people's psychology in certain areas and time periods, so that the results can later be correlated with environmental features, whether natural or artificial. That correlation is outside the scope of this project. The papers shown below are prior work by others; we drew on them both as a starting point and as motivation for our project.


Project Problem Definition

The goal is to monitor population psychology through face data capture and analysis within a certain area over a certain period of time. Ideally, cameras would be set up in different areas to record pedestrians and capture their face images. The facial features would then be analyzed with a combination of neural network and computer vision techniques to estimate the ‘emotion’ of each individual. These per-person emotions are then passed to our own algorithm to produce the ‘psychology state’ or ‘score’ for a certain area over a certain period of time.
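
The report does not spell out the scoring algorithm, so the following is only a minimal sketch of how per-face emotion labels might be aggregated into one score for an area and time window. The emotion names are the 7 FER2013 classes; the `VALENCE` weights and the `area_score` function are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# The 7 emotion classes of FER2013.
EMOTIONS = ("angry", "disgust", "fear", "happy", "sad", "surprise", "neutral")

# Hypothetical valence weights in [-1, 1]; an assumption, not the authors' values.
VALENCE = {
    "angry": -1.0, "disgust": -1.0, "fear": -1.0, "sad": -1.0,
    "neutral": 0.0, "surprise": 0.5, "happy": 1.0,
}


def area_score(labels):
    """Return (mean valence, per-emotion counts) for a list of predicted labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0, counts  # no faces observed in this window
    score = sum(VALENCE[label] * n for label, n in counts.items()) / total
    return score, counts


score, counts = area_score(["happy", "happy", "sad", "neutral"])
print(f"score={score:+.2f}", dict(counts))
```

A mean of signed weights is just one possible mapping; the per-emotion counts are returned as well so that other statistics can be derived from the same data.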

Due to the current COVID-19 situation, few people are on the street, and those who are usually wear masks. The sparse crowds and covered faces make it difficult to get ideal results. Instead, we decided to use videos of people walking through different cities, which contain pedestrians and closely simulate our intended data source.


Data source

First-person YouTube videos in which the YouTubers hold their cameras and walk through different cities, such as NYC, Moscow, and Miami, recording pedestrians and views. In these videos, pedestrians have relatively clear facial features and are at a moderate distance from the camera, and camera shake is minimal.

Some example videos can be reached by the following URLs:

https://www.youtube.com/watch?v=aSyipwo0BzA

https://www.youtube.com/watch?v=HP8mnRQw0yk

https://www.youtube.com/watch?v=o9H08lCCV-U&t=1s


Methods

  1. Preprocess the videos: capture frames from each video with OpenCV
  2. Analyze frames
    1. Face detection phase
      • Haar Feature-based Cascade Classifiers, using OpenCV
      • 'face_recognition', a third-party library by Adam Geitgey, using the HOG method
    2. Facial emotion analysis
      • Dataset: FER2013, from Kaggle
      • Deep learning method:
        • Resize detected faces to 48x48
        • Train a deep CNN that we designed
        • Classify facial emotions into 7 classes
        • Around 65% accuracy
      • Draw the cumulative psychology state/score on the frame
  3. Rebuild video: reassemble the frames generated by the face detection phase into a video
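
The steps above can be sketched end to end with OpenCV. This is a rough illustration under stated assumptions, not the project's actual code: `annotate_video` and `format_counts` are hypothetical names, `classify` is a placeholder for the trained CNN (any callable that maps a 48x48 grayscale face to an emotion label), and the detector parameters and the `mp4v` codec are arbitrary choices.

```python
from collections import Counter


def format_counts(counts):
    """Format cumulative emotion counts as one overlay line, most frequent first."""
    return "  ".join(f"{label}: {n}" for label, n in counts.most_common())


def annotate_video(in_path, out_path, classify):
    """Box and label each detected face and draw the running emotion counts.

    `classify` is a hypothetical callable: 48x48 grayscale array -> emotion label.
    Returns the cumulative Counter of predicted labels.
    """
    import cv2  # local import so format_counts works without OpenCV installed

    font, green = cv2.FONT_HERSHEY_SIMPLEX, (0, 255, 0)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if the file reports 0
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    counts = Counter()
    while True:
        ok, frame = cap.read()  # step 1: capture the next frame
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Step 2.1: Haar cascade face detection.
        boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for box in boxes:
            x, y, w, h = map(int, box)
            # Step 2.2: resize the face to the 48x48 input size and classify it.
            face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
            label = classify(face)
            counts[label] += 1
            cv2.rectangle(frame, (x, y), (x + w, y + h), green, 2)
            cv2.putText(frame, label, (x, max(y - 5, 15)), font, 0.6, green, 2)
        # Draw the cumulative counts in the top-left corner of the frame.
        cv2.putText(frame, format_counts(counts), (10, 25), font, 0.6, green, 2)
        writer.write(frame)  # step 3: rebuild the video frame by frame
    cap.release()
    writer.release()
    return counts
```

A call might look like `annotate_video("NYC.mp4", "NYC_out.mp4", classify=lambda face: "neutral")`, with the lambda replaced by the real model's prediction function. The HOG detector from 'face_recognition' could be swapped in through `face_recognition.face_locations(rgb_frame, model="hog")`, which expects an RGB image and returns boxes as (top, right, bottom, left).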

Results Summary

Generated result for NYC.mp4

Generated result for Paris.mp4

Generated result for Moscow_1.mp4

Generated result for Moscow_2.mp4

Generated result for Moscow_3.mp4


Discussion

  1. Results are relatively good on an intuitive, qualitative level
  2. Accuracy of the facial emotion analysis could be improved (for example, by finding a better dataset for model training)
  3. The model needs to be trained in advance, which takes time and computation
  4. Results are better with a fixed camera (fewer blurry faces)
  5. Tracking methods could be used to obtain the psychology state of each person in the video
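
As an illustration of the last point, one simple way to follow individuals between frames is to match face boxes by overlap. The sketch below is an assumption about how this could be done, not something implemented in the project: `iou`, `FaceTracker`, and the 0.3 threshold are hypothetical, and boxes use the (x, y, w, h) layout.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


class FaceTracker:
    """Greedy frame-to-frame matching of face boxes by IoU."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.tracks = {}   # track id -> box seen in the previous frame
        self.next_id = 0

    def update(self, detections):
        """Assign a track id to each detection; return {track id: box}."""
        free = dict(self.tracks)  # previous boxes not yet matched in this frame
        current = {}
        for box in detections:
            best = max(free, key=lambda t: iou(free[t], box), default=None)
            if best is not None and iou(free[best], box) >= self.threshold:
                del free[best]            # continue an existing track
            else:
                best = self.next_id       # start a new track
                self.next_id += 1
            current[best] = box
        self.tracks = current
        return current


tracker = FaceTracker()
print(tracker.update([(10, 10, 50, 50), (200, 10, 50, 50)]))
print(tracker.update([(205, 12, 50, 50), (12, 11, 50, 50)]))
```

With stable ids, the emotion labels predicted for one person could be collected per id instead of per frame. Greedy matching like this loses a track as soon as a face is missed for one frame, so a real tracker would need to be more robust.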

Conclusion

Our tool generated intuitively good results. More research is needed to produce more quantitative results (a mapping from statistics to a specific score).