640 P2 Report

CS 640 Programming assignment 2
Kaihong Wang
Date: 4/7/2019


Problem Definition

In this project, I aim to detect whether an image contains food, especially Kenyan food. Because training a classifier is computationally cheaper and simpler than training an object detector, I train a food/non-food classifier rather than a detector.


Background Survey

Food recognition is one of the most popular topics and applications in deep learning, and some state-of-the-art models can now outperform humans. In real-world applications, however, we cannot assume that every image contains food; we only want to recognize food in images that actually contain it. Food/non-food classification is therefore an important prerequisite for successful food recognition.

Several popular food/non-food datasets are used by most food/non-food classifiers, such as the IFD and FCD datasets. IFD is short for Instagram Food/Non-Food Dataset; it was collected on Instagram and contains 5,000 food and 5,000 non-food images. FCD is short for Food-101 and Caltech-256 Dataset; it contains 25,000 food images and 25,000 non-food images. Both datasets share a common advantage: they contain a large number of food images and diverse non-food images, so a trained model is less likely to be confused by rare non-food images.

However, the food images should also be diverse enough to train an accurate food detector, since people around the globe have different dietary habits and even different definitions of food. Unfortunately, most popular food datasets contain only European and East Asian food. Noting that Indian Ocean Rim countries such as India, Kenya, and some Middle Eastern countries have quite different food, I plan to build a Kenyan Food Dataset (KFD) and try to improve the performance of food/non-food classification on images of food that these datasets rarely cover.


Baseline

Kagaya and Aizawa proposed a CNN-NIN model trained on the IFD and FCD datasets in "Highly Accurate Food/Non-Food Image Classification based on a Deep Convolutional Neural Network" and achieved accuracies of 95.1% on IFD and 96.1% on FCD. However, that paper was published in 2015, and by 2019 stronger architectures are available. I merged the FCD and IFD datasets into a baseline dataset called FNF and trained a ResNet-101 on it, reaching an accuracy of 98.3%.
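
The merge into FNF can be sketched as follows. This is a minimal illustration rather than the code used for this report: the helpers make_manifest and merge_datasets and the (path, label) manifest format are hypothetical, and the generated file names are placeholders for the real images. Only the class counts come from the dataset descriptions in the Background Survey. Assuming IFD and FCD share no images, the merged set has 30,000 images per class.

```python
from collections import Counter

def make_manifest(prefix, n_food, n_nonfood):
    """Build a placeholder manifest of (image_path, label) pairs.

    Label 1 means food, label 0 means non-food.
    """
    food = [(f"{prefix}/food/{i:05d}.jpg", 1) for i in range(n_food)]
    nonfood = [(f"{prefix}/nonfood/{i:05d}.jpg", 0) for i in range(n_nonfood)]
    return food + nonfood

def merge_datasets(*manifests):
    """Concatenate manifests, keeping only the first occurrence of each path."""
    seen = set()
    merged = []
    for manifest in manifests:
        for path, label in manifest:
            if path not in seen:
                seen.add(path)
                merged.append((path, label))
    return merged

# Class counts taken from the dataset descriptions; paths are placeholders.
ifd = make_manifest("IFD", 5000, 5000)
fcd = make_manifest("FCD", 25000, 25000)
fnf = merge_datasets(ifd, fcd)

counts = Counter(label for _, label in fnf)
print("food:", counts[1], "non-food:", counts[0])
```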


Method

I collected images of Kenyan food with an Instagram API module, scraping about 20 popular kinds of Kenyan food by hashtag search, and randomly downloaded 60,000 images posted in Kenya as non-food candidates. After manual inspection, I assembled a Kenyan Food Dataset (KFD) containing 30,000 food images and 30,000 non-food images. I then trained a ResNet-101 on KFD using the TensorFlow-Slim module. The network is pretrained on ImageNet and optimized with Adam.
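
The dataset assembly step can be sketched as below. This is an illustration under stated assumptions, not the actual pipeline: build_balanced_split is a hypothetical helper, the 80/20 split ratio and the fixed seed are assumptions (the split used for KFD is not stated here), and the candidate lists are placeholders for images that passed manual inspection.

```python
import random

def build_balanced_split(food, nonfood, per_class, test_fraction=0.2, seed=0):
    """Sample per_class items from each class and split each class into train/test.

    Returns (train, test) lists of (item, label) pairs; label 1 = food, 0 = non-food.
    """
    rng = random.Random(seed)
    train, test = [], []
    for items, label in ((food, 1), (nonfood, 0)):
        if len(items) < per_class:
            raise ValueError("not enough images for the requested class size")
        chosen = rng.sample(items, per_class)
        n_test = round(per_class * test_fraction)
        test += [(x, label) for x in chosen[:n_test]]
        train += [(x, label) for x in chosen[n_test:]]
    rng.shuffle(train)  # mix the two classes in the training list
    return train, test

# Placeholder candidate lists. The number of scraped food images is an
# assumption; 60,000 non-food candidates matches the text above.
food_candidates = [f"food_{i}.jpg" for i in range(32000)]
nonfood_candidates = [f"nonfood_{i}.jpg" for i in range(60000)]

train, test = build_balanced_split(food_candidates, nonfood_candidates, per_class=30000)
print(len(train), "training images,", len(test), "testing images")
```

Splitting each class separately keeps the training and testing sets balanced between food and non-food images.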


Experiments

I conducted both single-dataset and cross-dataset evaluations to measure the performance of my models on each dataset.
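
The evaluation protocol amounts to a grid of (training dataset, evaluation dataset) pairs. The sketch below is only an illustration: error_rate and cross_evaluate are hypothetical helpers, and the toy threshold "models" and four-sample datasets stand in for the trained ResNet-101 networks and the real image sets.

```python
def error_rate(predict, samples):
    """Fraction of (x, label) samples for which predict(x) differs from label."""
    if not samples:
        raise ValueError("samples must be non-empty")
    wrong = sum(1 for x, label in samples if predict(x) != label)
    return wrong / len(samples)

def cross_evaluate(models, datasets):
    """Evaluate every model on every dataset.

    models: dict mapping a training-set name to a predict function.
    datasets: dict mapping an evaluation-set name to a list of (x, label) pairs.
    Returns a dict keyed by (trained_on, tested_on) holding error rates.
    """
    return {
        (trained_on, tested_on): error_rate(predict, samples)
        for trained_on, predict in models.items()
        for tested_on, samples in datasets.items()
    }

# Toy stand-ins: inputs are numbers and label 1 means "food".
datasets = {
    "A": [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)],
    "B": [(0.2, 0), (0.3, 1), (0.7, 1), (0.8, 1)],
}
models = {
    "A": lambda x: int(x > 0.5),   # pretend this model was fit on dataset A
    "B": lambda x: int(x > 0.25),  # pretend this model was fit on dataset B
}

results = cross_evaluate(models, datasets)
for (trained_on, tested_on), err in sorted(results.items()):
    print(f"trained on {trained_on}, tested on {tested_on}: {err:.2%}")
```

The diagonal entries of the grid correspond to single-dataset evaluation and the off-diagonal entries to cross-dataset evaluation.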


Results

1) Error rate of the model trained on the FNF dataset.

ResNet-101 was trained on the FNF dataset for about 30 epochs and tested on the FNF dataset.

Error rate on the FNF dataset

Dataset split    Training Set    Testing Set
Error rate       0.11%           1.65%

2) Error rate of the model trained on the KFD dataset.

ResNet-101 was trained on the KFD dataset for about 30 epochs and tested on the KFD dataset.

Error rate on the KFD dataset

Dataset split    Training Set    Testing Set
Error rate       0.44%           0.62%

3) Cross-dataset evaluation

ResNet-101 was trained on the FNF dataset for about 30 epochs and tested on the KFD dataset.

Error rate on the KFD dataset

Dataset split    Training Set    Testing Set
Error rate       2.32%           2.18%

ResNet-101 was trained on the KFD dataset for about 30 epochs and tested on the FNF dataset.

Error rate on the FNF dataset

Dataset split    Training Set    Testing Set
Error rate       3.36%           3.33%

4) Performance of the model on the combination of FNF and KFD (referred to below as BUFD).

ResNet-101 was trained on the BUFD dataset for about 38 epochs and tested on the BUFD dataset.

Error rate on the BUFD dataset

Dataset split    Training Set    Testing Set
Error rate       0.50%           1.11%

ResNet-101 was trained on the FNF dataset for about 38 epochs and tested on the BUFD dataset.

Error rate on the BUFD dataset

Dataset split    Training Set    Testing Set
Error rate       1.33%           1.34%

ResNet-101 was trained on the KFD dataset for about 38 epochs and tested on the BUFD dataset.

Error rate on the BUFD dataset

Dataset split    Training Set    Testing Set
Error rate       1.92%           2.19%

ResNet-101 was trained on the BUFD dataset for about 38 epochs and tested on the FNF dataset.

Error rate on the FNF dataset

Dataset split    Training Set    Testing Set
Error rate       0.61%           0.66%

ResNet-101 was trained on the BUFD dataset for about 38 epochs and tested on the KFD dataset.

Error rate on the KFD dataset

Dataset split    Training Set    Testing Set
Error rate       0.52%           0.37%
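
To compare the configurations directly, the testing-set error rates reported above can be collected in one place. The sketch below only restates those figures and derives the relative error reduction obtained by training on BUFD instead of a single dataset; the names test_error and relative_reduction are illustrative.

```python
# Testing-set error rates (%) copied from the tables above,
# keyed by (training dataset, evaluation dataset).
test_error = {
    ("FNF", "FNF"): 1.65,
    ("KFD", "KFD"): 0.62,
    ("FNF", "KFD"): 2.18,
    ("KFD", "FNF"): 3.33,
    ("BUFD", "BUFD"): 1.11,
    ("FNF", "BUFD"): 1.34,
    ("KFD", "BUFD"): 2.19,
    ("BUFD", "FNF"): 0.66,
    ("BUFD", "KFD"): 0.37,
}

def relative_reduction(baseline, improved):
    """Relative drop in error rate, as a fraction of the baseline error."""
    return (baseline - improved) / baseline

for target in ("FNF", "KFD"):
    single = test_error[(target, target)]      # trained on the target dataset only
    combined = test_error[("BUFD", target)]    # trained on the combined dataset
    reduction = relative_reduction(single, combined)
    print(f"{target}: {single:.2f}% -> {combined:.2f}% "
          f"({reduction:.0%} relative reduction)")
```

On these figures, training on BUFD lowers the testing error by roughly 60% on FNF and roughly 40% on KFD relative to training on each dataset alone.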


Discussion


Conclusions

In this project, I proposed a new dataset that I collected myself, dedicated to detecting food from Kenya and other Indian Ocean Rim countries, and the experiments above support its usefulness: a model trained only on FNF has a testing error of 2.18% on KFD, compared with 0.62% for a model trained on KFD. I also found that a more diverse food image dataset improves recognition not only on the added part but also on the original part of the dataset. Training on the combined BUFD dataset lowers the testing error on KFD from 0.62% to 0.37% and on FNF from 1.65% to 0.66%.


Credits and Bibliography

Kagaya H, Aizawa K. Highly accurate food/non-food image classification based on a deep convolutional neural network[C]//International Conference on Image Analysis and Processing. Springer, Cham, 2015: 350-357.

Merler M, Wu H, Uceda-Sosa R, et al. Snap, Eat, RepEat: a food recognition engine for dietary logging[C]//Proceedings of the 2nd international workshop on multimedia assisted dietary management. ACM, 2016: 31-40.