Deep Learning, Applied. Project #1

Patrick Rodriguez  |  Posted on Sun 22 January 2017 in programming

Food Classification with Deep Learning in Keras / Tensorflow

Computer, what am I eating anyway?

https://www.flickr.com/photos/torley/3294481634

In [5]:
from IPython.display import HTML, Image

url = 'http://stratospark.com/demos/food-101/'
el = '<' + 'iframe src="{}"'.format(url) + ' width="100%" height=600>' # prevent notebook render bug
HTML(el)
Out[5]:

If you are reading this on GitHub, the demo looks like this. Please follow the link below to view the live demo on my blog.

In [4]:
Image('demo.jpg')
Out[4]:

Introduction

Convolutional Neural Networks (CNN), a technique within the broader Deep Learning field, have been a revolutionary force in Computer Vision applications, especially in the past half-decade or so. One main use-case is that of image classification, e.g. determining whether a picture is that of a dog or cat.

Convolutional Neural Network

You don't have to limit yourself to a binary classifier of course; CNNs can easily scale to thousands of different classes, as seen in the well-known ImageNet dataset of 1000 classes, used to benchmark computer vision algorithm performance.

In the past couple of years, these cutting edge techniques have started to become available to the broader software development community. Industrial strength packages such as Tensorflow have given us the same building blocks that Google uses to write deep learning applications for embedded/mobile devices to scalable clusters in the cloud -- Without having to handcode the GPU matrix operations, partial derivative gradients, and stochastic optimizers that make efficient applications possible.

On top of all of this, are user-friendly APIs such as Keras that abstract away some of the lower level details and allow us to focus on rapidly prototyping a deep learning computation graph. Much like we would mix and match Legos to get a desired result.

Project Description

As an introductory project for myself, I chose to use a pre-trained image classifier that comes with Keras, and retrain it on a dataset that I find interesting. I'm very much into good food and home cooking, so something along those lines was appetizing.

In the paper, Food-101 – Mining Discriminative Components with Random Forests, they introduce the Food-101 dataset. There are 101 different classes of food, with 1000 labeled images per class available for supervised training.

Food-101 cover image

Approach

I was inspired by this Keras blog post: Building powerful image classification models using very little data, and a related script I found on github: keras-finetuning.

I built a system recently for the purpose of experimenting with Deep Learning. The key components are an Nvidia Titan X Pascal w/12 GB of memory, 96 GB of system RAM, as well as a 12-core Intel Core i7. It is running 64-bit Ubuntu 16.04 and using the Anaconda Python distribution. Unfortunately, you won't be able to follow along with this notebook on your own system unless you have enough RAM. In the future, I would like to learn how to handle larger than RAM datasets in a performant way. Please get in touch if you have any ideas!

I've spent about 1 month on and off building this project, trying to train dozens of models and exploring various areas such as multiprocessing for faster image augmentation. This is a cleaned up version of the notebook that contains my best performing model as of Jan 22, 2017.

Results

After fine-tuning a pre-trained Google InceptionV3 model, I was able to achieve about 82.03% Top-1 Accuracy on the test set using a single crop per item. Using 10 crops per example and taking the most frequent predicted class(es), I was able to achieve 86.97% Top-1 Accuracy and 97.42% Top-5 Accuracy

Others have been able to achieve more accurate results:

Thoughts

  • Loading a large amount of data into memory, how to avoid?
  • Saving the data into h5py file for out of band processing?
  • Using Dask for distributed processing?
  • Improving multiprocessing image augmentation?

Implemented! Check out: http://blog.stratospark.com/creating-a-deep-learning-ios-app-with-keras-and-tensorflow.html

Experiment

Loading and Preprocessing Dataset

Let's import all of the packages needed for the rest of the notebook:

In [6]:
import matplotlib.pyplot as plt
import matplotlib.image as img
import numpy as np
from scipy.misc import imresize

%matplotlib inline

import os
from os import listdir
from os.path import isfile, join
import shutil
import stat
import collections
from collections import defaultdict

from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets

import h5py
from sklearn.model_selection import train_test_split
from keras.utils.np_utils import to_categorical
from keras.applications.inception_v3 import preprocess_input
from keras.models import load_model
Using TensorFlow backend.

Download the dataset and extract it within the notebook folder. It may be easier to do this in a separate terminal window.

In [1]:
# !wget http://data.vision.ee.ethz.ch/cvl/food-101.tar.gz
In [2]:
# !tar xzvf food-101.tar.gz

Let's see what sort of foods are represented here:

In [3]:
!ls food-101/images
apple_pie	    eggs_benedict	     onion_rings
baby_back_ribs	    escargots		     oysters
baklava		    falafel		     pad_thai
beef_carpaccio	    filet_mignon	     paella
beef_tartare	    fish_and_chips	     pancakes
beet_salad	    foie_gras		     panna_cotta
beignets	    french_fries	     peking_duck
bibimbap	    french_onion_soup	     pho
bread_pudding	    french_toast	     pizza
breakfast_burrito   fried_calamari	     pork_chop
bruschetta	    fried_rice		     poutine
caesar_salad	    frozen_yogurt	     prime_rib
cannoli		    garlic_bread	     pulled_pork_sandwich
caprese_salad	    gnocchi		     ramen
carrot_cake	    greek_salad		     ravioli
ceviche		    grilled_cheese_sandwich  red_velvet_cake
cheesecake	    grilled_salmon	     risotto
cheese_plate	    guacamole		     samosa
chicken_curry	    gyoza		     sashimi
chicken_quesadilla  hamburger		     scallops
chicken_wings	    hot_and_sour_soup	     seaweed_salad
chocolate_cake	    hot_dog		     shrimp_and_grits
chocolate_mousse    huevos_rancheros	     spaghetti_bolognese
churros		    hummus		     spaghetti_carbonara
clam_chowder	    ice_cream		     spring_rolls
club_sandwich	    lasagna		     steak
crab_cakes	    lobster_bisque	     strawberry_shortcake
creme_brulee	    lobster_roll_sandwich    sushi
croque_madame	    macaroni_and_cheese      tacos
cup_cakes	    macarons		     takoyaki
deviled_eggs	    miso_soup		     tiramisu
donuts		    mussels		     tuna_tartare
dumplings	    nachos		     waffles
edamame		    omelette
In [4]:
!ls food-101/images/apple_pie/ | head -10
1005649.jpg
1011328.jpg
101251.jpg
1014775.jpg
1026328.jpg
1028787.jpg
1034399.jpg
103801.jpg
1038694.jpg
1043283.jpg
ls: write error: Broken pipe

Let's look at some random images from each food class. You can right click and open the image in a new window or save it in order to see it at a higher resolution.

In [241]:
root_dir = 'food-101/images/'
rows = 17
cols = 6
fig, ax = plt.subplots(rows, cols, frameon=False, figsize=(15, 25))
fig.suptitle('Random Image from Each Food Class', fontsize=20)
sorted_food_dirs = sorted(os.listdir(root_dir))
for i in range(rows):
    for j in range(cols):
        try:
            food_dir = sorted_food_dirs[i*cols + j]
        except:
            break
        all_files = os.listdir(os.path.join(root_dir, food_dir))
        rand_img = np.random.choice(all_files)
        img = plt.imread(os.path.join(root_dir, food_dir, rand_img))
        ax[i][j].imshow(img)
        ec = (0, .6, .1)
        fc = (0, .7, .2)
        ax[i][j].text(0, -20, food_dir, size=10, rotation=0,
                ha="left", va="top", 
                bbox=dict(boxstyle="round", ec=ec, fc=fc))
plt.setp(ax, xticks=[], yticks=[])
plt.tight_layout(rect=[0, 0.03, 1, 0.95])