Baseline - FOODC

Baseline Submission for the Challenge FOODC

Author - Pulkit Gera

Download the files

These include the training and test images, as well as the CSV files that index them.

In [0]:
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/train_images.zip
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/test_images.zip
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/train.csv
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-practice-challenges/public/foodc/v0.1/test.csv

We create directories and unzip the images

In [0]:
!mkdir -p data/train data/test
!unzip -q train_images.zip -d data/train
!unzip -q test_images.zip -d data/test

Import necessary packages

In [0]:
import torchvision.transforms as transforms
from torch.utils.data.sampler import SubsetRandomSampler
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader, Dataset
import torchvision
from torchvision import models
import torch.optim as optim
import pandas as pd
import numpy as np
import cv2
import os
from sklearn import preprocessing
import matplotlib.pyplot as plt
%matplotlib inline

Loading Data

In PyTorch we can either load our files with a ready-made torchvision dataset class or write a custom class to load the data. A custom class must subclass Dataset and implement the __init__, __len__ and __getitem__ functions. We write a custom dataset to suit our needs. The PyTorch documentation has more on custom datasets and loaders.

In [0]:
class FoodData(Dataset):
    def __init__(self, data_list, data_dir='./', transform=None, train=True):
        super().__init__()
        # DataFrame with an ImageId column (plus ClassName when train=True)
        self.data_list = data_list
        self.data_dir = data_dir
        self.transform = transform
        self.train = train

    def __len__(self):
        return self.data_list.shape[0]

    def __getitem__(self, item):
        if self.train:
            img_name, label = self.data_list.iloc[item]
        else:
            img_name = self.data_list.iloc[item]['ImageId']
        img_path = os.path.join(self.data_dir, img_name)
        img = cv2.imread(img_path, 1)
        # OpenCV loads images as BGR; convert to RGB, which ToPILImage expects
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = cv2.resize(img, (256, 256))
        if self.transform is not None:
            img = self.transform(img)
        if self.train:
            return {'gt': img, 'label': torch.tensor(label)}
        return {'gt': img}

We first convert the class labels into integer encodings using a LabelEncoder, which maps each class name to an integer index. This step is required because the cross-entropy loss we train with expects integer class indices as targets, not strings.

In [0]:
train = pd.read_csv('train.csv')
le = preprocessing.LabelEncoder()
targets = le.fit_transform(train['ClassName'])
# Work on a copy so the original class names in `train` are left untouched
ntrain = train.copy()
ntrain['ClassName'] = targets
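
To make the encoding concrete, here is a minimal pure-Python sketch of the same idea: the distinct class names are sorted and each one is assigned its position as an integer index, which is also the order scikit-learn's LabelEncoder uses. The helper names `fit_label_index`, `encode` and `decode` and the toy labels are assumptions made for illustration; they are not part of scikit-learn or of this notebook's pipeline.

```python
def fit_label_index(labels):
    """Map each distinct label to an integer, in sorted order."""
    classes = sorted(set(labels))
    return classes, {name: idx for idx, name in enumerate(classes)}


def encode(labels, to_index):
    """Turn class names into integer indices."""
    return [to_index[name] for name in labels]


def decode(indices, classes):
    """Turn integer indices back into class names."""
    return [classes[i] for i in indices]


# Toy labels (hypothetical sample, not the real train.csv)
toy_labels = ["water", "egg", "water", "apple"]
classes, to_index = fit_label_index(toy_labels)
encoded = encode(toy_labels, to_index)
decoded = decode(encoded, classes)

print(classes)  # ['apple', 'egg', 'water']
print(encoded)  # [2, 1, 2, 0]
print(decoded)  # ['water', 'egg', 'water', 'apple']
```

The decoding step is the counterpart of `le.inverse_transform`, which is needed later to turn predicted indices back into class names for the submission.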

We load our training data and apply the necessary preprocessing transforms: converting to a PIL image, converting to a tensor, and normalizing each channel. Augmentations such as random flips and random rotations can be added to the same pipeline; the torchvision transforms documentation lists the available options.

In [0]:
transforms_train = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5,0.5,0.5))
])
train_path = 'data/train/train_images'
train_data = FoodData(data_list= ntrain,data_dir = train_path,transform = transforms_train)

EDA

Let us do some exploratory data analysis. The idea is to look at the class distribution and at what the images themselves look like.

In [79]:
train = pd.read_csv('train.csv')
num = train['ClassName'].value_counts()
classes = train['ClassName'].unique()
print("Percentage of each class")
for cl in classes:
  print(cl,'\t',num[cl]/train.shape[0]*100,"%")
Percentage of each class
water 	 9.25667703528907 %
pizza-margherita-baked 	 1.179877721763381 %
broccoli 	 0.9009975329829454 %
salad-leaf-salad-green 	 5.738496192212807 %
egg 	 2.2417676713504235 %
butter 	 3.71125174300118 %
bread-white 	 6.382065858629196 %
apple 	 2.0486967714255067 %
dark-chocolate 	 0.9439021774107046 %
white-coffee-with-caffeine 	 1.3085916550466588 %
sweet-pepper 	 0.9009975329829454 %
mixed-salad-chopped-without-sauce 	 1.8127212270728308 %
tomato-sauce 	 1.179877721763381 %
cucumber 	 1.1476992384425615 %
cheese 	 1.4694840716507562 %
pasta-spaghetti 	 1.040437627373163 %
rice 	 2.7458972433765956 %
zucchini 	 0.9653544996245843 %
salmon 	 0.5470342164539311 %
mixed-vegetables 	 2.542100182344739 %
espresso-with-caffeine 	 2.0916014158532663 %
banana 	 1.9414351603561086 %
strawberries 	 0.9331760163037649 %
mayonnaise 	 0.4612249275984125 %
almonds 	 0.740105116378848 %
bread-wholemeal 	 4.269012120562051 %
wine-white 	 1.619650327147914 %
hard-cheese 	 1.2013300439772605 %
ham-raw 	 0.7079266330580285 %
tomato 	 3.8399656762844576 %
french-beans 	 0.8044620830204869 %
mandarine 	 0.740105116378848 %
wine-red 	 2.585004826772498 %
potatoes-steamed 	 1.673281132682613 %
croissant 	 0.8044620830204869 %
carrot 	 3.185669848761128 %
salami 	 0.5255818942400515 %
boisson-au-glucose-50g 	 0.9117236940898853 %
biscuits 	 0.7293789552719082 %
corn 	 0.39686796095677357 %
leaf-spinach 	 0.9331760163037649 %
tea-green 	 0.740105116378848 %
chips-french-fries 	 1.4587579105438164 %
parmesan 	 0.7293789552719082 %
beer 	 0.8580928885551861 %
bread-french-white-flour 	 0.6542958275233294 %
coffee-with-caffeine 	 4.043762737316314 %
chicken 	 1.1369730773356215 %
soft-cheese 	 0.5148557331331117 %
tea 	 1.8985305159283494 %
avocado 	 0.9439021774107046 %
bread-sourdough 	 0.6757481497372091 %
gruyere 	 0.7615574385927276 %
sauce-savoury 	 0.6542958275233294 %
honey 	 0.6972004719510887 %
mixed-nuts 	 0.868819049662126 %
jam 	 1.7483642604311918 %
bread-whole-wheat 	 0.7937359219135471 %
water-mineral 	 0.922449855196825 %
onion 	 0.4397726053845329 %
pickle 	 0.3003325109943151 %

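We observe that water is the most frequent class, at about 9.3% of the images, while the rarest class, pickle, accounts for about 0.3%. The classes are therefore imbalanced, although no single class dominates. Let us plot some images of white-flour French bread and of French fries to see what kind of images we have.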

In [57]:
imgs = train.loc[train['ClassName'] == 'bread-french-white-flour']
plt.figure(figsize=(10,10))
for i in range(imgs[:16].shape[0]):
  path = imgs.iloc[i]['ImageId']
  image = cv2.imread(os.path.join(train_path,path),1)
  image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
  plt.subplot(4,4,i+1)
  plt.axis('off')
  plt.imshow(image)
In [77]:
imgs = train.loc[train['ClassName'] == 'chips-french-fries']
plt.figure(figsize=(10,10))
for i in range(imgs[:16].shape[0]):
  path = imgs.iloc[i]['ImageId']
  image = cv2.imread(os.path.join(train_path,path),1)
  image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
  plt.subplot(4,4,i+1)
  plt.axis('off')
  plt.imshow(image)

Split Data into Train and Validation

Now we want to see how well our model is performing, but we don't have the test labels to check against. So we split our dataset into a training set and a validation set. We evaluate the classifier on the validation set, which it never trains on, to estimate how well it generalizes. This also lets us detect overfitting on the training set. There are other ways to do validation, such as k-fold and leave-one-out cross-validation.

We also create dataloaders, which serve the dataset in minibatches; each epoch iterates over all of them once.

In [0]:
batch = 128
valid_size = 0.2
num = len(train_data)
# Shuffle the indices and divide them between training and validation
indices = list(range(num))
np.random.shuffle(indices)
split = int(np.floor(valid_size * num))
train_idx, valid_idx = indices[split:], indices[:split]

# Create samplers that draw only from their own subset of indices
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

train_loader = DataLoader(train_data, batch_size=batch, sampler=train_sampler)
valid_loader = DataLoader(train_data, batch_size=batch, sampler=valid_sampler)
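
The cell above uses a single hold-out split. The k-fold alternative mentioned earlier can be sketched in plain Python as follows: the shuffled indices are divided into k folds, and each fold serves once as the validation set while the others form the training set. The function name `k_fold_indices`, the seed and the toy sizes are assumptions made for illustration.

```python
import random


def k_fold_indices(num_samples, k, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        valid_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, valid_idx


# Toy example: 10 samples, 5 folds
splits = list(k_fold_indices(num_samples=10, k=5))
for train_idx, valid_idx in splits:
    print(len(train_idx), len(valid_idx))  # 8 2
```

Each pair of index lists could be passed to SubsetRandomSampler in the same way as `train_idx` and `valid_idx` above, at the cost of training the model k times.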

Here we load the test images. Note: the test CSV does not contain any labels.

In [0]:
transforms_test = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5) , (0.5,0.5,0.5))
])
test_path = 'data/test/test_images'
test = pd.read_csv('test.csv')
test_data = FoodData(data_list= test,data_dir = test_path,transform = transforms_test,train=False)

test_loader = DataLoader(test_data, batch_size=batch, shuffle=False)

Here we check whether a GPU is available. If it is, we only need to move our data and model to the GPU for faster computation.

In [11]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assuming that we are on a CUDA machine, this should print a CUDA device:

print(device)
cuda:0

Define the Model

Now we come to the juicy part: defining the model. We create a class with __init__ and forward functions, which define the layers and the forward pass respectively. We could also load a pretrained model, freeze its layers and train additional layers on top of it. The PyTorch and torchvision documentation cover pretrained models and model building in more detail.

In [0]:
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    # Define layers here
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # After two conv + pool stages a 256x256 input becomes 16 x 61 x 61
        self.fc1 = nn.Linear(16 * 61 * 61, 120)
        self.fc2 = nn.Linear(120, 84)
        # One output per class (61 classes)
        self.fc3 = nn.Linear(84, 61)

    def forward(self, x):
        # Forward pass
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 61 * 61)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
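
The input size of the first fully connected layer, 16 * 61 * 61, follows from the image size and the layer settings. The short calculation below traces a 256x256 input through the two convolution and pooling stages using the standard output-size formula, floor((size + 2 * padding - kernel) / stride) + 1. The helper names `conv_out` and `pool_out` are assumptions made for this sketch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a max-pooling layer without padding."""
    return (size - kernel) // stride + 1


size = 256                        # images are resized to 256x256
size = conv_out(size, kernel=5)   # conv1: 256 -> 252
size = pool_out(size, 2, 2)       # pool:  252 -> 126
size = conv_out(size, kernel=5)   # conv2: 126 -> 122
size = pool_out(size, 2, 2)       # pool:  122 -> 61
flat = 16 * size * size           # 16 channels after conv2
print(size, flat)  # 61 59536
```

If the resize dimensions or kernel sizes change, the same calculation gives the new value to use in `fc1` and in the `view` call.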

Here we create the model object along with the optimizer and the loss function. Cross-entropy loss is the usual choice for multi-class classification; the PyTorch documentation describes the other available losses.
We use the popular Adam optimizer with its default parameters. Other optimizers include SGD, RMSprop and Adamax; the torch.optim documentation covers them in detail.

In [0]:
model = Net().to(device)
error = nn.CrossEntropyLoss().to(device)
optimizer = optim.Adam(model.parameters())
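
To give a feel for the numbers this loss produces, here is a minimal pure-Python sketch of cross-entropy for a single sample, computed from raw model outputs (logits) as -log(softmax(logits)[target]). The function name `cross_entropy` is an assumption for this sketch; nn.CrossEntropyLoss performs the equivalent computation on tensors and, by default, averages it over the batch.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# A model that scores all 61 classes equally is effectively guessing
num_classes = 61
uniform_loss = cross_entropy([0.0] * num_classes, target=0)
print(round(uniform_loss, 4))  # 4.1109, i.e. ln(61)

# A confident, correct prediction gives a loss close to zero
confident_loss = cross_entropy([10.0] + [0.0] * (num_classes - 1), target=0)
print(confident_loss < 0.01)  # True
```

The value ln(61), roughly 4.11, is a useful reference point: a training loss near it means the network is doing little better than uniform guessing.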

Train

All right, enough talk; time to train. We define the number of epochs and train the model. An epoch is one full pass, forward and backward, over all the training data. Each epoch consists of iterations, and their number depends on the batch size. In each iteration we take a batch, compute the model output and the loss, run a backward pass and let the optimizer take a step. This is the workflow of most PyTorch training code.

Validate

After each epoch we run the model on the validation set, repeating the same steps except for the backward pass and the optimizer step. If the validation loss has decreased, we save the model. This keeps the best checkpoint seen so far; unlike true early stopping, training still runs for all epochs.

In [14]:
n_epochs = 5
valid_loss_min = np.inf

train_losses = []
valid_losses = []

for epoch in range(n_epochs):
    train_loss = 0.0
    valid_loss = 0.0

    model.train()
    for images in train_loader:
        data = images['gt'].to(device)
        target = images['label'].to(device)
        # Clear the gradients of all optimized variables
        optimizer.zero_grad()
        # Forward pass
        output = model(data)
        # Compute the batch loss, then backpropagate
        loss = error(output, target)
        loss.backward()
        # Perform a single optimization step
        optimizer.step()
        train_loss += loss.item() * data.size(0)

    model.eval()
    # No gradients are needed for validation
    with torch.no_grad():
        for images in valid_loader:
            data = images['gt'].to(device)
            target = images['label'].to(device)
            # Forward pass
            output = model(data)
            # Calculate the batch loss
            loss = error(output, target)
            # Accumulate the validation loss, weighted by batch size
            valid_loss += loss.item() * data.size(0)

    # Average the losses over the number of samples
    train_loss /= len(train_loader.sampler)
    valid_loss /= len(valid_loader.sampler)

    train_losses.append(train_loss)
    valid_losses.append(valid_loss)

    print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
        epoch, train_loss, valid_loss))

    # Save a checkpoint whenever the validation loss improves
    if valid_loss <= valid_loss_min:
        print("Validation Loss decreased {:0.6f} -> {:0.6f}".format(valid_loss_min, valid_loss))
        valid_loss_min = valid_loss
        torch.save(model.state_dict(), 'best_model_so_far.pth')
Epoch: 0 	Training Loss: 4.116804 	Validation Loss: 4.030159
Validation Loss decreased inf -> 4.030159
Epoch: 1 	Training Loss: 3.891067 	Validation Loss: 3.763175
Validation Loss decreased 4.030159 -> 3.763175
Epoch: 2 	Training Loss: 3.794735 	Validation Loss: 3.759849
Validation Loss decreased 3.763175 -> 3.759849
Epoch: 3 	Training Loss: 3.792089 	Validation Loss: 3.758870
Validation Loss decreased 3.759849 -> 3.758870
Epoch: 4 	Training Loss: 3.791188 	Validation Loss: 3.757242
Validation Loss decreased 3.758870 -> 3.757242
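
The loop above saves the best checkpoint but always runs for every epoch. A patience-based rule, one common way to actually stop early, could be sketched as follows: training stops once the validation loss has failed to improve for a given number of consecutive epochs. The function name `early_stopping_epoch`, the `patience` parameter and the toy loss values are assumptions made for illustration; this rule is not part of the baseline.

```python
def early_stopping_epoch(valid_losses, patience):
    """Return the epoch at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses for illustration
toy_losses = [4.0, 3.8, 3.9, 3.85, 3.7]
print(early_stopping_epoch(toy_losses, patience=2))  # 3
print(early_stopping_epoch(toy_losses, patience=3))  # None

# The validation losses printed above decrease every epoch, so no stop is triggered
run_losses = [4.030159, 3.763175, 3.759849, 3.758870, 3.757242]
print(early_stopping_epoch(run_losses, patience=2))  # None
```

In a real training loop the same counter would sit next to the checkpointing code and trigger a `break` rather than return an epoch number.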

Predict on Validation

Now we run our trained model on the validation set and evaluate its predictions.

In [15]:
model.load_state_dict(torch.load('best_model_so_far.pth'))
model.eval()
correct = 0
total = 0
pred_list = []
correct_list = []
with torch.no_grad():
    for images in valid_loader:
        data = images['gt'].to(device)
        target = images['label'].to(device)
        outputs = model(data)
        # The predicted class is the index of the largest output
        _, predicted = torch.max(outputs, 1)
        total += target.size(0)
        pred_list.extend(predicted.cpu().numpy())
        correct_list.extend(target.cpu().numpy())
        correct += (predicted == target).sum().item()

print('Accuracy of the network on the validation images: %f %%' % (
    100 * correct / total))
Accuracy of the network on the validation images: 9.388412 %

Evaluate the Performance

We use the same metrics that will be used for the test set.
F1 score and log loss are the metrics for this challenge.

In [26]:
from sklearn.metrics import f1_score,precision_score,log_loss   
print("F1 score :",f1_score(correct_list,pred_list,average='micro'))
F1 score : 0.09388412017167383
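
Two remarks on these metrics, illustrated with a small pure-Python sketch. First, when every sample has exactly one label, micro-averaged F1 equals plain accuracy, which is why the score above matches the validation accuracy. Second, log loss is computed from predicted class probabilities, not from the argmax predictions, so the model outputs would need to be passed through a softmax first. The helper names `micro_f1`, `softmax` and `mean_log_loss` and the toy logits are assumptions made for illustration.

```python
import math


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multiclass predictions."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    # Each wrong prediction is one false positive (for the predicted class)
    # and one false negative (for the true class), so FP == FN overall.
    fp = fn = len(y_true) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_log_loss(y_true, probabilities):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[t]) for t, p in zip(y_true, probabilities)) / len(y_true)


# Toy 3-class example with made-up logits
y_true = [0, 1, 2, 2]
logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 1.5], [0.3, 0.2, 2.2], [1.0, 0.9, 0.8]]
probabilities = [softmax(row) for row in logits]
y_pred = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(y_pred, accuracy, micro_f1(y_true, y_pred))  # [0, 2, 2, 0] 0.5 0.5
print(mean_log_loss(y_true, probabilities))
```

For the real model, the probabilities could be obtained by applying a softmax over the class dimension of the network outputs before computing the log loss.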

Predict on test set

Time for the moment of truth! We predict on the test set and prepare the submission.

In [0]:
model.load_state_dict(torch.load('best_model_so_far.pth'))
model.eval()

preds = []
with torch.no_grad():
    for images in test_loader:
        data = images['gt'].to(device)
        outputs = model(data)
        # The predicted class is the index of the largest output
        _, predicted = torch.max(outputs, 1)
        preds.extend(predicted.cpu().numpy())

Save it in correct format

In [0]:
# Create Submission file        
df = pd.DataFrame(le.inverse_transform(preds),columns=['ClassName'])
df.to_csv('submission.csv',index=False)

To download the generated CSV from Colab, run the command below.

In [0]:
from google.colab import files
files.download('submission.csv')

Go to the challenge platform, participate in the challenge and submit submission.csv.