Hey everyone,
We know that computing resources can be hard to come by, especially for beginners, so we have written a new starter notebook that lets you train a MaskRCNN model directly on Colab.
This dataset and notebook correspond to the Food Recognition Challenge being held on AICrowd.
In this notebook, we will first do an analysis of the Food Recognition Dataset and then use MaskRCNN for training on the dataset.
The Challenge¶
- Given images of food, we are asked to provide instance segmentation of the food items in each image.
- The training data is provided in the COCO format, making it simple to load with the COCO data processors already available in popular libraries.
- The test set provided in the public dataset is similar to the validation set, but without annotations.
- The test set used after submission is much larger and contains private images on which every submission is evaluated.
- Participants have to submit their trained model along with trained weights. Immediately after submission, the AIcrowd grader picks up the submitted model and runs inference on the private test set using cloud GPUs.
- This requires users to structure their repositories and follow the provided paradigm for submission.
- The AIcrowd AutoGrader picks up the Dockerfile provided with the repository, builds it, and then mounts the tests folder in the container. Once inference is made, the final results are checked against the ground truth.
For more submission-related information, please check the AIcrowd challenge page and the starter kit.
The Notebook¶
- Installation of MaskRCNN
- Using the Matterport MaskRCNN library and making local inference with it
- Local evaluation using Matterport MaskRCNN
A bonus section on other resources to read is also added!
Dataset Download¶
Note: By downloading this data you are agreeing to the competition rules specified here
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-public-datasets/myfoodrepo/round-2/train.tar.gz
!wget -q https://s3.eu-central-1.wasabisys.com/aicrowd-public-datasets/myfoodrepo/round-2/val.tar.gz
!mkdir data
!mkdir data/val
!mkdir data/train
!tar -xf train.tar.gz -C data/train
!tar -xf val.tar.gz -C data/val
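Before going further, it is worth sanity-checking the download. A minimal sketch (assuming the archives unpack into an images folder plus an annotations.json, which is the layout the data loader below expects):
import json, glob, os

# Count images and annotations in each split to confirm extraction worked
for split in ["train", "val"]:
    image_files = glob.glob(os.path.join("data", split, "images", "*.jpg"))
    with open(os.path.join("data", split, "annotations.json")) as f:
        coco_json = json.load(f)
    print(split, ":", len(image_files), "images,",
          len(coco_json["annotations"]), "annotations,",
          len(coco_json["categories"]), "categories")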
Installation¶
# List the directories present in the extracted data
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
for dirname, _, filenames in os.walk('data/'):
    print(dirname)
import warnings
warnings.filterwarnings("ignore")
!pip install -q -U numpy==1.16.1
import os
import sys
import random
import math
import numpy as np
import cv2
import matplotlib.pyplot as plt
import json
from imgaug import augmenters as iaa
from tqdm import tqdm
import pandas as pd
import glob
!pip install -q tensorflow-gpu==1.13.1
import tensorflow as tf
tf.__version__
DATA_DIR = 'data'
# Directory to save logs and trained model
ROOT_DIR = 'working'
!git clone https://github.com/matterport/Mask_RCNN.git
os.chdir('Mask_RCNN')
!pip install -q -r requirements.txt
!python setup.py -q install
# Import Mask RCNN
sys.path.append(os.getcwd())  # To find local version of the library (we are inside Mask_RCNN after the chdir above)
from mrcnn.config import Config
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize
from mrcnn.model import log
!pip uninstall pycocotools -y
!pip install -q git+https://github.com/waleedka/coco.git#subdirectory=PythonAPI
from mrcnn import utils
import numpy as np
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
from pycocotools import mask as maskUtils
MaskRCNN¶
To train MaskRCNN, we have to define two things: a FoodChallengeDataset that implements the Dataset class of MaskRCNN, and a FoodChallengeConfig that implements the Config class.
The FoodChallengeDataset defines the functions that allow us to load the data.
The FoodChallengeConfig provides information like NUM_CLASSES, BACKBONE, etc.
class FoodChallengeDataset(utils.Dataset):
    def load_dataset(self, dataset_dir, load_small=False, return_coco=True):
        """ Loads dataset released for the AICrowd Food Challenge
            Params:
                - dataset_dir : root directory of the dataset (can point to the train/val folder)
                - load_small : Boolean value which signals if the annotations for all the images need to be loaded into the memory,
                               or if only a small subset of the same should be loaded into memory
        """
        self.load_small = load_small
        if self.load_small:
            annotation_path = os.path.join(dataset_dir, "annotation-small.json")
        else:
            annotation_path = os.path.join(dataset_dir, "annotations.json")

        image_dir = os.path.join(dataset_dir, "images")
        print("Annotation Path ", annotation_path)
        print("Image Dir ", image_dir)
        assert os.path.exists(annotation_path) and os.path.exists(image_dir)

        self.coco = COCO(annotation_path)
        self.image_dir = image_dir

        # Load all 61 food classes
        classIds = self.coco.getCatIds()

        # Load all images
        image_ids = list(self.coco.imgs.keys())

        # Register classes
        for _class_id in classIds:
            self.add_class("crowdai-food-challenge", _class_id, self.coco.loadCats(_class_id)[0]["name"])

        # Register images
        for _img_id in image_ids:
            assert(os.path.exists(os.path.join(image_dir, self.coco.imgs[_img_id]['file_name'])))
            self.add_image(
                "crowdai-food-challenge", image_id=_img_id,
                path=os.path.join(image_dir, self.coco.imgs[_img_id]['file_name']),
                width=self.coco.imgs[_img_id]["width"],
                height=self.coco.imgs[_img_id]["height"],
                annotations=self.coco.loadAnns(self.coco.getAnnIds(
                    imgIds=[_img_id],
                    catIds=classIds,
                    iscrowd=None)))

        if return_coco:
            return self.coco
    def load_mask(self, image_id):
        """ Loads instance masks for a given image
            This function converts masks from the COCO format to a
            bitmap of shape [height, width, instances]
            Params:
                - image_id : reference id for a given image
            Returns:
                masks : A bool array of shape [height, width, instances] with
                    one mask per instance
                class_ids : a 1D array (shape [instances]) of class ids of the corresponding instance masks
        """
        image_info = self.image_info[image_id]
        assert image_info["source"] == "crowdai-food-challenge"

        instance_masks = []
        class_ids = []
        annotations = self.image_info[image_id]["annotations"]
        # Build mask of shape [height, width, instance_count] and list
        # of class IDs that correspond to each channel of the mask.
        for annotation in annotations:
            class_id = self.map_source_class_id(
                "crowdai-food-challenge.{}".format(annotation['category_id']))
            if class_id:
                m = self.annToMask(annotation, image_info["height"],
                                   image_info["width"])
                # Some objects are so small that they're less than 1 pixel in area
                # and end up rounded out. Skip those objects.
                if m.max() < 1:
                    continue

                # Ignore the notion of "is_crowd" as specified in the COCO format,
                # as we do not have that annotation in the current version of the dataset
                instance_masks.append(m)
                class_ids.append(class_id)

        # Pack instance masks into an array
        if class_ids:
            mask = np.stack(instance_masks, axis=2)
            class_ids = np.array(class_ids, dtype=np.int32)
            return mask, class_ids
        else:
            # Call super class to return an empty mask
            return super(FoodChallengeDataset, self).load_mask(image_id)
    def image_reference(self, image_id):
        """Return a reference for a particular image
            Ideally this function is supposed to return a URL,
            but in this case we will simply return the image_id
        """
        return "crowdai-food-challenge::{}".format(image_id)

    # The following two functions are from pycocotools with a few changes.

    def annToRLE(self, ann, height, width):
        """
        Convert annotation which can be polygons or uncompressed RLE to RLE.
        :return: RLE (run-length encoding) of the mask
        """
        segm = ann['segmentation']
        if isinstance(segm, list):
            # polygon -- a single object might consist of multiple parts
            # we merge all parts into one mask rle code
            rles = maskUtils.frPyObjects(segm, height, width)
            rle = maskUtils.merge(rles)
        elif isinstance(segm['counts'], list):
            # uncompressed RLE
            rle = maskUtils.frPyObjects(segm, height, width)
        else:
            # rle
            rle = ann['segmentation']
        return rle

    def annToMask(self, ann, height, width):
        """
        Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
        :return: binary mask (numpy 2D array)
        """
        rle = self.annToRLE(ann, height, width)
        m = maskUtils.decode(rle)
        return m
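The annToRLE / annToMask pair above is the heart of the mask loading: polygon annotations are rasterized to COCO run-length encodings (RLE), multi-part objects are merged, and the result is decoded into a binary bitmap. A tiny self-contained illustration with a made-up polygon (not from the dataset):
from pycocotools import mask as maskUtils

# A hypothetical object made of one polygon: flattened [x1, y1, x2, y2, ...] pairs
poly = [[1.0, 1.0, 6.0, 1.0, 6.0, 6.0]]
rles = maskUtils.frPyObjects(poly, 8, 8)   # rasterize on an 8x8 canvas
rle = maskUtils.merge(rles)                # merge multi-part objects into one RLE
m = maskUtils.decode(rle)                  # (8, 8) binary numpy array
print(m)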
class FoodChallengeConfig(Config):
    """Configuration for training on data in MS COCO format.
    Derives from the base Config class and overrides values specific
    to the COCO dataset.
    """
    # Give the configuration a recognizable name
    NAME = "crowdai-food-challenge"

    # A GPU with 12GB memory can fit four 256x256 images.
    # Adjust down if you use a smaller GPU.
    IMAGES_PER_GPU = 4

    # Number of GPUs to train on (default is 1)
    GPU_COUNT = 1

    BACKBONE = 'resnet50'

    # Number of classes (including background)
    NUM_CLASSES = 62  # 1 Background + 61 classes

    STEPS_PER_EPOCH = 150
    VALIDATION_STEPS = 50
    LEARNING_RATE = 0.001
    IMAGE_MAX_DIM = 256
    IMAGE_MIN_DIM = 256
config = FoodChallengeConfig()
config.display()
You can change other values in the FoodChallengeConfig as well and try out different combinations for best results!
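For example, here is a hypothetical variant (values illustrative, not tuned) that trades memory and speed for capacity by using a deeper backbone and larger images:
class BiggerFoodChallengeConfig(FoodChallengeConfig):
    # Illustrative overrides; adjust to your GPU budget
    BACKBONE = 'resnet101'    # deeper backbone, needs more memory
    IMAGES_PER_GPU = 2
    IMAGE_MAX_DIM = 512
    IMAGE_MIN_DIM = 512
    STEPS_PER_EPOCH = 500

# config = BiggerFoodChallengeConfig()  # swap in instead of FoodChallengeConfig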
!mkdir pretrained
PRETRAINED_MODEL_PATH = os.path.join("pretrained", "mask_rcnn_coco.h5")
LOGS_DIRECTORY = os.path.join(ROOT_DIR, "logs")
if not os.path.exists(PRETRAINED_MODEL_PATH):
    utils.download_trained_weights(PRETRAINED_MODEL_PATH)
# Confirm that Keras sees the GPU and uses the TensorFlow backend
from keras import backend as K
K.tensorflow_backend._get_available_gpus()
import keras.backend
K = keras.backend.backend()
if K == 'tensorflow':
    keras.backend.common.image_dim_ordering()
model = modellib.MaskRCNN(mode="training", config=config, model_dir=LOGS_DIRECTORY)
model_path = PRETRAINED_MODEL_PATH
# Load the pretrained COCO weights, excluding the head layers:
# they depend on the number of classes, which differs between COCO
# and this dataset, so they are trained from scratch.
model.load_weights(model_path, by_name=True, exclude=[
    "mrcnn_class_logits", "mrcnn_bbox_fc",
    "mrcnn_bbox", "mrcnn_mask"])
dataset_train = FoodChallengeDataset()
dataset_train.load_dataset('/content/data/train', load_small=False)
dataset_train.prepare()
dataset_val = FoodChallengeDataset()
val_coco = dataset_val.load_dataset(dataset_dir='/content/data/val', load_small=False, return_coco=True)
dataset_val.prepare()
class_names = dataset_train.class_names
# If you don't have the correct classes here, there must be some error in your DatasetConfig
assert len(class_names)==62, "Please check DatasetConfig"
class_names
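Before training, it helps to eyeball a few ground-truth masks to confirm the dataset loads correctly. A quick sketch using the visualizer that ships with the Matterport library:
# Inspect a random training sample and its ground-truth masks
image_id = random.choice(dataset_train.image_ids)
image = dataset_train.load_image(image_id)
mask, class_ids = dataset_train.load_mask(image_id)
visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)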
Let's start training!¶
print("Training network")
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=15,
            layers='heads')
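If you still have GPU budget after the heads converge, a common follow-up (a sketch, not part of the original notebook) is to fine-tune all layers at a lower learning rate. Note that the epochs argument is cumulative, so this continues from epoch 15:
# Optional: fine-tune the whole network at a reduced learning rate
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=20,
            layers='all')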
model_path = model.find_last()
model_path
class InferenceConfig(FoodChallengeConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 62  # 1 Background + 61 classes
    IMAGE_MAX_DIM = 256
    IMAGE_MIN_DIM = 256
    NAME = "food"
    DETECTION_MIN_CONFIDENCE = 0
inference_config = InferenceConfig()
inference_config.display()
# Recreate the model in inference mode
model = modellib.MaskRCNN(mode='inference',
                          config=inference_config,
                          model_dir=ROOT_DIR)
# Load trained weights (fill in path to trained weights here)
assert model_path != "", "Provide path to trained weights"
print("Loading weights from ", model_path)
model.load_weights(model_path, by_name=True)
# Show a few examples of ground truth vs. predictions on the validation dataset
dataset = dataset_val
fig = plt.figure(figsize=(10, 30))

for i in range(4):
    image_id = random.choice(dataset.image_ids)
    original_image, image_meta, gt_class_id, gt_bbox, gt_mask =\
        modellib.load_image_gt(dataset_val, inference_config,
                               image_id, use_mini_mask=False)
    print(original_image.shape)
    plt.subplot(6, 2, 2*i + 1)
    visualize.display_instances(original_image, gt_bbox, gt_mask, gt_class_id,
                                dataset.class_names, ax=fig.axes[-1])
    plt.subplot(6, 2, 2*i + 2)
    results = model.detect([original_image])  # , verbose=1)
    r = results[0]
    visualize.display_instances(original_image, r['rois'], r['masks'], r['class_ids'],
                                dataset.class_names, r['scores'], ax=fig.axes[-1])
import json
with open('/content/data/val/annotations.json') as json_file:
    data = json.load(json_file)

# Build a mapping from the model's contiguous class indices back to the
# original COCO category ids used by the grader
d = {}
for x in data["categories"]:
    d[x["name"]] = x["id"]
id_category = [0]
for x in dataset.class_names[1:]:
    id_category.append(d[x])
#id_category
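This indirection matters because MaskRCNN assigns its own contiguous class indices in load order, while the grader expects the original category ids from annotations.json. A quick illustrative check (actual values depend on your annotation file):
# Model class index 1 vs. the original COCO category id it maps back to
print(dataset.class_names[1], "-> category id", id_category[1])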
import tqdm
import skimage
files = glob.glob(os.path.join('/content/data/val/test_images/images', "*.jpg"))
_final_object = []
for file in tqdm.tqdm(files):
    images = [skimage.io.imread(file)]
    #if(len(images)!= inference_config.IMAGES_PER_GPU):
    #    images = images + [images[-1]]*(inference_config.BATCH_SIZE - len(images))
    predictions = model.detect(images, verbose=0)
    #print(file)
    for _pred_idx, r in enumerate(predictions):
        # Image ids in the test set are encoded in the file names
        image_id = int(file.split("/")[-1].replace(".jpg", ""))
        for _idx, class_id in enumerate(r["class_ids"]):
            if class_id > 0:
                mask = r["masks"].astype(np.uint8)[:, :, _idx]
                bbox = np.around(r["rois"][_idx], 1)
                bbox = [float(x) for x in bbox]

                _result = {}
                _result["image_id"] = image_id
                _result["category_id"] = id_category[class_id]
                _result["score"] = float(r["scores"][_idx])
                _mask = maskUtils.encode(np.asfortranarray(mask))
                _mask["counts"] = _mask["counts"].decode("UTF-8")
                _result["segmentation"] = _mask
                # rois are [y1, x1, y2, x2]; COCO expects [x, y, width, height]
                _result["bbox"] = [bbox[1], bbox[0], bbox[3] - bbox[1], bbox[2] - bbox[0]]
                _final_object.append(_result)
print("Writing JSON...")
with open('/content/output.json', "w") as fp:
    fp.write(json.dumps(_final_object))
submission_file = json.loads(open("/content/output.json").read())
len(submission_file)
type(submission_file)
import random
import json
import numpy as np
import argparse
import base64
import glob
import os
from PIL import Image
from pycocotools.coco import COCO
GROUND_TRUTH_ANNOTATION_PATH = "/content/data/val/annotations.json"
ground_truth_annotations = COCO(GROUND_TRUTH_ANNOTATION_PATH)
submission_file = json.loads(open("/content/output.json").read())
results = ground_truth_annotations.loadRes(submission_file)
cocoEval = COCOeval(ground_truth_annotations, results, 'segm')
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()
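After summarize() runs, the twelve standard COCO metrics it prints are also available programmatically in cocoEval.stats (a numpy array), which is handy for logging results across experiments:
# cocoEval.stats holds the 12 metrics printed by summarize(), in order
print("AP @ IoU=0.50:0.95 :", cocoEval.stats[0])
print("AP @ IoU=0.50      :", cocoEval.stats[1])
print("AR @ IoU=0.50:0.95 :", cocoEval.stats[8])  # max 100 detections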
This, in addition to the existing Mask RCNN baseline repo, should allow you to plug and play models for easy submission and experimentation.
As an alternative to Colab, Paperspace is a great option. Their Gradient community notebooks give you free cloud GPUs and CPUs and also provide persistent storage, letting you save models and resume training after the session expires.
Regards
AICrowd Team