Baseline - JIGSAW

ashivani · October 6, 2020, 8:42am

Notebook

Getting Started Code for JIGSAW Challenge on AIcrowd¶

Author : Sharada Mohanty¶

This baseline creates a blank image of the desired size for each puzzle and pastes the puzzle pieces at random positions.

Install Necessary Packages 📚¶

In [ ]:

!pip install numpy
!pip install pandas
!pip install pillow tqdm

Download Data¶

The first step is to download the data: the puzzle pieces (puzzles.tar.gz) and the metadata file (metadata.csv). This baseline does not train a model; it generates one solved image per puzzle, and those images are what we submit.

In [ ]:

!rm -rf data
!mkdir data
!wget https://datasets.aicrowd.com/default/aicrowd-practice-challenges/public/jigsaw/v0.1/puzzles.tar.gz
!wget https://datasets.aicrowd.com/default/aicrowd-practice-challenges/public/jigsaw/v0.1/metadata.csv

!mkdir data/puzzles   
!tar -C data/puzzles -xvzf puzzles.tar.gz 

!mv metadata.csv data/metadata.csv
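
After extraction it can help to confirm that the data landed where the rest of the notebook expects it. The sketch below assumes each puzzle sits in its own subdirectory of data/puzzles named after its puzzle_id and holds PNG pieces, which is the layout the baseline loop relies on. The helper name count_pieces is hypothetical.

```python
import glob
import os


def count_pieces(puzzles_dir):
    """Return {puzzle_id: number of PNG pieces} for each subdirectory.

    Assumption: every puzzle is a subdirectory named after its puzzle_id.
    """
    counts = {}
    for entry in sorted(os.listdir(puzzles_dir)):
        puzzle_dir = os.path.join(puzzles_dir, entry)
        if os.path.isdir(puzzle_dir):
            counts[entry] = len(glob.glob(os.path.join(puzzle_dir, "*.png")))
    return counts


# Only runs when the data has actually been downloaded and extracted.
if os.path.isdir("data/puzzles"):
    piece_counts = count_pieces("data/puzzles")
    print("puzzles found:", len(piece_counts))
```

A puzzle directory with zero pieces would suggest the archive was extracted into a different location than expected.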

Import packages¶

In [ ]:

import pandas as pd
import numpy as np
from PIL import Image
import glob
import tempfile
import os 
import random
import tqdm
import tarfile
%matplotlib inline

Load Data¶

In [ ]:

PUZZLES_DIRECTORY = "data/puzzles"
METADATA_FILE = "data/metadata.csv"

OUTPUT_PATH = "data/submission.tar.gz"
metadata_df = pd.read_csv(METADATA_FILE)
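
The loop further down reads the columns puzzle_id, width and height from the metadata, so a quick header check can catch a bad download early. This is a minimal sketch that uses only the standard library csv module rather than pandas; the helper name missing_columns and the REQUIRED_COLUMNS tuple are hypothetical.

```python
import csv
import os

# Columns the baseline loop reads from metadata.csv.
REQUIRED_COLUMNS = ("puzzle_id", "width", "height")


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return [name for name in required if name not in header]


# Only runs when the metadata file is present.
if os.path.exists("data/metadata.csv"):
    print("missing columns:", missing_columns("data/metadata.csv"))
```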

Create a temporary directory to store the solved images.

In [ ]:

TEMP_SUBMISSION_DIR = tempfile.TemporaryDirectory()
TEMP_SUBMISSION_DIR_PATH = TEMP_SUBMISSION_DIR.name
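
One thing to keep in mind: a tempfile.TemporaryDirectory is deleted, together with its contents, when it is cleaned up, so the submission archive has to be written while the directory still exists. The sketch below illustrates that lifetime with the context-manager form; the file name example.jpg is just a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    marker = os.path.join(tmp_dir, "example.jpg")
    with open(marker, "wb") as handle:
        handle.write(b"placeholder")
    # The file is reachable while the directory is alive.
    exists_inside = os.path.exists(marker)

# Leaving the "with" block removes the directory and everything in it.
exists_after = os.path.exists(marker)
print(exists_inside, exists_after)
```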

This is a very naive approach: create a new image of the desired size and paste each individual puzzle piece at a random location.

In [ ]:

for index, row in tqdm.tqdm(metadata_df.iterrows(), total=metadata_df.shape[0]):
    # Read the puzzle ID and the target image width and height from metadata.csv.
    puzzle_id = row["puzzle_id"]
    image_width = int(row["width"])    # cast to plain int for Image.new
    image_height = int(row["height"])  # cast to plain int for Image.new

    puzzle_directory = os.path.join(
        PUZZLES_DIRECTORY,
        str(puzzle_id)
    )
    solved_puzzle_im = Image.new("RGBA",
        (image_width, image_height)
    ) # Initially create RGBA images, and then drop A channel later

    for _puzzle_piece in glob.glob(os.path.join(puzzle_directory, "*.png")):
        puzzle_piece_im = Image.open(_puzzle_piece)
        pp_width, pp_height = puzzle_piece_im.size

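        # Pick a random top-left corner that keeps the piece inside the image
        # (clamped to 0 if the piece is larger than the image).
        random_x = random.randint(0, max(0, image_width - pp_width))
        random_y = random.randint(0, max(0, image_height - pp_height))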

        solved_puzzle_im.paste(puzzle_piece_im, (random_x, random_y))

        del puzzle_piece_im

    solved_puzzle_im.convert("RGB").save(
        os.path.join(TEMP_SUBMISSION_DIR_PATH, "{}.jpg".format(str(puzzle_id)))
    )
    del solved_puzzle_im
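
The placement step in the loop above can be isolated into a small helper, which makes it easy to seed and to check on its own. This is only a sketch: the name random_top_left is hypothetical, and clamping the offset to 0 for a piece larger than the image is a choice made here, not something the challenge specifies.

```python
import random


def random_top_left(image_size, piece_size, rng=random):
    """Pick a top-left corner so that the piece fits inside the image.

    If a piece is larger than the image along an axis, the offset on that
    axis is clamped to 0 (an assumption of this sketch).
    """
    image_width, image_height = image_size
    piece_width, piece_height = piece_size
    max_x = max(0, image_width - piece_width)
    max_y = max(0, image_height - piece_height)
    # randint is inclusive on both ends, so max_x and max_y are valid offsets.
    return rng.randint(0, max_x), rng.randint(0, max_y)


rng = random.Random(0)  # seeded so repeated runs give the same placements
x, y = random_top_left((640, 480), (100, 80), rng)
print(x, y)
```

Passing a seeded random.Random instance keeps the output reproducible between runs, which is handy when comparing submissions.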

Visualize the generated Image¶

In [ ]:

img_path = os.path.join(TEMP_SUBMISSION_DIR_PATH, "2.jpg")  # assumes a puzzle with ID 2 exists
img = Image.open(img_path)
img

Create the tar file with all the solved images and store it at OUTPUT_PATH.¶

In [ ]:

with tarfile.open(OUTPUT_PATH, mode="w:gz") as tar_file:
    for _filepath in glob.glob(os.path.join(TEMP_SUBMISSION_DIR_PATH, "*.jpg")):
        print(_filepath)
        _, filename = os.path.split(_filepath)
        tar_file.add(_filepath, arcname=filename)

print("Wrote output file to : ", OUTPUT_PATH)
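
Before uploading, it may be worth reading the archive back to confirm it contains one flat file name per puzzle, matching the arcname=filename used above. A minimal sketch, with the hypothetical helper name list_archive:

```python
import os
import tarfile


def list_archive(path):
    """Return the sorted member names of a .tar.gz archive."""
    with tarfile.open(path, mode="r:gz") as tar_file:
        return sorted(tar_file.getnames())


# Only runs when the submission archive has been written.
if os.path.exists("data/submission.tar.gz"):
    names = list_archive("data/submission.tar.gz")
    print(len(names), "files; first few:", names[:5])
```

The number of members should equal the number of rows in metadata.csv if every puzzle produced an image.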

To download the generated tar.gz in Colab, run the command below¶

In [ ]:

try:
    from google.colab import files
    files.download(OUTPUT_PATH)
except ImportError:
    # google.colab is only importable inside a Colab runtime.
    print("Option only available in Google Colab")

Well Done! 👍 You are all set to make a submission and see your name on the leaderboard. Let's navigate to the challenge page and make one.¶
