Getting Started Code for SOUSEN Challenge on AIcrowd¶
Author : Shubham Gupta¶
Installing AIcrowd CLI and Authentication¶
This makes it easy to download the dataset and submit directly from this notebook. Do not forget to join the challenge and accept its rules before running this notebook.
In [ ]:
!pip install git+https://gitlab.aicrowd.com/yoogottamk/aicrowd-cli.git
API_KEY = "" #Input your API key here, you can get it from your profile page.
!aicrowd login --api-key $API_KEY
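To avoid saving a key inside a shared notebook, one option is to read it from an environment variable instead of typing it into the cell. This is a minimal sketch, assuming a variable named `AICROWD_API_KEY` (a hypothetical name chosen for this example, not something the AIcrowd CLI reads on its own):

```python
import os

def load_api_key(env_var="AICROWD_API_KEY"):
    """Return the key stored in env_var, or raise if it is missing or blank."""
    # env_var is a hypothetical name used only for this sketch.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the login cell.")
    return key
```

In the login cell, `API_KEY = load_api_key()` could then replace the hard-coded string.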
Downloading the dataset¶
In [ ]:
!aicrowd dataset download -c sousen >/dev/null
In [ ]:
!unzip train.zip
!unzip val.zip
!unzip test.zip
Install Necessary Packages 📚¶
In [ ]:
!pip install --upgrade fastai
!pip install librosa
Import packages¶
In [ ]:
# import fastai
import os
import librosa
import pandas as pd
import cv2
import numpy as np
from fastai import *
from fastai.vision import *
from fastai.vision.data import *
from fastai.vision.all import *
In [ ]:
train_df = pd.read_csv("train.csv")[:50]
train_df.head()
Preprocessing Data¶
Instead of using the audio signal directly, we convert each audio clip to a spectrogram image and use those images to train a convolutional neural network. This is just a simple approach; you are encouraged to try different preprocessing techniques.
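To get a feel for the size of the images this produces, the sketch below estimates the spectrogram shape for a clip of a given length. It assumes librosa's default settings (resampling to 22,050 Hz in `librosa.load`, and 128 mel bands with a hop of 512 samples and centered frames in `librosa.feature.melspectrogram`). The helper name `melspec_shape` is made up for this example.

```python
def melspec_shape(duration_secs, sr=22050, hop_length=512, n_mels=128):
    """Estimate (rows, columns) of a mel spectrogram for a clip.

    Rows are mel bands and columns are time frames. With centered framing,
    the frame count is 1 + n_samples // hop_length.
    """
    n_samples = int(duration_secs * sr)
    n_frames = 1 + n_samples // hop_length
    return n_mels, n_frames

print(melspec_shape(1.0))  # (128, 44)
print(melspec_shape(5.0))  # (128, 216)
```

Because clips differ in length, the image widths differ too, which is why the data loaders resize every image to a fixed size before training.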
In [ ]:
training_sound_paths = os.listdir("train")[:50]
train_df['name'] = "./spec_imgs/"+train_df['wav_id'].astype(str)+".jpg"
In [ ]:
os.makedirs("spec_imgs", exist_ok=True)

def mono_to_color(X, mean=None, std=None, norm_max=None, norm_min=None, eps=1e-6):
    """Standardize a 2-D array and rescale it to an 8-bit grayscale image."""
    # Standardize
    mean = mean or X.mean()
    X = X - mean
    std = std or X.std()
    Xstd = X / (std + eps)
    _min, _max = Xstd.min(), Xstd.max()
    norm_max = norm_max or _max
    norm_min = norm_min or _min
    if (_max - _min) > eps:
        # Normalize to [0, 255]
        V = Xstd
        V[V < norm_min] = norm_min
        V[V > norm_max] = norm_max
        V = 255 * (V - norm_min) / (norm_max - norm_min)
        V = V.astype(np.uint8)
    else:
        # Just zero
        V = np.zeros_like(Xstd, dtype=np.uint8)
    return V

# Name each image after its wav_id so it matches train_df['name'].
# Assumes the file name without its extension is the wav_id.
wanted_ids = set(train_df['wav_id'].astype(str))
for sound_path in os.listdir("train"):
    wav_id = os.path.splitext(sound_path)[0]
    if wav_id not in wanted_ids:
        continue  # skip clips outside the subset loaded into train_df
    y, sr = librosa.load(f"train/{sound_path}")
    M = librosa.feature.melspectrogram(y=y, sr=sr)  # mel-scaled power spectrogram
    M = librosa.power_to_db(M)                      # power -> decibels
    M = mono_to_color(M)                            # scale to uint8 in [0, 255]
    cv2.imwrite(f"./spec_imgs/{wav_id}.jpg", M, [int(cv2.IMWRITE_JPEG_QUALITY), 85])
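Before building the data loaders, it can help to confirm that every image path listed in the `name` column actually exists on disk. This is a small sketch using only the standard library; `find_missing` is a hypothetical helper name:

```python
import os

def find_missing(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]
```

Calling `find_missing(train_df['name'])` should return an empty list; any entries point to clips whose spectrogram image was never written.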
In [ ]:
train_df.label = train_df.label.astype(str)
train_df.head(1)
In [ ]:
# Refer to columns by name so the call does not depend on column order
dls = ImageDataLoaders.from_df(train_df, path='.', label_col='label', fn_col='name', item_tfms=Resize(224), bs=5, num_workers=0)
In [ ]:
dls.show_batch()
Training Model¶
In [ ]:
learn = cnn_learner(dls, resnet34, metrics=accuracy)
learn.fine_tune(1)
In [ ]:
learn.show_results()
Generating predictions¶
In [ ]:
os.makedirs("test_spec_imgs", exist_ok=True)
test_sound_paths = os.listdir("test")
test_img_paths = []
for sound_path in test_sound_paths:
    # Name each image after its source file so the wav_id can be recovered later.
    # Assumes the file name without its extension is the wav_id.
    wav_id = os.path.splitext(sound_path)[0]
    y, sr = librosa.load(f"test/{sound_path}")
    M = librosa.feature.melspectrogram(y=y, sr=sr)
    M = librosa.power_to_db(M)
    M = mono_to_color(M)
    img_path = f"./test_spec_imgs/{wav_id}.jpg"
    cv2.imwrite(img_path, M, [int(cv2.IMWRITE_JPEG_QUALITY), 85])
    test_img_paths.append(img_path)
In [ ]:
test_predictions = []
for test_img_path in test_img_paths:
    # learn.predict returns (decoded label, label index, probabilities)
    prediction = learn.predict(test_img_path)
    test_predictions.append(int(prediction[0]))
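A quick look at how the predictions are distributed can reveal a model that has collapsed to a single class, which is plausible after one epoch on 50 clips. The sketch below counts the predicted digits and maps them to names, assuming the digits 0, 1, and 2 denote positive, neutral, and negative reviews respectively; `summarize_predictions` is a hypothetical helper.

```python
from collections import Counter

# Assumed mapping: 0 positive, 1 neutral, 2 negative.
LABEL_NAMES = {0: "positive", 1: "neutral", 2: "negative"}

def summarize_predictions(predictions):
    """Count predictions per sentiment name, including classes never predicted."""
    counts = Counter(predictions)
    return {name: counts.get(label, 0) for label, name in LABEL_NAMES.items()}

print(summarize_predictions([0, 2, 2, 1, 2]))
# {'positive': 1, 'neutral': 1, 'negative': 3}
```

In the notebook, `summarize_predictions(test_predictions)` would give the same breakdown for the test set.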
Creating the Submission File¶
In [ ]:
# Recover each wav_id from its image file name, e.g. "./test_spec_imgs/12.jpg" -> 12
wav_ids = [int(os.path.splitext(os.path.basename(p))[0]) for p in test_img_paths]
submission = pd.DataFrame({"wav_id": wav_ids, "label": test_predictions})
submission
submission.to_csv("submission.csv", index=False)
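Before uploading, a short check of the file can catch formatting mistakes. The sketch below assumes the expected format is a `wav_id` column followed by a `label` column holding a digit from 0 to 2, as written by the cell above; `check_submission` is a hypothetical helper, not part of the AIcrowd CLI.

```python
import csv

def check_submission(path, valid_labels=("0", "1", "2")):
    """Return a list of problems found in a submission CSV (empty if none)."""
    problems = []
    seen = set()
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        # Assumed header: exactly wav_id, label in that order.
        if reader.fieldnames != ["wav_id", "label"]:
            return [f"unexpected header: {reader.fieldnames}"]
        # Data rows start on line 2, after the header.
        for line_no, row in enumerate(reader, start=2):
            if row["wav_id"] in seen:
                problems.append(f"line {line_no}: duplicate wav_id {row['wav_id']}")
            seen.add(row["wav_id"])
            if row["label"] not in valid_labels:
                problems.append(f"line {line_no}: invalid label {row['label']!r}")
    return problems
```

An empty result from `check_submission("submission.csv")` means the file passed these basic checks; it does not guarantee that the platform will accept it.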
To download the generated CSV in Colab, run the cell below¶
In [ ]:
try:
    from google.colab import files
    files.download('submission.csv')
except ImportError:
    print("Option only available in Google Colab")
Well done! 👍 We are all set to make a submission and see your name on the leaderboard. Navigate to the challenge page and make one.¶