Many corrupted images in the train dataset

#1

For example:

train2/class-771/48a81c7c593b4e8866935480417fa63e.jpg
train2/class-697/14051479c062bf6e566a65403f958c66.jpg
train2/class-78/81ca3cbe226578649165a94cbec278c6.jpg
train2/class-4/951f55af6c9481013663d15544b0cfc8.jpg

I deleted around 100 images and there are still more…

#2

Please check this link in the forum (you may have to scroll all the way up) for how to remove corrupt images from the dataset.

Best of luck!
Santhosh

#3

I used another solution:

find . -name "*.jpg" -size -16k -delete

This just removes small or empty files (anything under about 16 KB).
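
A rough Python version of the same size filter, as a sketch only: it lists the small files first so they can be checked before anything is deleted. The names `find_small_jpgs` and `delete_files` and the 16 KiB default are illustrative assumptions, and the cutoff is not exactly what `find -size -16k` uses, since `find` rounds file sizes up to whole units.

```python
from pathlib import Path


def find_small_jpgs(root, max_bytes=16 * 1024):
    """Return .jpg files under root smaller than max_bytes (hypothetical helper)."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*.jpg")
        if p.is_file() and p.stat().st_size < max_bytes
    )


def delete_files(paths):
    """Delete the given files and return how many were removed."""
    removed = 0
    for p in paths:
        p.unlink()
        removed += 1
    return removed
```

Calling `find_small_jpgs("train2")` and printing the result before passing it to `delete_files` gives a dry run, which the one-line `find ... -delete` does not. Note that a small file is not necessarily corrupt, so this can also remove valid images.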

#4

I used fastai’s built-in function, which worked very smoothly:

https://docs.fast.ai/vision.data.html#verify_images

#5

@santhosh_shetty I used fastai’s built-in function, but it did not find any corrupt images.

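# Assumes fastai v1, with `path` and `data` already defined
from fastai.vision import verify_images

for c in data.classes:
    print(c)
    verify_images(path/c, delete=True)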
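
In case it helps anyone in the same situation, here is a minimal stdlib-only sketch of a different, cruder check. It flags `.jpg` files that are empty, do not start with the JPEG start-of-image marker (FF D8), or do not end with the end-of-image marker (FF D9). This is a heuristic and not what `verify_images` does: a file can pass it and still fail to decode, and a valid JPEG with extra trailing bytes would be flagged. `looks_truncated` and `find_suspect_jpgs` are hypothetical names.

```python
from pathlib import Path

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def looks_truncated(path):
    """Heuristic: True if the file is empty or lacks the JPEG start/end markers."""
    data = Path(path).read_bytes()
    return not (data.startswith(SOI) and data.endswith(EOI))


def find_suspect_jpgs(root):
    """Return .jpg files under root that fail the marker check."""
    return sorted(
        p for p in Path(root).rglob("*.jpg")
        if p.is_file() and looks_truncated(p)
    )
```

It is probably worth opening a few of the flagged files by hand before deleting anything, since the check only looks at the first and last two bytes.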

#6

You can use the list of images given above to delete them.
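
A minimal sketch of that, assuming the paths are relative to the dataset root and collected in a Python list. `delete_listed` is a hypothetical helper, and the list contains only the four paths quoted in the first post, not the full set of corrupt files.

```python
from pathlib import Path

# Paths quoted in the first post, relative to the dataset root.
CORRUPT = [
    "train2/class-771/48a81c7c593b4e8866935480417fa63e.jpg",
    "train2/class-697/14051479c062bf6e566a65403f958c66.jpg",
    "train2/class-78/81ca3cbe226578649165a94cbec278c6.jpg",
    "train2/class-4/951f55af6c9481013663d15544b0cfc8.jpg",
]


def delete_listed(root, rel_paths):
    """Delete each listed file under root; return the paths that were not found."""
    missing = []
    for rel in rel_paths:
        p = Path(root) / rel
        if p.is_file():
            p.unlink()
        else:
            missing.append(rel)
    return missing
```

Running `delete_listed(".", CORRUPT)` from the folder that contains `train2` removes those files and returns any paths that were already gone.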