Labelling errors in the train set

Christopher_Kreis · May 25, 2021, 11:39am

Hello,
I tried to visualize the annotated boxes to get a better understanding of the dataset and think that I’ve found some labelling errors.
The relevant images are included below:

etienne_david · May 31, 2021, 8:53am

Hello,

There are two kinds of errors:

The first two images are old export errors. Despite our vigilance and multiple review passes, a few of these mistakes may remain. We will try to address them after the challenge.
The last two images are a bug that we will correct. If you look at train.csv, these images appear on two rows instead of one. The second row contains the right label. We will upload an amended version of the train dataset today or tomorrow!
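
Until the amended file is available, the duplicated rows can be worked around locally. The snippet below is only a minimal sketch: it keeps the last row seen for each image, following the note above that the second row holds the right label. The column names `image_name` and `BoxesString`, the helper `keep_last_row_per_image`, and the sample values are assumptions for illustration, not taken from the actual train.csv.

```python
import csv
import io


def keep_last_row_per_image(rows, name_col="image_name"):
    """Return one row per image, keeping the last row seen for each name.

    `name_col` is an assumed column name; adjust it to the real header.
    """
    latest = {}
    for row in rows:
        # A later row with the same name overwrites the earlier one,
        # while the dict keeps the position of the first occurrence.
        latest[row[name_col]] = row
    return list(latest.values())


# Toy stand-in for train.csv (made-up values, assumed column names).
sample = io.StringIO(
    "image_name,BoxesString\n"
    "a.png,1 2 3 4\n"
    "b.png,0 0 5 5\n"
    "b.png,10 10 20 20\n"
)
cleaned = keep_last_row_per_image(csv.DictReader(sample))
for row in cleaned:
    print(row["image_name"], row["BoxesString"])
```

To run it on the real file, pass `csv.DictReader(open("train.csv", newline=""))` instead of the toy sample, after checking the header names.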

Thanks again for spotting the mistakes!

Best,

Etienne

ksnxr · June 1, 2021, 1:25pm

Looking forward to using the amended version on WILDS!

etienne_david · June 1, 2021, 3:07pm

Thanks, and sorry again for the inconvenience. However, I don’t think it will hurt the final score that much!

ksnxr · June 6, 2021, 8:29am

Hi! It seems like the data downloaded with the AIcrowd CLI still has the labelling errors?

ksnxr · June 7, 2021, 9:52am

There are 2 repeated image names in train.csv and 1 repeated image name in submission.csv
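
For anyone who wants to check their own download, here is a rough sketch of how repeated names can be counted. The column name `image_name`, the helper `repeated_image_names`, and the sample rows are assumptions for illustration; the real headers may differ.

```python
import csv
import io
from collections import Counter


def repeated_image_names(rows, name_col="image_name"):
    """Map each image name that occurs more than once to its row count."""
    counts = Counter(row[name_col] for row in rows)
    return {name: n for name, n in counts.items() if n > 1}


# Toy stand-in for a CSV file (made-up values, assumed column name).
sample = io.StringIO(
    "image_name,BoxesString\n"
    "a.png,1 2 3 4\n"
    "b.png,0 0 5 5\n"
    "b.png,10 10 20 20\n"
)
repeated = repeated_image_names(csv.DictReader(sample))
print(repeated)
```

The same function can be pointed at train.csv or submission.csv by replacing the toy sample with `csv.DictReader(open(path, newline=""))`.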