Wrong data


I checked the dataset (DrawnUI2021: Screenshot) and found some incorrect data:

  1. blank input images (e.g. 6c955270-f1a3-11ea-935a-7b284523f48e_3.jpg, 6d0922e0-f1a3-11ea-935a-7b284523f48e_5.jpg)
  2. lots of wrong bounding boxes (e.g. 757d4a50-f1a3-11ea-935a-7b284523f48e.jpg, 4da2ba10-f171-11ea-9c01-ddde32c713c7.jpg, 4b678460-f1a3-11ea-935a-7b284523f48e.jpg) - Note: sometimes the bounding box extends beyond the width of the image
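
Both kinds of problems could probably be flagged automatically. Below is a minimal sketch, not the organizers' tooling: it assumes boxes are given as (x_min, y_min, x_max, y_max) in pixels and that an image is available as a flat sequence of grayscale values. The dataset's actual annotation format may differ, and the names `box_problems` and `is_blank` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_problems(box: Box, img_w: int, img_h: int) -> List[str]:
    """Return a list of human-readable problems found for one bounding box."""
    x_min, y_min, x_max, y_max = box
    problems = []
    if x_max <= x_min or y_max <= y_min:
        problems.append("empty or inverted box")
    if x_min < 0 or y_min < 0:
        problems.append("starts outside the image")
    if x_max > img_w:
        problems.append("extends past the right edge")
    if y_max > img_h:
        problems.append("extends past the bottom edge")
    return problems


def is_blank(pixels: Iterable[int], tolerance: int = 0) -> bool:
    """True if all grayscale values lie within `tolerance` of each other."""
    values = list(pixels)
    if not values:
        return True  # an image with no pixels is treated as blank
    return max(values) - min(values) <= tolerance


if __name__ == "__main__":
    # A box whose right edge (700) exceeds an image width of 640.
    print(box_problems((10, 10, 700, 50), 640, 480))
    # A uniformly white 4x4 image.
    print(is_blank([255] * 16))
```

A non-zero `tolerance` may be useful for JPEG files, where compression noise can keep a visually blank image from being perfectly uniform.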

I randomly selected 300 training images, and 61 of them were very bad.

Can you please check the dataset?

Thank You & Best Regards!
Jiří Vyskočil

Post Scriptum:
I did not count questionable images! By that I mean images with fewer annotations than there could be (e.g. 3c0d76a0-f1a3-11ea-935a-7b284523f48e.jpg) or with bounding boxes bigger than necessary (3b27d190-f1a3-11ea-935a-7b284523f48e.jpg).

Analyzed data:


cc: @dimitri.fichou for inputs here

Thanks for the feedback, and sorry for the late answer; your message landed in the spam folder…
We expect some wrong data points in the training set; 20% seems high, but I am not surprised. We only screened the training images quickly.
Bounding boxes outside the image and blank input images should not be present, though. We will take a closer look and get back to you.

Hi @vyskocj,
It seems we have more problems than expected and the data really is wrong.
You are the only one who has downloaded it so far, so I think it's not too late to correct it.
Can you wait until tomorrow so that we can run more checks on the dataset and re-upload a corrected version?
I am very sorry for the inconvenience.