Mistakes with the annotations

I have noticed a few mistakes in the bounding boxes of the dataset. Is this how it is meant to be?

Example: Image from train set.
{'id': 6545, 'file_name': '006545.jpg', 'width': 426, 'height': 426}

You can visualize the first 40 images here.
The above example is at index 29.
https://app.activeloop.ai/sainikhileshreddy/food-recognition-2022-train

Yes, I have seen a similar problem with boxes rotated by 90 degrees.
Unfortunately, the dataset is far from ideal. Another example: in some ground truth labels only a minor class is annotated, while two major objects are left unlabeled (even though their classes are present in the class list).
Nevertheless, such errors are not very frequent in the dataset and thus not so important, so we could use it as is…

Thanks @Mykola_Lavreniuk for the response.

After digging for a while, I found that the order of the box coordinates was different from what I expected.
Earlier I thought the boxes were [x, y, width, height], but after some debugging, the boxes turn out to be in the order [y, x, height, width].
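
To make the reordering concrete, here is a minimal sketch that converts a box from the [y, x, height, width] order described above into [x, y, width, height]. The function name `yxhw_to_xywh` is hypothetical, and the sketch assumes the observed ordering holds for every box, which this post has not verified.

```python
def yxhw_to_xywh(box):
    """Reorder a box from [y, x, height, width] to [x, y, width, height].

    Assumption: the input follows the ordering observed while debugging;
    nothing here checks that this is true for a given annotation.
    """
    y, x, height, width = box
    return [x, y, width, height]


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    print(yxhw_to_xywh([20, 10, 40, 30]))
```

Applying the function twice returns the original box, so the same helper works in both directions.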

This is the output for the above image.

Thanks, @SaiNikhileshReddy for sharing the example! :raised_hands:

@Mykola_Lavreniuk: Please let us know in case you are still facing any issues with the annotations! :mag:

I wonder if the EXIF orientation data was accidentally stripped from these images. There are plenty for which the height and width in annotations.json are flipped relative to the actual image's height and width. For example, 056091.jpg.

The annotation json says:

{'id': 56091, 'file_name': '056091.jpg', 'width': 3456, 'height': 4608}

However the unrotated image has a width of 4608 and height of 3456.

The EXIF data doesn't include any orientation details:

identify -format '%[orientation]' detectron_datasets/food/train/056091.jpg
Undefined

This is easy to correct for images which have differing width and height, however I cannot correct square images, as their dimensions don’t clue me in to whether they’re rotated. Additionally it’s unclear whether the rotation for rectangles is 90 degrees or 270 degrees.
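
The check for rectangular images can be sketched roughly as below. It assumes you already have each file's actual (width, height), for instance read with an image library, and it only compares those with the annotated values. The function name `find_swapped_dims` and the `actual_sizes` mapping are hypothetical, not part of the dataset tooling.

```python
def find_swapped_dims(image_entries, actual_sizes):
    """Return file names whose annotated width/height are swapped.

    image_entries: list of dicts shaped like the "images" entries in
                   annotations.json (file_name, width, height).
    actual_sizes:  hypothetical mapping file_name -> (width, height)
                   measured from the image files themselves.
    Square images are skipped: a swap cannot be seen in their dimensions.
    """
    swapped = []
    for entry in image_entries:
        ann_w, ann_h = entry["width"], entry["height"]
        if ann_w == ann_h or entry["file_name"] not in actual_sizes:
            continue
        real_w, real_h = actual_sizes[entry["file_name"]]
        if (ann_w, ann_h) == (real_h, real_w):
            swapped.append(entry["file_name"])
    return swapped


if __name__ == "__main__":
    entries = [
        {"id": 56091, "file_name": "056091.jpg", "width": 3456, "height": 4608},
        {"id": 6545, "file_name": "006545.jpg", "width": 426, "height": 426},
    ]
    sizes = {"056091.jpg": (4608, 3456), "006545.jpg": (426, 426)}
    print(find_swapped_dims(entries, sizes))
```

As noted above, this cannot tell a 90 degree rotation from a 270 degree one, and it misses 180 degree rotations entirely.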

@shivam do you know if a mistake was made resulting in the exif data being stripped in the dataset?

Further details

Other images which have opposite (height, width) of annotations.json:

{'id': 8617, 'file_name': '008617.jpg', 'width': 3024, 'height': 4032} 
{'id': 8619, 'file_name': '008619.jpg', 'width': 3024, 'height': 4032} 
{'id': 8626, 'file_name': '008626.jpg', 'width': 3024, 'height': 4032} 
{'id': 8620, 'file_name': '008620.jpg', 'width': 3024, 'height': 4032} 
{'id': 8817, 'file_name': '008817.jpg', 'width': 3024, 'height': 4032} 
{'id': 8869, 'file_name': '008869.jpg', 'width': 3024, 'height': 4032} 
{'id': 8919, 'file_name': '008919.jpg', 'width': 3024, 'height': 4032} 
{'id': 8934, 'file_name': '008934.jpg', 'width': 3024, 'height': 4032} 
{'id': 11967, 'file_name': '011967.jpg', 'width': 3024, 'height': 4032}
{'id': 12045, 'file_name': '012045.jpg', 'width': 3024, 'height': 4032}
{'id': 13989, 'file_name': '013989.jpg', 'width': 3024, 'height': 4032}
{'id': 17312, 'file_name': '017312.jpg', 'width': 2448, 'height': 3264}
{'id': 21923, 'file_name': '021923.jpg', 'width': 2322, 'height': 4128}
{'id': 23295, 'file_name': '023295.jpg', 'width': 2322, 'height': 4128}
{'id': 23296, 'file_name': '023296.jpg', 'width': 2322, 'height': 4128}
{'id': 23975, 'file_name': '023975.jpg', 'width': 2322, 'height': 4128}
{'id': 23976, 'file_name': '023976.jpg', 'width': 2322, 'height': 4128}
{'id': 8618, 'file_name': '008618.jpg', 'width': 3024, 'height': 4032} 
{'id': 8621, 'file_name': '008621.jpg', 'width': 3024, 'height': 4032} 
{'id': 8627, 'file_name': '008627.jpg', 'width': 3024, 'height': 4032} 
{'id': 8628, 'file_name': '008628.jpg', 'width': 3024, 'height': 4032} 
{'id': 8864, 'file_name': '008864.jpg', 'width': 3024, 'height': 4032} 
{'id': 49396, 'file_name': '049396.jpg', 'width': 3456, 'height': 4608} 
{'id': 50167, 'file_name': '050167.jpg', 'width': 1960, 'height': 4032} 
{'id': 53873, 'file_name': '053873.jpg', 'width': 3456, 'height': 4608} 
{'id': 53875, 'file_name': '053875.jpg', 'width': 3456, 'height': 4608} 
{'id': 53879, 'file_name': '053879.jpg', 'width': 3456, 'height': 4608} 
{'id': 56091, 'file_name': '056091.jpg', 'width': 3456, 'height': 4608}

(again note: this isn’t a comprehensive list of rotated images, as some rectangular images may be rotated 180 degrees, and square images aren’t detected)

  • Rectangular images with flipped width/height: 28
  • Estimated rotated rectangular images, allowing for 180 degree rotations that the flip check cannot detect: 42
  • Total rectangular images in training set: 22903
  • Estimated % of rectangular images that are rotated: 0.183%
  • Total images including squares: 54392
  • Estimated rotated images across the whole set at that rate: 99.7
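
For anyone who wants to check the arithmetic, the sketch below reproduces these figures. It rests on two assumptions that are mine rather than verified facts: that 90, 180 and 270 degree rotations are equally likely (so the 28 detected images are two thirds of the rotated rectangles), and that square images are rotated at the same rate as rectangular ones.

```python
detected_swapped = 28      # rectangular images with flipped width/height
total_rectangular = 22903  # rectangular images in the training set
total_images = 54392       # all training images, squares included

# Assumption: 90, 180 and 270 degree rotations are equally likely, and the
# flip check only reveals the 90 and 270 degree cases.
est_rotated_rectangular = detected_swapped * 3 / 2

# Estimated rotation rate among rectangular images.
rate = est_rotated_rectangular / total_rectangular

# Assumption: square images are rotated at the same rate.
est_rotated_total = rate * total_images

print(f"estimated rotated rectangular images: {est_rotated_rectangular:.0f}")
print(f"estimated rate: {rate:.3%}")
print(f"estimated rotated images overall: {est_rotated_total:.1f}")
```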

Hi @lapp09,

Can you check the dataset version you are using?

We improved the dataset pipeline to take care of such issues in the latest release, v2.1, which you can download and try out.

Example of above images from v2.1 annotations:

UPDATE: Looking into it.

Hi @lapp09,

Thanks for bringing this to our attention.

We verified against the images [containing exif] and found that the swap exists in the annotations for the following files [28 images]:

008617.jpg, 008618.jpg, 008619.jpg, 008620.jpg, 008621.jpg, 008626.jpg, 008627.jpg, 008628.jpg, 008817.jpg, 008864.jpg, 008869.jpg, 008919.jpg, 008934.jpg, 011967.jpg, 012045.jpg, 013989.jpg, 017312.jpg, 021923.jpg, 023295.jpg, 023296.jpg, 023975.jpg, 023976.jpg, 049396.jpg, 050167.jpg, 053873.jpg, 053875.jpg, 053879.jpg, 056091.jpg

NOTE: No changes were made to the images themselves.

We have fixed the annotations for them in the public_training_set_release_2.1.tar.gz.

In case you want to download the annotation files without the images, you can download public_training_annotations_only_2.1.tar.gz, which has now been added to the resources page.

Best,
Shivam

Thanks for addressing this.

As I mentioned in my comment, I don’t think this list of 28 is comprehensive and there are a ROUGHLY estimated 100 incorrect images (72 of which we cannot discover simply by searching for flipped HxW).

Perhaps at the end of this competition the dataset can be manually cleaned by taking the best performing model and comparing predictions to rotated real annotations. In cases where the predicted annotation better matches the rotated form, there can be human review.
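
A rough sketch of that comparison, using boxes instead of masks to keep it short: rotate the ground-truth box by 0, 90, 180 and 270 degrees (clockwise) and see which version overlaps the prediction best. It assumes boxes in [x, y, width, height] form with a top-left origin; the function names are hypothetical and this is not an existing tool.

```python
def rotate_box(box, width, height, degrees):
    """Rotate an [x, y, w, h] box clockwise inside a width x height image."""
    x, y, w, h = box
    if degrees == 0:
        return [x, y, w, h]
    if degrees == 90:
        return [height - (y + h), x, h, w]
    if degrees == 180:
        return [width - (x + w), height - (y + h), w, h]
    if degrees == 270:
        return [y, width - (x + w), h, w]
    raise ValueError("degrees must be one of 0, 90, 180, 270")


def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def best_rotation(pred_box, gt_box, width, height):
    """Return the rotation of gt_box that best matches pred_box.

    width and height describe the frame gt_box is expressed in.
    Ties resolve to the smallest rotation, so 0 wins when nothing is better.
    """
    return max(
        (0, 90, 180, 270),
        key=lambda d: iou(pred_box, rotate_box(gt_box, width, height, d)),
    )
```

Images where `best_rotation` returns something other than 0 for most of their annotations would be the candidates for human review.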

Hi @lapp09, ideally no square images should have this issue. :sweat_smile:

The step that went wrong in the width & height assignment was only used in our pipeline when width != height.

To clarify, you’ve reviewed the 28 images I noted and found that their masks are correct without any rotation? That’s great news!

Yes, the masks are correct in the original annotations file.

Here is a quick viz of those 28 images:
