According to your description, this dataset covers 14 domains.
Specifically, CRAG-MM features a diverse collection of 5k images, including 3k egocentric ones captured by RayBan Meta smart glasses, covering 14 domains and reflecting real-world challenges associated with handling egocentric images.
However, we found that this dataset has only 12 domain labels.
Moreover, two of them are 'plants and gardening' (one with a trailing space), so only 11 labels are distinct.
from datasets import load_dataset

ds = load_dataset("crag-mm-2025/crag-mm-single-turn-public")
valid_data = ds['validation']
# Print the class-label names of the per-turn "domain" feature
print(valid_data.features["turns"][0]["domain"].names)
> 'animal', 'shopping', 'plants and gardening', 'general object recognition', 'plants and gardening ', 'math and science', 'vehicle', 'other', 'text understanding', 'food', 'local', 'book'
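As a quick sanity check on the count, the labels printed above can be deduplicated after stripping whitespace. This is only a sketch: it hard-codes the printed list instead of loading the dataset, and the variable names `names` and `unique_names` are illustrative.

```python
# Label names copied from the output above (one entry has a trailing space).
names = [
    "animal", "shopping", "plants and gardening", "general object recognition",
    "plants and gardening ", "math and science", "vehicle", "other",
    "text understanding", "food", "local", "book",
]

# Strip surrounding whitespace, then keep the first occurrence of each label in order.
unique_names = list(dict.fromkeys(name.strip() for name in names))

print(len(names))         # 12 raw labels
print(len(unique_names))  # 11 distinct labels after stripping
```

Either way, the result is fewer than the 14 domains stated in the description.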
Could you please tell us which number of domains is correct?