🚨🚨🚨 Dataset update - `v0.2` released!

mohanty · March 31, 2022, 11:48pm

Hello everyone!

We have updated the dataset to fix the duplicate `query_id` issue and released it as `v0.2`.

Changes in the new version

- Additional `product_id` column in the Task 1 test set.
  - More information about the `product_id` column is available at: 😱 Why is there a `product_id` column in the Task 1 test set ? 😱
- All the CSV files except the sample submission are zipped, but you should still be able to load them directly in pandas. For example:
```
import pandas as pd
df = pd.read_csv("path/to/test_public-v0.2.csv.zip")
```
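If you are not using pandas, or you want to look inside an archive before loading it, here is a minimal sketch using only the Python standard library. The helper name `read_zipped_csv` is hypothetical, and the sketch assumes each archive contains exactly one UTF-8 encoded CSV file, which has not been confirmed for every file in the release.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_path):
    """Return the rows of the single CSV inside a zip archive as dicts.

    Hypothetical helper: assumes one UTF-8 CSV file per archive.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        # pandas.read_csv likewise expects exactly one data file per zip.
        if len(names) != 1:
            raise ValueError(f"expected one file in archive, found {names}")
        with archive.open(names[0]) as raw:
            # Wrap the binary stream so the csv module receives text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

With pandas, none of this is needed: `pd.read_csv` infers zip compression from the `.zip` file extension by default, and you can also pass `compression="zip"` explicitly.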

More details about the dataset are available in the updated post: 🚀 Datasets Released & Submissions Open 🚀

Best of Luck!

yrquni · April 4, 2022, 7:58pm

Excuse me, compared with v0.1, does v0.2 only update the Task 1 dataset, leaving the other tasks’ datasets unchanged?

mohanty · April 20, 2022, 5:04pm

@yrquni: Yes, that is correct. We do this to avoid confusion between different versions of the datasets floating around. The versioning system we use at the moment applies to the combined datasets of all the tasks.