🚨🚨🚨 Dataset update - `v0.2` released!

Hello everyone!

We updated the dataset to address the duplicate query_id issue and released it as v0.2.


Changes in the new version

  • Additional product_id column in task 1 test set.
  • All the CSV files except the sample submission are now zipped. You should still be able to load them directly in pandas, for example:
    import pandas as pd
    df = pd.read_csv("path/to/test_public-v0.2.csv.zip")
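
To double-check that a downloaded file really is the new version, here is a minimal sketch that loads a zipped CSV and reports which expected columns are missing. The helper names `load_zipped_csv` and `missing_columns` are made up for illustration and are not part of the dataset tooling; the sketch also assumes each zip archive contains a single CSV file, which pandas requires when reading a zip directly.

```python
import pandas as pd


def load_zipped_csv(path):
    # pandas infers zip compression from the ".zip" extension by default.
    # Assumption: the archive holds exactly one CSV file.
    return pd.read_csv(path)


def missing_columns(df, expected):
    # Return the expected column names that are absent from the frame.
    return [name for name in expected if name not in df.columns]


# Example usage with the placeholder path from the post:
# df = load_zipped_csv("path/to/test_public-v0.2.csv.zip")
# print(missing_columns(df, ["query_id", "product_id"]))
```

An empty list from `missing_columns` means every expected column, including the new product_id, is present.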
    

More details about the dataset are available in the updated post: 🚀 Datasets Released & Submissions Open 🚀

Best of Luck!

Excuse me, relative to v0.1, does v0.2 update only the Task 1 dataset, without making changes to the other tasks' datasets?

@yrquni: Yes, that is correct. The versioning system we use at the moment applies to the combined datasets of all the tasks, which avoids confusion from different versions of the datasets floating around.