The current public rankings for Task2 and Task3 confuse me. Since 30% of the data in the public-ranking test sets for Task2 and Task3 also appears in the Task1 training set, the current public scores for Task2 and Task3 are relatively high, which makes it hard for me to judge where submissions actually rank. Could the organizers provide a “Clean” score column for Task2 and Task3, computed by sampling, as is done for Task1? I’m sorry to ask such a question, but it’s important to me.
Thank you for pointing this out.
We have released a new version of this dataset, where this has been addressed.
More on this here: 🚀 Dataset Update: `v0.3` of the dataset has been released 🚀