v0.1 of the Task-1 test set only had the following columns:
In the v0.2 release, we introduced a separate column, product_id, and there has been some confusion as to why.
We understand, so let us explain!
The reason the v0.2 test set of Task 1 includes query-product pairs (each query together with a product_id) is that we only have esci_labels for those specific pairs, so those are the only pairs we can consider when computing the nDCG score. Pairs that do not appear in the test set will not be considered for computing the nDCG score, since we do not have the corresponding esci_labels.
This helps participants obtain a more meaningful nDCG score.
To clarify this further, let's imagine a situation where we only want the top three ranked products for the query included in the test set:
Version 0.1 dataset:

Test set:
query_1, "some query", us

Output:
query_1, product_11
query_1, product_7
query_1, product_9
If we do not have the esci_label for one of those three pairs (say, the query_1, product_7 pair), then we will omit that pair and compute the nDCG using only the query_1, product_11 and query_1, product_9 pairs.
Version 0.2 dataset:

Test set:
query_1, product_1, locale_us
query_1, product_3, locale_us
query_1, product_10, locale_us

Output:
query_1, product_10
query_1, product_1
query_1, product_3
query_1, product_39
In the output, we have all the pairs annotated except the query_1, product_39 pair.
But if the participants already know which query-product pairs have esci_labels available, then they should not have included the query_1, product_39 pair to begin with, as it is not included in the test set.
So in this case, we will compute the nDCG score using the 3 ranked products that were included in the test set, and ignore the query_1, product_39 pair, which was not.
In other words, the question to answer is how best to “order” the test set of Task-1 to optimize your nDCG score.
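
For anyone who wants to see the effect in code, here is a minimal sketch of the scoring idea described above. It is not the official evaluator. The gain values per label, the label names, the example labels for query_1, and the function names dcg and ndcg_for_query are all assumptions made up for illustration, and the exact nDCG formula used for the leaderboard may differ.

```python
import math

# Hypothetical gain per esci_label; these values are an assumption for illustration.
GAIN = {"exact": 1.0, "substitute": 0.1, "complement": 0.01, "irrelevant": 0.0}


def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_for_query(ranked_products, labels):
    """Score one query.

    ranked_products: product ids in the submitted order.
    labels: product_id -> esci_label for the pairs present in the test set.
    """
    # Pairs without an esci_label (not in the test set) are dropped before scoring.
    kept = [p for p in ranked_products if p in labels]
    gains = [GAIN[labels[p]] for p in kept]
    # The ideal ordering is built from the labeled test-set pairs only.
    ideal = sorted((GAIN[label] for label in labels.values()), reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Made-up labels for the three test-set pairs of query_1.
labels = {"product_1": "exact", "product_3": "substitute", "product_10": "irrelevant"}

with_extra = ["product_10", "product_1", "product_3", "product_39"]
without_extra = ["product_10", "product_1", "product_3"]

# product_39 has no label, so both submissions get the same score.
print(ndcg_for_query(with_extra, labels) == ndcg_for_query(without_extra, labels))
```

Under these assumptions, adding product_39 neither helps nor hurts; only the relative order of the three test-set products changes the score.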
Hope that clarifies the confusion.
If you have any more queries, please do not hesitate to reach out to us.