The v0.1 release of the Task-1 test set had only the following columns: query_id, query, query_locale.
In the v0.2 release, we introduced an additional column, product_id, and there has been some confusion as to why!
We understand, so let us explain!
The v0.2 test set of Task 1 includes pairs of query_id and product_id because we only have the esci_labels for those specific pairs, so those are the only pairs we can consider when computing the nDCG score. Any other query_id and product_id pair that does not appear in the test set will not be considered for computing the nDCG score, since we do not have the corresponding esci_label.
This helps participants obtain a more meaningful nDCG score.
To clarify this further, let's imagine a situation where we only want the top three ranked products for the query included in the test set:
Version 0.1 dataset:
Input:
query_1, "some query", us
Output:
query_1, product_11
query_1, product_7
query_1, product_9
If we do not have the esci_label for one of those three pairs (say, the query_1, product_7 pair), then we omit that pair and compute the nDCG using only the query_1, product_11 and query_1, product_9 pairs.
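As a rough illustration of the omission step described above, here is a minimal Python sketch. The set `labelled_pairs` and the helper `keep_labelled` are hypothetical names made up for this example, and this is not the actual evaluation code; it only shows that unlabelled pairs are dropped while the submitted order of the remaining pairs is preserved.

```python
# Hypothetical set of (query_id, product_id) pairs that have an esci_label.
labelled_pairs = {("query_1", "product_11"), ("query_1", "product_9")}

# The ranking submitted for query_1, best first (from the example above).
submission = [
    ("query_1", "product_11"),
    ("query_1", "product_7"),   # no esci_label available for this pair
    ("query_1", "product_9"),
]


def keep_labelled(ranking, labelled):
    """Drop pairs without a label, preserving the submitted order."""
    return [pair for pair in ranking if pair in labelled]


# Only these pairs would contribute to the nDCG score.
scored = keep_labelled(submission, labelled_pairs)
```

With v0.1, participants had no way of knowing in advance which of their ranked pairs would be dropped like this.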
Version 0.2 dataset:
Input:
query_1, product_1, locale_us
query_1, product_3, locale_us
query_1, product_10, locale_us
Output:
query_1, product_10
query_1, product_1
query_1, product_3
query_1, product_39
In the output, all the pairs are annotated except the query_1, product_39 pair.
But since the participants already know which query-product pairs have esci_labels available, they should not have included the query_1, product_39 pair to begin with, as it is not included in the test set.
So in this case, we will compute the nDCG score considering the 3 ranked products that were included in the test set, and not the query_1, product_39 pair which was not included in the test set.
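To make this concrete, here is a small Python sketch of an nDCG computation that ignores pairs outside the test set. Please treat it as an illustration under stated assumptions rather than the official scorer: the gain values in `GAINS`, the labels assigned to the three pairs, and the function names are all made up for this example.

```python
import math

# Assumed gain per esci_label, for illustration only.
GAINS = {"exact": 1.0, "substitute": 0.1, "complement": 0.01, "irrelevant": 0.0}

# Hypothetical labels for the three test-set pairs of query_1.
labels = {
    ("query_1", "product_1"): "exact",
    ("query_1", "product_3"): "irrelevant",
    ("query_1", "product_10"): "substitute",
}


def dcg(gains):
    """Discounted cumulative gain of a list of gains, best rank first."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg(ranking, labels):
    """nDCG for one query, ignoring pairs that have no label."""
    kept = [pair for pair in ranking if pair in labels]
    gains = [GAINS[labels[pair]] for pair in kept]
    ideal = sorted((GAINS[label] for label in labels.values()), reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# The output from the example above, including the unlabelled product_39.
submission = [
    ("query_1", "product_10"),
    ("query_1", "product_1"),
    ("query_1", "product_3"),
    ("query_1", "product_39"),
]
score = ndcg(submission, labels)
```

In this sketch, removing the query_1, product_39 pair from `submission` leaves `score` unchanged, so the only thing that affects the result is how the three test-set pairs are ordered.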
In other words, the question to answer is: how best to “order” the test set of Task-1 to optimize your nDCG score?
Hope that clarifies the confusion.
If you have any more questions, please do not hesitate to reach out to us.
Best,
Mohanty