The test data matrix is now available, to ease some logistical hurdles.
IMPORTANT CONSIDERATIONS SHOULD YOU CHOOSE TO ACCESS THIS FILE:
- The leaderboard is meant to emulate real-world decision-making scenarios in which you would only have access to past information – i.e., information available immediately following your phase 2. Please keep this in mind if attempting to leverage the test data for any learning. Please justify any choices and demonstrate the generalizability of your model if leveraging additional data from the post-2016 test data set.
As with wrangling raw data, please consider: solutions and predictions from teams choosing to leverage additional data beyond the core training data set for the leaderboard methods competition will be subject to an additional layer of oversight, in both code review and model generalizability, to ensure no information has been leaked.
For now, you can find this file in your region’s respective /shared_data/data directory as test_data_full.
@kelleni2 Yes, it is difficult to make predictions without seeing the test data, because whatever data-wrangling steps we apply to the training data must also be applied to the test data. The test data you provided, with only 100 records, will not help.
I do not see any FULL test data set in the path you mentioned above. Can you please check again?
Also, please let us know the total number of records in the final test file you are sharing.
Thanks for your support.
Why on earth, again, do we have the super-complicated, cobbled-together AIcrowd submission setup if we could all just submit a CSV with row_id and predicted probability? I thought that whole mess was only necessary to avoid giving us the test data? Quite frankly, it’s bizarre to do a U-turn on this with 2 weeks to go.
I guess you did consider the unavoidable leakage that will result from everyone seeing such a small test dataset, but decided that was okay (maybe that’s not too bad, but many will immediately recognize a lot of the approved drugs without even having to look anything up).
We’d like to clarify: is it now acceptable to simply submit predictions instead of running our models on the evaluation server?
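For concreteness, the flat submission file being discussed would presumably look something like the sketch below: one row per test record, with a row_id and a predicted probability. The exact column names and precision are assumptions, since no schema was specified in the thread.

```python
import csv
import io

# Hypothetical (row_id, predicted probability) pairs from a model.
predictions = [(0, 0.12), (1, 0.87), (2, 0.50)]

# Build the CSV in memory; a real submission would write to a file instead.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["row_id", "predicted_probability"])  # assumed header names
for row_id, prob in predictions:
    writer.writerow([row_id, f"{prob:.6f}"])

submission_csv = buf.getvalue()
```

This is only meant to illustrate how simple the alternative format would be, which is the point the question above is making.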
Hi, the process is extremely useful in the longer run for several reasons. It guarantees the reproducibility of the results and the transparency needed. We also preserve your submissions as Docker images, which guarantees the code will keep running on the current or future datasets even if any of its dependencies disappear from the public internet.
It would help to go back to the underlying motivation:
- We wanted to reduce the chance of visibly fooling ourselves with top solutions that include leaked information, rendering them irrelevant for real-world decision making.
- We wanted all top solutions to be re-runnable by the evaluation & project team, and open to interrogation for generalizability, etc. By design, the Kubernetes cluster and git combo enables this.
That said, we also want the best solutions possible for the larger initiative at the end of the event - which is why we have been trying to ease some of the frustrations that were blocking some teams.
I discussed with the team, and we would highly encourage you to continue predicting on the original test data in the evaluation clusters rather than providing a table of predictions, especially for the final solution.
However, do what you feel you need to do as a team in order to come up with your optimal solution. But keep in mind that the final leaderboard will change when we add in the hold-out test data, and winners will need their models to be validated by the evaluation team, so please make it clear how one would load and interrogate your model.
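As a minimal illustration of "making it clear how one would load and interrogate your model", a team might persist the fitted model and document the reload path. The sketch below uses the stdlib pickle module and a stand-in model class; real teams would likely use their actual framework's serialization (joblib, ONNX, etc.), so everything here is an assumption for illustration only.

```python
import pickle

class TinyModel:
    """Hypothetical stand-in for a fitted model, exposing its parameters
    and a prediction method so a reviewer can interrogate it."""
    def __init__(self, threshold):
        self.threshold = threshold

    def predict_proba(self, x):
        # Toy decision rule: probability 1.0 at or above the threshold.
        return 1.0 if x >= self.threshold else 0.0

# Team side: fit (here, just construct) and serialize the model.
# In a real submission this blob would be a file tracked in the git repo.
model = TinyModel(threshold=0.5)
blob = pickle.dumps(model)

# Reviewer side: reload and interrogate the model's parameters and behavior.
restored = pickle.loads(blob)
```

The point is simply that validation is easiest when the serialized artifact, the loading code, and the inspection entry points are all spelled out in the submission's README.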
Please always feel free to reach out to me if you would like to discuss.