Thanks for pointing this out. I’ll follow up with the participants.
Of course, in general we cannot prevent wrongly marked submissions from showing up on the leaderboard unless users tell us to remove them.
However, for the data-constrained leaderboards of both CDX (LB A) and MDX (LB A and B), we will collect the training code from the winners and reproduce the results, so any wrongly marked submissions will certainly be removed from the final leaderboards.
I just realized these are exactly the same scores as those of the Baseline XUMX-M model for leaderboard C. They have the same Phase 1 scores, but since the baseline was not run for Phase 2, it’s not immediately apparent.
@subatomicseer, I inspected the inference code, and the submissions you mentioned were indeed baseline models, which are not allowed for leaderboards B and C. Thank you for mentioning these.