[Update] Round 2 of Data Purchasing Challenge is now live!

Hi @dipam, is there any update about this?
Or please let me know if you have already decided to stick with the current training pipeline. In that case, I’ll try to optimize my purchase strategy for it.

Hi @tfriedel

Seems I missed your message. Regarding the point about ruling out the most useful images, do you feel it’s a huge issue with the current training pipeline?

We do want the solutions to be as agnostic to the training pipeline as possible while buying the best images possible. Yes, it’s not completely possible to make things training-agnostic, but that is the spirit of the competition we’d like to promote. If you’re finding that you’re deliberately having to remove too many of the useful images, please let me know.

The question is how you define the best or most useful images. If it means the best for improving an effnet-b4 trained for 10 epochs (which I suspect is underfitting), the current scheme makes sense.
But in practice, I guess people would decide to add data only after trying to improve the model with the current data and finding that the performance still falls short of what they expected.
So my definition of “useful” here is “useful for improving the performance of a sufficiently well fine-tuned model”. And I suspect the current post-training pipeline doesn’t reach that level, IMHO.


Yes, what you say makes sense. On the other hand, a very strong model was getting scores nearly as good as the “all label purchase” scores with just random purchases, so the dataset needed more difficulty. Important lessons learnt. In any case, I agree with your definition of useful. For now, we’ve come up with the end-of-competition evaluation scheme. Please check this recent post.
