I’ve seen several posts suggesting changes to the purchase-phase training pipeline. This post clarifies some details about the end-of-competition evaluations and explains some of their implications.
First of all, thanks a lot to everyone who has provided feedback on the training pipeline. It has been very valuable for the challenge design.
TL;DR - Please submit your best purchase strategies; you’ll need to select them for the end-of-competition evaluations, which will run 5 post-purchase training pipelines.
Many have pointed out that the training pipeline does not produce good scores. I’ll break these concerns down into a few categories.
Concerns about the best purchases not being incentivised correctly
- Model is too weak and cannot learn hard examples - This is a legitimate concern, though I do not currently know how widespread it is. We’re investigating.
- Not using GaussianBlur in testing changes the optimal labels - Currently I have not found any strong evidence of this, but would love to discuss further.
- Model does not converge due to the low epoch count, hence scores are too stochastic - I did not find the stochasticity in scores to be excessive; it seems within expected practical limits. However, I’ll address this further later in this post.
These are the most important concerns we’re looking to resolve. Some of them are also difficult to measure, so the discussion becomes qualitative and opinionated. We want to resolve them in the most quantitative and fair way possible.
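As a quantitative check of the stochasticity concern, the spread of scores across seeds can be measured directly. A minimal sketch, with made-up illustrative numbers (not real pipeline results):

```python
# Illustrative only: these scores are invented numbers, not actual
# pipeline results from the challenge.
from statistics import mean, stdev

seed_scores = [0.712, 0.705, 0.719, 0.708, 0.715]  # one pipeline, five seeds

# A small standard deviation relative to the gaps between leaderboard
# entries would suggest the stochasticity is within practical limits.
print(f"mean={mean(seed_scores):.3f}, stdev={stdev(seed_scores):.3f}")
```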
Concerns that the score is too low
- The feature layers are frozen
- The model is too small and plateaus
- GaussianBlur is not used in test
For these, I’d like to reiterate that the absolute score is not important; rather, maximising the score through the best purchase strategy is the goal of the competition.
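To make the GaussianBlur asymmetry concrete, here is a minimal sketch of a train/test transform split in the style of torchvision. The kernel size and sigma range are illustrative assumptions, not the challenge’s actual parameters:

```python
# Illustrative only: kernel_size and sigma below are assumptions,
# not the official pipeline's values.
from torchvision import transforms

train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),  # blur in training
    transforms.ToTensor(),
])

test_tf = transforms.Compose([
    # No GaussianBlur at test time: this is the train/test asymmetry
    # participants have raised.
    transforms.ToTensor(),
])
```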
End of competition Evaluations
In addition to changing the dataset for the end-of-competition evaluations, we will run selected submissions through multiple training pipelines in the post-purchase phase.
The detailed steps are given below:
- Eligible teams will select two of their submissions to evaluate - Eligibility criteria will be announced soon and will be based on the Round 2 leaderboard.
- Each submission will run through the pre-train and the purchase phase on the end of competition dataset.
- The same purchased labels will be put through 5 training pipelines - Details to be released soon.
- Each training pipeline will be run for 2 seeds and scores averaged, to address any stochasticity in scores.
- To avoid issues due to differences in average scores across the training pipelines, a Borda ranking system will be used.
We hope this will incentivize participants to select the best purchase strategies rather than optimize for the current training pipeline. We’re unable to run this setup during the live round due to the prohibitive compute cost of each run.
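For concreteness, a Borda-style aggregation over per-pipeline rankings can be sketched as follows. This is a minimal illustration; the function name, point scheme, and scores are my assumptions, not the official implementation:

```python
# Hypothetical sketch of Borda aggregation: names and numbers are
# illustrative, not the challenge's actual scoring code.

def borda_points(scores_by_pipeline):
    """scores_by_pipeline: list of {team: seed-averaged score}, one dict
    per training pipeline. Returns total Borda points per team (higher
    is better); ranking within each pipeline removes the effect of
    pipelines having different average score levels."""
    points = {}
    for scores in scores_by_pipeline:
        # Rank teams within this pipeline; the best team in a pipeline
        # of n teams earns n-1 points, the worst earns 0.
        ranked = sorted(scores, key=scores.get, reverse=True)
        n = len(ranked)
        for rank, team in enumerate(ranked):
            points[team] = points.get(team, 0) + (n - 1 - rank)
    return points

pipelines = [
    {"A": 0.91, "B": 0.88, "C": 0.85},  # pipeline 1 (seed-averaged)
    {"A": 0.70, "B": 0.74, "C": 0.69},  # pipeline 2 (seed-averaged)
]
print(borda_points(pipelines))  # A: 2+1=3, B: 1+2=3, C: 0+0=0
```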
Training pipeline survey
Please vote here for your favoured schemes. Note that this is not a vote to select the training pipelines, just a survey of participants’ preferences, but we’ll take the results very seriously.
- Unfreeze feature layers in base model
- Use Gaussian Blur During Test
- Train for more epochs
- Use a bigger model - EfficientNet-B7
- Train more epochs + Unfreeze feature layers in base model
- Train more epochs + Use Gaussian Blur During Test
- Remove Gaussian Blur from training
- Unfreeze feature layers in base model + Remove Gaussian Blur from training
Please feel free to suggest any other changes if I have missed them.