Questions about the competition timeline

I have two questions about the timeline:

  1. When will the competition end? Is it Oct 19 or Oct 31?

  2. When can we test the generalization performance instead of sample efficiency?



It’s odd that there is no answer from the organizers, with less than 24h remaining until the competition deadline. I can only assume it is not the actual deadline, and round 2 will be extended?


Hello @jurgisp @Feiyang

The competition will run until Oct 31. For the last 7 days, the submission quota will be reduced to 2 submissions per day, and we will use on-demand instances for evaluation instead of spot instances. We will make an announcement once the change is done.


Hello @jyotish … Will there be a generalization track? If yes, can we select which submissions to use for generalization?


Hello @dipam_chakraborty

The generalization track will start once round 2 ends. You will be given a choice to pick a submission. We will make an announcement regarding this in a few days.

Hey @jyotish, does the submission we pick have to be something we submitted previously, or can it be a new submission made after round 2 ends? It doesn’t really make sense to pick something we previously submitted, because those submissions were tuned for sample efficiency only.


@jyotish can you confirm that the submission to be used for the generalization track needs to be made before round 2 ends?

Hello @quang_tran @jurgisp

We want to optimize both sample efficiency and generalization. So, we would expect an existing solution to perform well in terms of generalization as well. You can choose the submission that will be used for the final evaluation. The same submission will be used for evaluating sample efficiency (8M steps on all environments) and generalization (limiting the levels to 200).
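For reference, the two evaluation settings described above map naturally onto Procgen's standard environment arguments, where `num_levels=0` means an unbounded level distribution and `num_levels=200` restricts training to a fixed set of 200 levels. The sketch below is an assumption about how one might express these configs locally; the helper `make_env_kwargs` and the mode names are hypothetical, not the organizers' actual evaluation code:

```python
# Hypothetical sketch of the two evaluation settings, using kwarg names
# from the public procgen package (num_levels, start_level).
# num_levels=0 => sample from unlimited levels; num_levels=200 => fixed set.
SAMPLE_EFFICIENCY = dict(num_levels=0, start_level=0)
GENERALIZATION = dict(num_levels=200, start_level=0)

def make_env_kwargs(env_name, mode):
    """Build gym.make kwargs for a Procgen env under the given mode."""
    cfg = SAMPLE_EFFICIENCY if mode == "sample_efficiency" else GENERALIZATION
    return {"id": f"procgen:procgen-{env_name}-v0", **cfg}

# Example: make_env_kwargs("coinrun", "generalization")
# -> {'id': 'procgen:procgen-coinrun-v0', 'num_levels': 200, 'start_level': 0}
```

With a single selected submission, the same agent would be trained for 8M steps under each of these configurations for the two leaderboards.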


Hi @jyotish, thanks for the clarification! I was under the impression that we could choose different submissions for sample efficiency and for generalization, but if it has to be a single best submission evaluated on both, that’s clear.


Hello @jyotish

Thanks for the clarification; however, this raises further questions. I think there is a trade-off between sample efficiency and generalization toward the end of training, which means that with one submission we can score high on sample efficiency but poorly on generalization, or improve generalization at the cost of sample efficiency.

So the scenarios are:

  1. There are two tracks, two env configs during training, and two separate scoring metrics.

  2. There is one track, one env config during training, and one joint scoring metric.

Please clarify which of the above is the case.

If there are two tracks (and two env configs during training), the selected submission can either be near the top of the sample efficiency leaderboard but low on generalization, or sit in the middle of both leaderboards. Otherwise, if there is a joint metric, we’d like to test it locally. We’ll plan our submissions accordingly.


Dipam is correct. There is a trade-off between the two tracks, and solutions specialized for each track do perform better on that track.
I also agree with the two choices for scoring. I would be against squashing both tracks into one.
During the competition I mostly relied on the AIcrowd platform for evaluating solutions, and seldom trained them locally. As such, I did not optimize for the generalization track.