Round 2 is open for submissions 🚀

jyotish · September 23, 2020, 9:29am

Hello all!

Thank you for your participation and enthusiasm during round 1! We are accepting submissions for round 2 for the Top 50 teams from Round 1.

Hardware available for evaluations

The evaluations will run on the following hardware (AWS SageMaker ml.p3.2xlarge instances):

Resources
vCPUs	8
RAM	60 GB
GPU	16 GB Tesla V100

Note: The training and rollouts for each environment will run on a separate node.

Evaluation configuration

The configuration used during evaluations is available at FAQ: Round 1 evaluations configuration.

Environments

This round will run on six public (coinrun, bigfish, miner, plunder, starpilot, chaser) and four private environments.

Scoring

The final score will be a weighted average of the mean normalized rewards, with the six public environments and the four private environments each contributing half of the total:

Score = \frac{1}{12}*R_{coinrun} + \frac{1}{12}*R_{bigfish} + \frac{1}{12}*R_{miner} + \frac{1}{12}*R_{chaser} + \frac{1}{12}*R_{starpilot} \\ + \frac{1}{12}*R_{plunder} + \frac{1}{8}*R_{privateEnv1} + \frac{1}{8}*R_{privateEnv2} + \frac{1}{8}*R_{privateEnv3} + \frac{1}{8}*R_{privateEnv4}

R_{env} = Mean normalized reward for env.
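As a rough illustration of the formula above, the weighted sum can be sketched in Python. The function name `final_score`, the placeholder keys `privateEnv1` to `privateEnv4`, and the example values are assumptions made for this sketch and are not part of the official evaluator.

```python
# Weights follow the scoring formula above: 1/12 per public env, 1/8 per private env.
PUBLIC_ENVS = ["coinrun", "bigfish", "miner", "chaser", "starpilot", "plunder"]
# Placeholder names (assumption): the formula only labels these privateEnv1..4.
PRIVATE_ENVS = ["privateEnv1", "privateEnv2", "privateEnv3", "privateEnv4"]


def final_score(mean_normalized_rewards):
    """Weighted sum of per-environment mean normalized rewards."""
    public = sum(mean_normalized_rewards[env] for env in PUBLIC_ENVS) / 12
    private = sum(mean_normalized_rewards[env] for env in PRIVATE_ENVS) / 8
    return public + private


# Hypothetical values, for illustration only.
example = {env: 0.5 for env in PUBLIC_ENVS + PRIVATE_ENVS}
print(final_score(example))
```

Because the weights sum to 1, an agent with a mean normalized reward of 1 on every environment would get a final score of 1.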

shivam · September 23, 2020, 9:38am

karolisram · September 23, 2020, 9:42am

Will we be able to choose which submission to use for the final 16+4 evaluation? It might be the case that our best solution that was tested locally on 16 envs is not the same as the best one for the 6+4 envs on public LB.

shivam · September 23, 2020, 9:45am

Hi @karolisram,

We were thinking about picking top 3 submissions, but I think it makes sense to let participants pick 3 submissions themselves.

We will circulate a Google Form at the end of Round 2 for this (the top 3 will be picked by default if a participant doesn’t fill it in).

Cheers,
Shivam

karolisram · September 23, 2020, 9:56am

Sounds good, thanks @shivam. Could you please also give us the normalization factors (Rmin, Rmax) for the 4 private envs?

shivam · September 23, 2020, 9:58am

It has been updated in the starter kit, along with a few more changes for Round 2 (old forks will continue to work), here:

github.com

AIcrowd/neurips2020-procgen-starter-kit/blob/master/aicrowd_helpers/config.py


# Each entry is [return_min, return_blind, return_max] for that environment.
EASY_GAME_RANGES = {
    'coinrun': [0, 5, 10],
    'starpilot': [0, 2.5, 64],
    'caveflyer': [0, 3.5, 12],
    'dodgeball': [0, 1.5, 19],
    'fruitbot': [-12, -1.5, 32.4],
    'chaser': [0, .5, 13],
    'miner': [0, 1.5, 13],
    'jumper': [0, 1, 10],
    'leaper': [0, 1.5, 10],
    'maze': [0, 5, 10],
    'bigfish': [0, 1, 40],
    'heist': [0, 3.5, 10],
    'climber': [0, 2, 12.6],
    'plunder': [0, 4.5, 30],
    'ninja': [0, 3.5, 10],
    'bossfight': [0, .5, 13],
}

jyotish · September 23, 2020, 10:21am

Hello @karolisram

As @shivam shared, the [return_min, return_blind, return_max] values for the public envs are available in the starter kit. Please refer to “Min and Max rewards for an environment” for information on how to use them in your code.
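For anyone reading the ranges in their own code, here is a minimal sketch of unpacking one entry into named fields. The `RewardRange` tuple and the `reward_range` helper are hypothetical names, not part of the starter kit, and only three entries of `EASY_GAME_RANGES` are copied here to keep the example short.

```python
from collections import namedtuple

# Hypothetical helper type: names the three positions of each entry.
RewardRange = namedtuple("RewardRange", ["return_min", "return_blind", "return_max"])

# Three entries copied from EASY_GAME_RANGES in the starter kit config.
EASY_GAME_RANGES = {
    "coinrun": [0, 5, 10],
    "bigfish": [0, 1, 40],
    "miner": [0, 1.5, 13],
}


def reward_range(env_name):
    """Return the range for env_name with named fields instead of list indices."""
    return RewardRange(*EASY_GAME_RANGES[env_name])


r = reward_range("bigfish")
print(r.return_min, r.return_blind, r.return_max)
```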

jurgisp · September 23, 2020, 10:52am

What about the generalization benchmark? Are we training with unlimited or only 200 episode seeds now?

jyotish · September 23, 2020, 10:58am

Hello @jurgisp

In round 2, we will train on unlimited levels. Towards the end of round 2, we will run two tracks, one for generalization and one for sample efficiency on all 16 public envs and 4 private envs.

the_raven_chaser · September 24, 2020, 12:04am

Hi @jyotish

What’s the use of the blind reward?

Paseul · September 24, 2020, 12:11am

Hi @jyotish
Can you tell me the score for the private environment? I need a score for gemjourney, hovercraft, safezone.

dipam_chakraborty · September 24, 2020, 4:04pm

@Paseul I think the normalization ranges for the private envs can be obtained if you log the “return max” and “return blind” as custom metrics. I haven’t tried it though.

maraoz · September 29, 2020, 5:49pm

Hi! I didn’t receive any emails or notifications about Round 2 starting, does that mean I’m out of the competition?
update: I seem able to make submissions still.

jyotish · September 29, 2020, 6:30pm

Hello @Paseul

These are the [return_blind, return_max] for the private envs.

'caterpillar': [8.25, 24],
'gemjourney': [1.1, 16],
'hovercraft': [0.2, 18],
'safezone': [0.2, 10],

The return_min for all four envs is 0. Please note that return_blind, not return_min, is used as the minimum reward when normalizing the rewards.
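A minimal sketch of what that normalization could look like for the private envs, assuming the usual linear min-max form with return_blind as the lower bound. The function name `normalize_return` is hypothetical, and whether the evaluator clips the result is not stated in this thread, so no clipping is applied here.

```python
# [return_blind, return_max] for the private envs, as posted above.
PRIVATE_ENV_RANGES = {
    "caterpillar": [8.25, 24],
    "gemjourney": [1.1, 16],
    "hovercraft": [0.2, 18],
    "safezone": [0.2, 10],
}


def normalize_return(env_name, mean_return):
    """Linear min-max normalization with return_blind as the minimum (assumed form)."""
    return_blind, return_max = PRIVATE_ENV_RANGES[env_name]
    return (mean_return - return_blind) / (return_max - return_blind)


# A mean return equal to return_blind maps to 0, and return_max maps to 1.
print(normalize_return("caterpillar", 8.25))
print(normalize_return("caterpillar", 24))
```

Under this form, an agent that does no better than the blind baseline scores 0 on that env, and anything below the baseline comes out negative.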

patrick_macalpine · October 11, 2020, 8:11am

Can you give any more details on when the generalization track evaluations will start? Thanks.