The following values will be set during the evaluations. Any changes that you make to these parameters will be dropped and replaced with the default values during the evaluations.
stop:
    timesteps_total: 8000000
    time_total_s: 7200
checkpoint_freq: 25
checkpoint_at_end: True
env_config:
    env_name: <accordingly>
    num_levels: 0
    start_level: 0
    paint_vel_info: False
    use_generated_assets: False
    distribution_mode: easy
    center_agent: True
    use_sequential_levels: False
    use_backgrounds: True
    restrict_themes: False
    use_monochrome_assets: False
# We use this to generate the videos during training
evaluation_interval: 25
evaluation_num_workers: 1
evaluation_num_episodes: 3
evaluation_config:
    num_envs_per_worker: 1
    env_config:
        render_mode: rgb_array
During the rollouts, we will also pass a rand_seed to the procgen env.
Can we pass additional custom parameters in env_config (for example, the number of frames to stack together, like stack_frames)? I assume you are describing an update operation where the official parameters above are replaced with the default values, while any user-defined additional parameters are preserved.
Yes, you are free to pass additional parameters/flags in env_config. The only requirement is that the base env used by your gym wrapper should be ProcgenEnvWrapper provided in the starter kit.
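As an illustration, the frame-stacking logic such a custom parameter might drive could look like the sketch below. This is only the stacking logic itself, not the starter kit's wrapper API: the name stack_frames is the hypothetical parameter from the question, and in practice you would implement this inside a gym wrapper whose base env is ProcgenEnvWrapper.

```python
from collections import deque

import numpy as np


class FrameStacker:
    """Keeps the last `stack_frames` observations and concatenates them
    along the channel axis. Hypothetical sketch of what a custom gym
    wrapper around ProcgenEnvWrapper could do with a user-defined
    env_config parameter."""

    def __init__(self, stack_frames=4):
        self.stack_frames = stack_frames
        self.frames = deque(maxlen=stack_frames)

    def reset(self, first_obs):
        # On reset, fill the buffer with copies of the first observation
        # so the stacked shape is constant from the very first step.
        self.frames.clear()
        for _ in range(self.stack_frames):
            self.frames.append(first_obs)
        return self._stacked()

    def observe(self, obs):
        # Called once per env step; the deque drops the oldest frame.
        self.frames.append(obs)
        return self._stacked()

    def _stacked(self):
        # Procgen observations are HxWxC; stack along the channel axis,
        # e.g. four (64, 64, 3) frames become one (64, 64, 12) array.
        return np.concatenate(self.frames, axis=-1)
```

With stack_frames: 4 in env_config, a (64, 64, 3) Procgen observation would come out of the wrapper as (64, 64, 12).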
The discussion in the RLlib custom env thread might be useful to clear things up.
The objective of the competition is to measure the sample efficiency and generalization of the RL algorithms. Changing any of the environment config flags is not relevant to the competition.
The default runtime is as defined in this Dockerfile, which is basically the nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 Docker image along with the dependencies listed in requirements.txt.
Yes, my question is: if you want to change these parameters, such as the env name, do you change the experiment config file specified in run.sh, or use some other method?
So you will first scan the run.sh file and then modify the experiment config file specified by the EXPERIMENT_DEFAULT variable. Do I understand correctly?
Yes. We get the experiment file to use from run.sh and override the necessary values in the experiment yaml file and start the training/rollouts.
You only need to update the experiment file to use in run.sh. Any other change goes in the respective experiment yaml file. For example, for the env name, it will be here:
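To make that concrete, the relevant section of the experiment yaml looks roughly like the fragment below (the file path is illustrative; use whichever file the EXPERIMENT_DEFAULT variable in your run.sh points to):

```yaml
# experiments/<your-experiment>.yaml  (illustrative path)
config:
  env_config:
    env_name: coinrun   # change this to the procgen env you want to train on
```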
If you’re calculating GPU RAM constraints, bear in mind that the evaluation configuration spins up an additional video-rendering worker on top of the trainer and rollout workers! It cost me ~5 submissions and a question to @jyotish to realize this ^^
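Back-of-the-envelope, that means budgeting for one more process than the trainer-plus-rollout-workers count suggests. The arithmetic below is illustrative, not an official formula from the starter kit; the rollout worker count is a hypothetical value from your own experiment yaml, and the evaluation worker count matches the config above.

```python
# Rough process count once evaluation kicks in (illustrative only).
trainer_processes = 1          # the driver/trainer itself
num_rollout_workers = 6        # hypothetical num_workers from your yaml
evaluation_num_workers = 1     # from the config above; renders the videos

# Each of these processes may hold its own copy of the model, so GPU RAM
# headroom should be divided by this total, not just trainer + rollouts.
total_processes = trainer_processes + num_rollout_workers + evaluation_num_workers
print(total_processes)
```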
Dear organisers,
Is it possible to remove the Ray dependency in the second round?
From my perspective, Ray looks like too heavy a solution for this task.
Do you have any pure TF/PyTorch baselines to start from?