The following values will be set during the evaluations. Any changes that you make to these parameters will be dropped and replaced with the default values during the evaluations.
stop:
    timesteps_total: 8000000
    time_total_s: 7200
checkpoint_freq: 25
checkpoint_at_end: True
env_config:
    env_name: <accordingly>
    num_levels: 0
    start_level: 0
    paint_vel_info: False
    use_generated_assets: False
    distribution_mode: easy
    center_agent: True
    use_sequential_levels: False
    use_backgrounds: True
    restrict_themes: False
    use_monochrome_assets: False
# We use this to generate the videos during training
evaluation_interval: 25
evaluation_num_workers: 1
evaluation_num_episodes: 3
evaluation_config:
    num_envs_per_worker: 1
    env_config:
        render_mode: rgb_array
During the rollouts, we will also pass a rand_seed to the procgen env.
Can we pass additional custom parameters in env_config (for example, the number of frames to stack together, like stack_frames)? I assume you are describing an update operation where the official parameters above are replaced with the default values, while any user-defined additional parameters are preserved.
Yes, you are free to pass additional parameters/flags in env_config. The only requirement is that the base env used by your gym wrapper should be ProcgenEnvWrapper provided in the starter kit.
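As an illustration, the frame-stacking logic such a custom parameter might drive could look like the sketch below. This is only the stacking logic itself, not the starter kit's wrapper API: the name stack_frames is the hypothetical parameter from the question, and in practice you would implement this inside a gym wrapper whose base env is ProcgenEnvWrapper.

```python
from collections import deque

import numpy as np


class FrameStacker:
    """Keeps the last `stack_frames` observations and concatenates them
    along the channel axis. Hypothetical sketch of what a custom gym
    wrapper around ProcgenEnvWrapper could do with a user-defined
    env_config parameter."""

    def __init__(self, stack_frames=4):
        self.stack_frames = stack_frames
        self.frames = deque(maxlen=stack_frames)

    def reset(self, first_obs):
        # On reset, fill the buffer with copies of the first observation
        # so the stacked shape is constant from the very first step.
        self.frames.clear()
        for _ in range(self.stack_frames):
            self.frames.append(first_obs)
        return self._stacked()

    def observe(self, obs):
        # Called once per env step; the deque drops the oldest frame.
        self.frames.append(obs)
        return self._stacked()

    def _stacked(self):
        # Procgen observations are HxWxC; stack along the channel axis,
        # e.g. four (64, 64, 3) frames become one (64, 64, 12) array.
        return np.concatenate(self.frames, axis=-1)
```

With stack_frames: 4 in env_config, a (64, 64, 3) Procgen observation would come out of the wrapper as (64, 64, 12).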
The discussion in the RLlib custom env thread might be useful to clear things up.
The objective of the competition is to measure the sample efficiency and generalization of the RL algorithms. Changing any of the environment config flags is not relevant to the competition.
The default runtime is as defined in this Dockerfile, which is basically the nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 Docker image along with the dependencies listed in requirements.txt.
Yes, my question is: if you want to change these parameters, such as the env name, do you change the experiment config file specified in run.sh, or use some other method?
So you will first scan the run.sh file and then modify the experiment config file specified by the EXPERIMENT_DEFAULT variable. Do I understand correctly?
Yes. We get the experiment file to use from run.sh and override the necessary values in the experiment yaml file and start the training/rollouts.
You only need to update the experiment file to use in run.sh. Any other change goes in the respective experiment yaml file. For example, for the env name, it will be here:
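To make that concrete, the relevant section of the experiment yaml looks roughly like the fragment below (the file path is illustrative; use whichever file the EXPERIMENT_DEFAULT variable in your run.sh points to):

```yaml
# experiments/<your-experiment>.yaml  (illustrative path)
config:
  env_config:
    env_name: coinrun   # change this to the procgen env you want to train on
```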
If you’re calculating GPU RAM constraints, bear in mind that the evaluation configuration spins up an additional video-rendering worker on top of the trainer and rollout workers! It cost me ~5 submissions and a question to @jyotish to realize this ^^
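Back-of-the-envelope, that means budgeting for one more process than the trainer-plus-rollout-workers count suggests. The arithmetic below is illustrative, not an official formula from the starter kit; the rollout worker count is a hypothetical value from your own experiment yaml, and the evaluation worker count matches the config above.

```python
# Rough process count once evaluation kicks in (illustrative only).
trainer_processes = 1          # the driver/trainer itself
num_rollout_workers = 6        # hypothetical num_workers from your yaml
evaluation_num_workers = 1     # from the config above; renders the videos

# Each of these processes may hold its own copy of the model, so GPU RAM
# headroom should be divided by this total, not just trainer + rollouts.
total_processes = trainer_processes + num_rollout_workers + evaluation_num_workers
print(total_processes)
```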
Dear organisers,
Is it possible to remove the Ray dependency in the second round?
From my perspective, Ray looks like too heavy a solution for this task.
Do you have any pure TF/PyTorch baselines to start from?