Hello!
Please ensure that your submissions honour the sampling budget of 8M timesteps when using frame skips in your environment wrappers.
For example, if your wrapper skips every other frame, 8M timesteps on the procgen environment correspond to 4M timesteps on your wrapped environment. In this case, you need to set `timesteps_total` to 4 million instead of 8 million.
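The arithmetic above can be sketched as a small helper. This is only an illustration: the function name `adjusted_timesteps_total` and the `frame_skip` argument are hypothetical and not part of the competition starter kit. It assumes every wrapped step consumes exactly `frame_skip` underlying frames.

```python
BUDGET = 8_000_000  # sampling budget, counted in underlying procgen timesteps


def adjusted_timesteps_total(budget: int, frame_skip: int) -> int:
    """Return the largest wrapped-environment step count that stays within `budget`.

    `frame_skip` is the number of underlying frames consumed per wrapped step
    (2 means every other frame is skipped, 1 means no frame skip).
    """
    if frame_skip < 1:
        raise ValueError("frame_skip must be at least 1")
    # Floor division so the underlying budget is never exceeded.
    return budget // frame_skip


# Skipping every other frame halves the value to configure.
print(adjusted_timesteps_total(BUDGET, 2))
```

If the wrapper can end a step early (for instance when an episode terminates mid-skip), the real usage will be somewhat below the budget, which is still within the rule.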
Why should we do this?
We want to ensure that all submissions use the same number of timesteps from the underlying environment. Using frame skip (or similar wrappers) changes the meaning of a “timestep” for the wrapped environment. One of the objectives of the competition is to measure the sample efficiency of RL algorithms, i.e., to measure how well an RL algorithm performs with exposure to limited data. It’s essential that all participants stick to the same data budget. Therefore, it is up to individual participants to adjust the `timesteps_total` parameter, when necessary, so that their submission uses no more than 8M timesteps from the underlying environment.
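One way to check your own setup is to count the underlying steps directly inside the wrapper. The sketch below is hypothetical: `CountingFrameSkip` and `DummyEnv` are illustrative names, and it assumes an environment whose `step` returns an `(obs, reward, done, info)` tuple. It is not the competition's evaluation code.

```python
class CountingFrameSkip:
    """Frame-skip wrapper (illustrative) that counts underlying environment steps."""

    def __init__(self, env, skip=2):
        self.env = env
        self.skip = skip
        self.underlying_steps = 0  # steps taken on the wrapped (underlying) env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        # Repeat the same action `skip` times, stopping early if the episode ends.
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            self.underlying_steps += 1
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info


class DummyEnv:
    """Stand-in environment that never terminates, used only for this demo."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


env = CountingFrameSkip(DummyEnv(), skip=2)
env.reset()
wrapped_steps = 1000
for _ in range(wrapped_steps):
    env.step(0)

# Each wrapped step consumed two underlying steps here.
print(wrapped_steps, env.underlying_steps)
```

Comparing `underlying_steps` against the 8M budget shows whether the configured `timesteps_total` is small enough.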
What happens to the submissions not honouring the sampling budget?
Any submission that does not adhere to the sampling budget will be invalidated during manual code inspection and will not be considered for the leaderboard.