So I was a little bored and decided to see how well I could play the procgen games myself.
python -m procgen.interactive --distribution-mode easy --vision agent --env-name coinrun
First I tried each game for 5-10 episodes to figure out what the keys do, how the game works, etc.
Then I played each game 100 times and logged the rewards. Here are the results:
| Environment | Mean reward | Mean normalized reward |
The mean normalized score over all games was 0.882. It stayed relatively constant throughout the 100 episodes, i.e. I didn’t improve much while playing.
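For anyone wanting to reproduce the normalized column, here is a minimal sketch. It assumes the usual min-max normalization, (R - R_min) / (R_max - R_min), applied to each game's mean reward and then averaged over games. The function names and the numbers below are placeholders for illustration, not my logged data or the official per-game constants.

```python
from statistics import mean


def normalized_score(episode_returns, r_min, r_max):
    """Min-max normalize the mean episode return using the bounds r_min and r_max."""
    if r_max <= r_min:
        raise ValueError("r_max must be greater than r_min")
    return (mean(episode_returns) - r_min) / (r_max - r_min)


def mean_over_games(scores_by_game):
    """Average the per-game normalized scores into a single number."""
    return mean(scores_by_game.values())


# Hypothetical returns and bounds, chosen only to make the arithmetic easy to follow.
example_returns = [10.0, 10.0, 0.0, 10.0]
example_score = normalized_score(example_returns, r_min=5.0, r_max=10.0)

# Hypothetical per-game scores; "game_a" and "game_b" are made-up names.
example_overall = mean_over_games({"game_a": example_score, "game_b": 1.0})
```

With real logs you would pass the 100 logged returns for each game together with that game's own bounds.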
I’m not sure how useful this result would be as a “human benchmark” though - I could easily reach a score of ~1.000 given enough time to think on each frame. Also, human visual reaction time is ~250 ms, which at 15 fps means our actions lag by roughly 4 frames (0.25 s × 15 fps ≈ 3.75). That can matter for games like starpilot, chaser and some others.
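The frame-lag estimate is just reaction time multiplied by frame rate. A quick sketch of that arithmetic follows; the helper name is made up, and ~250 ms is the rough figure quoted above, not something I measured.

```python
import math


def frames_behind(reaction_time_s, fps):
    """Number of frames that elapse during one reaction time."""
    return reaction_time_s * fps


# Rough figures from the post: ~250 ms reaction time, 15 frames per second.
lag_frames = frames_behind(0.250, 15)

# Round up to whole frames, since an action can only land on a frame boundary.
lag_whole_frames = math.ceil(lag_frames)
```

The same helper makes it easy to see how the lag would change at other frame rates.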