Of the four games, the plots show some (e.g. coinrun, bigfish) training for 2.5M timesteps, while only miner trains for 8M timesteps (see submission #75308). Is this a plotting bug, or are the numbers of training steps actually different?
Hello @kaixin,
This looks like a bug in plotting on the issue page. The complete plots and the corresponding logs (for 8M steps) are available on the submission dashboard. You can access the dashboard by following this link