Episode in evaluation ends prematurely

My latest evaluation yielded this:

  • Episode-5
    • State: Episode Complete :tada:
    • Floor Number: 2
    • Reward: 2.400
    • Steps: 649

I find this rich, considering that there is no way to die on the third floor without running out of time, and an agent starts out with 3000 frames of time even if it picks up no time orbs. Clearly, there is some bug in the environment or the evaluation.

Anyway, I just resubmitted and will hopefully not hit this bug again.

Maybe it’s reporting steps in action steps. Although there are 3000 time units at the start, each env step consumes 5 time units.

Nevertheless, we also find the evaluation results a bit weird.

Hi all,

We use a “frame-skip” of 5 within Obstacle Tower, so 3000 environment steps would be equivalent to 600 agent steps. If your agent is able to pick up orbs along the way, that would likely account for the reported 649 steps.
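
To make the arithmetic concrete, here is a rough sketch in Python. It assumes the timer drops by exactly one unit per environment frame and that the episode ended because the timer ran out; the constant and function names are made up for illustration and are not part of the Obstacle Tower API. The amount of time a single orb adds is not stated in this thread, so the sketch only derives how much extra time the agent must have gained in total.

```python
# Illustrative only: names below are hypothetical, not Obstacle Tower API.
FRAME_SKIP = 5          # environment frames per agent step (from this thread)
START_TIME_UNITS = 3000 # starting time budget (from this thread)
REPORTED_STEPS = 649    # "Steps" value from the evaluation output above


def agent_step_budget(time_units, frame_skip=FRAME_SKIP):
    """Agent steps available before the timer runs out.

    Assumes one time unit is consumed per environment frame.
    """
    return time_units // frame_skip


baseline_steps = agent_step_budget(START_TIME_UNITS)  # steps with no pickups
extra_steps = REPORTED_STEPS - baseline_steps         # steps beyond baseline
extra_time_units = extra_steps * FRAME_SKIP           # time that had to be gained

print(f"baseline agent steps: {baseline_steps}")
print(f"extra agent steps:    {extra_steps}")
print(f"extra time units:     {extra_time_units}")
```

Under those assumptions, the baseline is 600 agent steps, so the reported 649 steps imply roughly 245 extra time units gained along the way, which is consistent with picking up a few orbs rather than with a bug.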