After running the evaluation 20+ times, I’ve noticed that I always reach the 8th to 10th floor in the fifth round of the evaluation, but never in the other rounds. The results always look like [5, 5, 6, 6, 10], [5, 5, 6, 7, 9], [5, 5, 6, 5, 10], etc.
As I understand it, the seeds are fixed (e.g. here Juliani tests one of the seeds). The wording on the main page about five random seeds probably refers to five randomly preselected seeds.
We are using five fixed seeds for evaluation that are outside the range available during training. They are the same every time you run the evaluation. That being said, the agent's behavior itself is often non-deterministic, depending on the algorithm you use. There is also some slight non-determinism in the Unity engine physics, which can lead to slightly different results between runs. Hopefully that clarifies things for you.
Given how stochastic the evaluation is, I think running more trials on each evaluation seed would mitigate the strongly varying results. As it stands, just resubmitting the same agent may yield much better or much worse performance.
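To make the point concrete, here is a minimal sketch that summarizes the three example runs quoted at the top of this thread. It assumes each list holds the floor reached on each of the five fixed seeds, in the same order every run, and that a run's score is the plain average over those five values (that scoring rule is an assumption for illustration, not confirmed here). The helper names `summarize` and `run_averages` are made up for this example.

```python
from statistics import mean, pstdev

# Floors reached per evaluation seed, copied from the three example runs
# quoted in the first post (one list per evaluation run).
runs = [
    [5, 5, 6, 6, 10],
    [5, 5, 6, 7, 9],
    [5, 5, 6, 5, 10],
]


def summarize(runs):
    """Return (mean, population std dev) of floors reached for each seed position."""
    per_seed = list(zip(*runs))  # transpose: one tuple of results per seed
    return [(mean(s), pstdev(s)) for s in per_seed]


def run_averages(runs):
    """Average floor per run (assumed scoring rule, for illustration only)."""
    return [mean(r) for r in runs]


summary = summarize(runs)
for i, (m, s) in enumerate(summary, start=1):
    print(f"seed {i}: mean={m:.2f} std={s:.2f}")
print("per-run averages:", run_averages(runs))
```

With only three runs this is anecdotal, but in this sample the spread comes entirely from the fourth and fifth seeds. Averaging several trials per seed would smooth that out before the per-run score is computed.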