The evaluation result does not match my local testing

#1

Hi all,

I have just submitted a pretrained version for evaluation, but the result shows it ran only 1 episode with 4 steps; however, when I ran evaluate_local.sh on my own machine, the output was 5 episodes with at least 5k steps each. The code files are completely identical. Does anyone have any idea what's going on here? Thanks.

#2

I have the same problem.
The evaluation succeeds, but runs only 1 episode with 4 steps.

#3

Hey, I just solved mine. The issue was the TensorFlow version. By default the system installs TF 2.0, while my code runs on TF 1.13, and the unexpected part is that the system doesn't report any error about it. Hope this helps.
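Since the system apparently installs the wrong TF version without any error, one way to catch this is to assert the framework version at the very top of the main script so a mismatch fails loudly instead of silently. A minimal sketch, assuming a simple prefix match is enough (`require_version` is just a hypothetical helper, not part of the evaluation harness):

```python
def require_version(actual: str, required_prefix: str, name: str = "package") -> None:
    """Fail fast if the installed version doesn't match the pinned one.

    Naive prefix check on the version string; good enough to catch a
    TF 2.0 vs. TF 1.13 mix-up, not a full semver comparison.
    """
    if not actual.startswith(required_prefix):
        raise RuntimeError(
            f"{name} version mismatch: found {actual}, expected {required_prefix}.x"
        )

# At the top of the main script, before anything else runs:
# import tensorflow as tf
# require_version(tf.__version__, "1.13", name="tensorflow")
```

If the evaluator really does swap in TF 2.0, this should at least turn the silent 4-steps-1-episode result into an explicit crash message.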

#4

My code also runs on TF 1.13.1, and that exact version is pinned in my environment.yml file.
I'm still confused about what went wrong.

#5

I have the same problem.
The evaluation succeeds, but runs only 1 episode with 4 steps.
But I'm using PyTorch, so I don't know what actually happened.

#6

As far as I can tell, the 4-steps-1-episode result seems to happen whenever something goes wrong inside the main script, while the system reports none of it (which is frustrating, indeed). So all I can do is delete parts of my code one by one and resubmit, again and again, until I find the critical part…

#7

That's too bad.
Check my reply above and see if it helps.

#8

Thanks a lot! But the new maximum number of submissions is 25, so that approach is too costly…