Hello everyone, I am seeing a significant performance difference between PyTorch and TF under the same experimental settings.
Following @ttom, @dipam_chakraborty and @jyotish, I tried to reproduce the 8th-place solution from "Solution Summary (8th) and Thoughts" in PyTorch. However, with the same settings (default PPO algorithm, NN model, weight initialization, hyperparameters, and Adam epsilon), TF significantly outperforms PyTorch on Procgen; e.g. TF scores 20 more than PyTorch on starpilot. What's more, TF finishes about 15 minutes faster than PyTorch when training for 8M environment time-steps. I tried different Ray binaries (0.8.6, 0.8.7) and PyTorch versions (1.4.0, 1.5.0, 1.6.0, 1.7.0), but got the same results.
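One thing that may be worth ruling out even when the Adam epsilon value matches is where epsilon enters the update. The TensorFlow Adam documentation notes that its epsilon is the "epsilon hat" of the Kingma and Ba paper, while, as far as I can tell, recent PyTorch versions add eps after bias correction (this is an assumption to verify against the exact versions used). A minimal pure-Python sketch of the two forms for a single scalar parameter, with hypothetical function names and illustrative values:

```python
import math

def adam_step_algorithm1(m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Update size with eps added after bias-correcting v (Algorithm 1 form)."""
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return lr * m_hat / (math.sqrt(v_hat) + eps)

def adam_step_epsilon_hat(m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Update size with eps added to the uncorrected sqrt(v) ("epsilon hat" form)."""
    lr_t = lr * math.sqrt(1 - b2 ** t) / (1 - b1 ** t)
    return lr_t * m / (math.sqrt(v) + eps)

# Illustrative first step (t=1) with a very small gradient g.
g = 1e-8
m = (1 - 0.9) * g          # first-moment estimate after one step
v = (1 - 0.999) * g * g    # second-moment estimate after one step

print("Algorithm 1 form :", adam_step_algorithm1(m, v, t=1))
print("epsilon-hat form :", adam_step_epsilon_hat(m, v, t=1))
```

With eps set to 0 the two forms give the same step; they only diverge when gradients are small relative to eps, so this may or may not matter for PPO on Procgen.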
I also noticed that TF and PyTorch use different padding strategies; see https://stackoverflow.com/questions/61422046/resnet-model-of-pytorch-and-tensorflow-give-different-results-when-stride-2
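To make the padding difference concrete, here is a small sketch that computes the padding each framework applies along one spatial dimension. It assumes TF's `padding="SAME"` on one side and a PyTorch layer with a symmetric integer `padding` on the other; the helper names and the 64-pixel input, 3x3 kernel, stride 2 case are illustrative choices, not taken from the 8th-place solution.

```python
import math

def tf_same_padding(in_size, kernel, stride):
    """Return (pad_before, pad_after, out_size) for TF padding="SAME"."""
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + kernel - in_size, 0)
    pad_before = pad_total // 2          # any extra pixel goes at the end
    pad_after = pad_total - pad_before
    return pad_before, pad_after, out_size

def torch_symmetric_padding(in_size, kernel, stride, padding):
    """Return (pad_before, pad_after, out_size) for a PyTorch layer with padding=padding."""
    out_size = (in_size + 2 * padding - kernel) // stride + 1
    return padding, padding, out_size

# Illustrative case: 64-pixel side, kernel 3, stride 2.
print("TF SAME        :", tf_same_padding(64, 3, 2))
print("PyTorch pad=1  :", torch_symmetric_padding(64, 3, 2, 1))
```

In this case both give the same output size, but TF pads only at the end while PyTorch pads both sides, so every window is shifted by one pixel. For a convolution, one way to mimic the TF behaviour in PyTorch is to pad explicitly with `torch.nn.functional.pad(x, (0, 1, 0, 1))` and then use `padding=0` in the layer.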
Are there any suggestions for this problem? Did anyone get high scores in the competition with PyTorch?