I recently found that training is usually much worse on the cluster than in experiments on my local machine. There may be some statistical randomness in training, but the distribution of results seems to be consistently biased toward worse outcomes when submitting. Has anyone had similar experiences, or can you offer some suggestions? Thank you
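One way to tell random variation apart from a genuine local-vs-cluster gap is to pin every seed and compare runs directly. The sketch below is a hypothetical, framework-agnostic illustration using only the standard library; real frameworks need their own seeding calls as well (e.g. PyTorch's `torch.manual_seed` and cuDNN determinism flags).

```python
import random
import statistics

def seed_everything(seed: int) -> None:
    # Fix the RNG so repeated runs start from the same state.
    # In a real setup you would also seed numpy, the DL framework,
    # and set any hardware-determinism flags it exposes.
    random.seed(seed)

def noisy_training_metric(seed: int) -> float:
    # Stand-in for one training run: a "score" that depends
    # only on the seeded RNG state, not on the environment.
    seed_everything(seed)
    return statistics.mean(random.gauss(1.0, 0.1) for _ in range(100))

# Two runs with the same seed match exactly on one machine.
same = noisy_training_metric(0) == noisy_training_metric(0)

# If seeded runs match locally but still differ on the cluster,
# the gap is environmental (library versions, hardware, config),
# not training noise.
```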
Can you elaborate more on this?

> distribution of result seems to be biased to be always worse when submitting
Do you mean the agent doesn't learn as well as it does locally on your end, or that you are getting lower scores than expected during the rollouts?
You should be able to replicate the evaluation setup using the config from the FAQ: Round 1 evaluations configuration