Is it possible/allowed for the agent to get the Rmax setting of the current environment? This is used, for example, in the Procgen baselines paper (Appendix D.2) for the distributional min/max values.
I do not think there is a way to get the max reward value from the gym environments. [refer]
The Rmin and Rmax for the publicly available environments are available at
For the private env used for round 1 (caterpillar), the min and max rewards are
[R_min, R_max] = [8.25, 24].
Please note that Rmin is the score for an agent that has no access to the observations, not the minimum possible score.
@jyotish Curious about how to reconcile this post with the new announcement of no environment-specific logic. Can we make use of the known max and min rewards of each specific environment?
Yes, using the environment name to determine min/max rewards is perfectly fine.
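For anyone wondering what that could look like in practice, here is a minimal sketch of a name-based lookup. The table `REWARD_BOUNDS` and the helpers `reward_bounds` and `normalize_return` are hypothetical names made up for illustration, not part of gym or the competition starter kit. Only the caterpillar values come from this thread; entries for the public environments would have to be filled in from the published list. The normalization shown is the usual (R - R_min) / (R_max - R_min) form, and using it here is an assumption.

```python
from typing import Dict, Tuple

# Hypothetical lookup table keyed by environment name.
# Only the caterpillar entry is taken from this thread; add the public
# environments from the published R_min / R_max list as needed.
REWARD_BOUNDS: Dict[str, Tuple[float, float]] = {
    "caterpillar": (8.25, 24.0),
}


def reward_bounds(env_name: str) -> Tuple[float, float]:
    """Return (r_min, r_max) for a known environment name."""
    if env_name not in REWARD_BOUNDS:
        raise KeyError(f"No reward bounds recorded for environment {env_name!r}")
    return REWARD_BOUNDS[env_name]


def normalize_return(env_name: str, episode_return: float) -> float:
    """Map an episode return so that r_min -> 0 and r_max -> 1.

    Since r_min is the score of an agent without access to observations
    (not the minimum possible score), the result can be negative.
    """
    r_min, r_max = reward_bounds(env_name)
    return (episode_return - r_min) / (r_max - r_min)
```

The same `(r_min, r_max)` pair could also be passed to whatever sets the min/max of a distributional value head, which is the use case from the original question.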