Is it possible/allowed for the agent to get the Rmax setting of the current environment? This is used, for example, in the Procgen baselines paper (Appendix D.2) for the distributional min/max values.
I do not think there is a way to get the max reward value from the gym environments. [refer]
The Rmin and Rmax for the publicly available environments are available at
For the private env used for round 1 (caterpillar), the min and max rewards are
[R_min, R_max] = [8.25, 24].
Please note that Rmin is the score of an agent that has no access to the observations; it is not the minimum possible score.
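Since the values cannot be queried from the environment, one option is to hard-code them on the agent side. Below is a minimal sketch of how they might be used, assuming the goal is the usual min-max return normalization (R - R_min) / (R_max - R_min). The function name `normalize_return` and the constant names are hypothetical and not part of any competition API; only the caterpillar numbers come from this thread.

```python
# Hypothetical constants: values for the round 1 private env (caterpillar),
# copied from the post above. They are hard-coded, not read from the env.
CATERPILLAR_R_MIN = 8.25
CATERPILLAR_R_MAX = 24.0


def normalize_return(raw_return, r_min=CATERPILLAR_R_MIN, r_max=CATERPILLAR_R_MAX):
    """Min-max normalize an episode return to roughly [0, 1].

    Assumption: the standard (R - R_min) / (R_max - R_min) scheme is wanted.
    """
    if r_max <= r_min:
        raise ValueError("r_max must be greater than r_min")
    return (raw_return - r_min) / (r_max - r_min)


if __name__ == "__main__":
    for raw in (5.0, 8.25, 16.125, 24.0):
        print(raw, normalize_return(raw))
```

Because Rmin is a no-observation baseline rather than a true lower bound, an agent that does worse than that baseline gets a negative normalized score, so the result should not be clipped or assumed to lie in [0, 1].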