Hi, when I call “env.observation_space.high” and “env.observation_space.low” for the above-mentioned two environments, they return +∞ and -∞ respectively for all elements.
Does this mean we should just hardcode the limits (from the given table) for the two environments, or will this issue be fixed? Please let us know!
In addition, when sampling the next state through env.step(), the value for theta_dot exceeds 5.
The DeepMind environments in bsuite don’t expose actual observation limits, so we estimated them by running our own algorithms.
You are free to use whatever limits you find and any binning strategy you want; the goal is to maximize the scores for the environments. Consider that part of the challenge.
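For anyone unsure what a binning strategy with hand-picked limits could look like, here is a minimal sketch. The function name `discretize` and the limits in the usage example are made up for illustration; they are not the official values. It clips each component to the assumed range first, so a theta_dot beyond the assumed limit simply lands in the outermost bin.

```python
def discretize(obs, low, high, n_bins):
    """Map a continuous observation to a tuple of bin indices.

    obs, low, high and n_bins are sequences of equal length.
    low/high are the limits you decide to assume (hypothetical here).
    """
    indices = []
    for x, lo, hi, n in zip(obs, low, high, n_bins):
        # Clip so values outside the assumed limits fall into the edge bins.
        x = min(max(x, lo), hi)
        # Scale to [0, 1], then to an integer bin index in [0, n - 1].
        frac = (x - lo) / (hi - lo)
        indices.append(min(int(frac * n), n - 1))
    return tuple(indices)


# Illustrative limits only: two components, 10 bins each.
state = discretize([0.0, 7.3], low=[-1.0, -5.0], high=[1.0, 5.0], n_bins=[10, 10])
```

Uniform bins are only one option; non-uniform bins (finer near zero, for example) may score better on some environments.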
If you find a large deviation from our provided limits, do let us know. (A Colab link to reproduce the limits you find would be even more helpful.)
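One rough way to estimate limits empirically is to track the running minimum and maximum of every observation component over many rollouts. The sketch below assumes you have already collected observations into a list (for example, by calling env.step() in a loop); the helper name `estimate_limits` and the sample values are hypothetical, and this is not necessarily how the provided limits were obtained.

```python
def estimate_limits(observations):
    """Return per-component (low, high) lists seen across observations."""
    low = high = None
    for obs in observations:
        if low is None:
            # First observation initializes both bounds.
            low, high = list(obs), list(obs)
            continue
        for i, x in enumerate(obs):
            low[i] = min(low[i], x)
            high[i] = max(high[i], x)
    return low, high


# Made-up observations standing in for values collected from rollouts.
samples = [[0.1, -2.0], [0.5, 6.2], [-0.3, 1.0]]
low, high = estimate_limits(samples)
```

Keep in mind that limits found this way depend on the policy that generated the rollouts, so a random policy and a trained one may visit quite different ranges.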