To train a model locally with visualization, I use the following code, which returns a dict observation_2 with 4 keys and 339 numbers in total. These numbers consist of the 2×11×11 target velocity field (2D target velocities on an 11×11 grid, 242 numbers) and the 97D body state (242 + 97 = 339).
[observation_2, reward, done, info] = env.step(_action,obs_as_dict=True)
Here it says S is a 97D vector representing the body state.
But in the evaluation environment, which cannot be visualized, the following code returns a dict observation with 14 keys and 688 numbers:
[observation, reward, done, info] = client.env_step(_action)
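One way to compare the two observations is to count their top-level keys and the scalar numbers nested inside them. The following is only a minimal sketch: `count_numbers` and `toy_obs` are hypothetical names, the key names in `toy_obs` are made up, and the toy dict only mimics the sizes quoted above (it has 2 keys, not the 4 of the real observation). It assumes the observation is built from plain dicts, lists and numbers; NumPy arrays would need `.tolist()` first.

```python
from collections.abc import Mapping, Sequence
from numbers import Number


def count_numbers(obs):
    """Recursively count the scalar numbers in a nested dict/list observation."""
    if isinstance(obs, Number):
        return 1
    if isinstance(obs, Mapping):
        return sum(count_numbers(v) for v in obs.values())
    if isinstance(obs, Sequence) and not isinstance(obs, (str, bytes)):
        return sum(count_numbers(v) for v in obs)
    return 0  # strings, None, etc. are not counted


# Hypothetical stand-in with the sizes from the post (key names are assumptions).
toy_obs = {
    "target_field": [[[0.0] * 11 for _ in range(11)] for _ in range(2)],  # 2 x 11 x 11
    "body_state": [0.0] * 97,  # 97D body state
}

print(len(toy_obs), count_numbers(toy_obs))
```

Running the same function on observation_2 and on observation, and then on each top-level value separately, would show which keys account for the difference between 339 and 688 numbers.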
When I use the first snippet to get observation_2, it reports the following error:
Attempt to call step function after max_steps=1000 in a single simulation. Please reset your environment before calling the step function after max_step s
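The message suggests that step was called more than max_steps times without a reset in between. The following is a hedged sketch of a loop that resets whenever an episode ends; it is not the osim-rl API itself. `run_steps` and `StubEnv` are hypothetical names, and `StubEnv` only imitates the step limit so the loop can be run without the simulator. With the real local environment, the step call would be the one shown above (with obs_as_dict=True).

```python
class StubEnv:
    """Hypothetical stand-in that refuses to step past max_steps without a reset."""

    def __init__(self, max_steps=1000):
        self.max_steps = max_steps
        self.t = 0
        self.resets = 0

    def reset(self):
        self.t = 0
        self.resets += 1
        return [0.0]

    def step(self, action):
        if self.t >= self.max_steps:
            raise RuntimeError("step called after max_steps without reset")
        self.t += 1
        done = self.t >= self.max_steps
        return [0.0], 0.0, done, {}


def run_steps(env, policy, n_steps, max_steps=1000):
    """Step env n_steps times, resetting whenever an episode ends."""
    obs = env.reset()
    steps_in_episode = 0
    for _ in range(n_steps):
        obs, reward, done, info = env.step(policy(obs))
        steps_in_episode += 1
        # Reset on done, or defensively once the step limit is reached.
        if done or steps_in_episode >= max_steps:
            obs = env.reset()
            steps_in_episode = 0
    return obs


env = StubEnv()
run_steps(env, policy=lambda obs: 0, n_steps=2500)
print(env.resets, env.t)
```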
So my question is: how can I get the 97D body state in the evaluation environment?
Thank you very much.