The instructions state that:
"An agent can move in any arbitrary direction (if the environment permits it) and transition from one cell to the next. If the agent chooses a valid action, the corresponding transition will be executed and the agent’s position and orientation is updated."
Does this mean that an agent can change its direction of movement on any cell type, even if that is unusual for real trains?
For example, if a train has moved from cell A to B, can it decide to go back to A by the same transition?
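To make the question concrete, here is a toy sketch of one possible reading of the quoted rule. It is not the environment's actual API: `MOVES`, `STRAIGHT_EW`, and `step` are hypothetical names, and two things are assumptions on my part: that the allowed transitions of a cell depend on the heading the agent arrived with, and that an invalid action leaves the agent where it is.

```python
# Toy model only: names and rules here are assumptions, not the real environment API.
# Headings: 0 = north, 1 = east, 2 = south, 3 = west.
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

# Hypothetical transition table for one straight east-west cell:
# for each heading the agent arrives with, the set of headings it may leave with.
STRAIGHT_EW = {1: {1}, 3: {3}}


def step(position, heading, new_heading, cell_transitions):
    """Return (position, heading) after trying to leave the cell with new_heading."""
    allowed = cell_transitions.get(heading, set())
    if new_heading not in allowed:
        # Assumption: an invalid action changes nothing.
        return position, heading
    d_row, d_col = MOVES[new_heading]
    return (position[0] + d_row, position[1] + d_col), new_heading


# The agent went from A = (0, 0) to B = (0, 1) heading east.
b_position, b_heading = (0, 1), 1
# Trying to turn around on B and head back west to A:
after_reverse = step(b_position, b_heading, 3, STRAIGHT_EW)
# Continuing east instead:
after_forward = step(b_position, b_heading, 1, STRAIGHT_EW)
```

Under this reading the reversal on B is rejected and the agent stays on B facing east, while continuing east succeeds. If instead every cell allowed every exit heading, the reversal would be accepted. Which of the two behaviours is the intended one?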