Looking at the example you provided, it seems that agent 1 wants to enter an occupied cell.
This has to do with the way env.step executes the commands: it takes the agents' actions and executes them serially, in order of increasing agent ID.
This means that agent 1 tries to enter cell (6,9), which is still occupied by agent 2, who has not yet moved to cell (6,10). Because moving into an occupied cell is an illegal action, the env does not execute the action, and agent 1 stays outside the env, ready to depart.
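To make the ordering effect concrete, here is a minimal toy sketch of a serial update. It is not the actual env.step code; the function step_serial and the dictionary layout are hypothetical names chosen for illustration, and the only assumption taken from the explanation above is that actions are applied one agent at a time by increasing ID.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]


def step_serial(
    positions: Dict[int, Optional[Cell]],
    targets: Dict[int, Cell],
) -> Dict[int, Optional[Cell]]:
    """Toy model (hypothetical, not Flatland code): apply each agent's move
    one at a time, in order of increasing agent ID."""
    new_positions = dict(positions)
    for agent_id in sorted(targets):
        target = targets[agent_id]
        # Cells held by the other agents at this moment of the serial update.
        occupied = {
            cell
            for other, cell in new_positions.items()
            if other != agent_id and cell is not None
        }
        if target not in occupied:
            new_positions[agent_id] = target
        # Otherwise the move is illegal and the agent keeps its current state.
    return new_positions


# Agent 1 is not on the grid yet (None) and wants to enter (6, 9).
# Agent 2 sits on (6, 9) and wants to move on to (6, 10).
positions = {1: None, 2: (6, 9)}
targets = {1: (6, 9), 2: (6, 10)}
result = step_serial(positions, targets)
print(result)  # {1: None, 2: (6, 10)}
```

Agent 1 is handled first, sees (6,9) still taken, and stays outside; agent 2 then vacates the cell in the same step.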
I see your point.
It just seems strange that this is handled differently during the game.
So it seems that during the game you check next_field_occupied_at(t + 1),
while for spawning you check next_field_occupied_at(t),
which means that if you checked next_field_occupied_at(t + 1) for spawning as well, it would be OK for agent 1 to move to (6,9), since at t+1 (6,9) will not be occupied.
The log I sent you is a good example of this. If you always checked next_field_occupied_at(t), which you explained you do at spawning, then it would be impossible for trains to drive directly after each other, since the next field would always be occupied. But it is possible for two trains to drive directly after each other.
You could try to reproduce this by actively spawning agents at the same starting point directly after each other.
You will see that there will be a gap between these trains.
This will highly depend on the ordering of the trains.
This is a design choice, and there would be other solutions to this problem. In the case of Flatland, trains can drive directly behind each other as long as the index of the train in front is lower than the index of the train behind it. For example:
Train 2 can follow train 1 directly. But train 1 CANNOT follow train 2 directly.
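The same rule can be checked on a toy one-dimensional track. This is only a sketch under the assumption that moves are applied serially by increasing train index; the function run and the integer cell positions are made up for illustration and are not part of Flatland.

```python
from typing import Dict


def run(start: Dict[int, int], steps: int) -> Dict[int, int]:
    """Toy 1-D track (hypothetical, not Flatland code): every train tries to
    advance one cell per step, with moves applied by increasing train index."""
    pos = dict(start)
    for _ in range(steps):
        for train in sorted(pos):
            target = pos[train] + 1
            # Cells currently held by the other trains.
            occupied = {p for t, p in pos.items() if t != train}
            if target not in occupied:
                pos[train] = target
            # A blocked train simply waits for the next step.
    return pos


# Train 1 in front, train 2 directly behind it: they stay adjacent.
lower_index_in_front = run({1: 1, 2: 0}, steps=3)
print(lower_index_in_front)  # {1: 4, 2: 3}

# Train 2 in front, train 1 directly behind it: a one-cell gap opens.
higher_index_in_front = run({1: 0, 2: 1}, steps=3)
print(higher_index_in_front)  # {1: 2, 2: 4}
```

In the second run, train 1 is blocked in the first step because train 2 has not moved yet, so it trails with an empty cell in between from then on.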
Does this help?
Which log are you referring to? I’m happy to take a look at it.