Can you please clarify:
Does each train have an individual target time (deadline) after which it is considered a fail (i.e., not done), or should all trains simply arrive by the time the simulation completes, given by:
The time step limit is per episode. It is the total number of environment steps you can take before the environment terminates.
Currently we don't impose stricter schedules on individual trains, so each train can run whenever suits it best within the total time window.
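As a rough illustration of that rule (a sketch, not the actual environment code): the only deadline is the shared episode budget, so a train counts as done if it arrives at any step within it. The names `trains_done`, `arrival_steps` and `max_episode_steps` are hypothetical, and treating an arrival exactly on the last step as done is an assumption.

```python
from typing import Dict, Optional


def trains_done(
    arrival_steps: Dict[int, Optional[int]],
    max_episode_steps: int,
) -> Dict[int, bool]:
    """Map each train id to whether it arrived within the episode budget.

    arrival_steps holds the step at which each train reached its target,
    or None if it never arrived. There is no per-train deadline here:
    every train is checked against the same episode-wide limit.
    """
    return {
        train: step is not None and step <= max_episode_steps
        for train, step in arrival_steps.items()
    }


# Train 1 arrives late in the episode but still counts as done;
# train 2 never arrives, so it does not.
arrivals = {0: 12, 1: 95, 2: None}
print(trains_done(arrivals, max_episode_steps=100))
# → {0: True, 1: True, 2: False}
```

So arriving at step 12 or step 95 makes no difference to whether the train is done, only arriving before the episode ends matters.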