[ANNOUNCEMENT] Updated Rules and Clarification




We have updated the rules to clarify how scoring and final ranking will be done in this challenge.

tl;dr: Prizes will be awarded according to the Round 2 leaderboard only.

Why the different rounds?

We use the different rounds to introduce the complex problem of vehicle re-scheduling in a step by step manner.

Round 0 (Beta):
This round was intended to find remaining bugs in the Flatland environment and to give a simple introduction to the repository.

Round 1 Avoiding Conflicts:
In contrast to other means of transportation, a railway network is prone to dead-locks, where trains are unable to resolve the conflict. This round is intended for participants to figure out how to detect such dead-locks in a timely manner and avoid them.
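As a rough illustration of what detecting a dead-lock can mean, the sketch below looks for cycles in a "wait-for" graph: each blocked agent points at the agent occupying the cell it wants to enter. The `waits_for` mapping and the function name are hypothetical and are not part of the Flatland API. A cycle is only a candidate dead-lock; whether it can still be resolved depends on the track layout.

```python
def find_deadlock(waits_for):
    """Return the agents forming a cycle in the wait-for graph, or None.

    waits_for maps an agent id to the id of the agent occupying the cell
    it wants to enter next (hypothetical representation, not Flatland's).
    """
    for start in waits_for:
        seen = []
        agent = start
        # Follow the chain of blockers until it ends or repeats.
        while agent in waits_for and agent not in seen:
            seen.append(agent)
            agent = waits_for[agent]
        if agent in seen:
            # The chain looped back: everything from `agent` onward is a cycle.
            return seen[seen.index(agent):]
    return None


# Two trains facing each other on a single track block one another.
head_on = {0: 1, 1: 0}
# Train 0 waits for train 1, which waits for train 2, which is free to move.
queue_only = {0: 1, 1: 2}
```

Running such a check every time step, before committing to the next actions, is one way to notice a conflict while there is still room to choose another route.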

Round 2 Optimizing Traffic:
The complexity of this round is far greater than that of the previous rounds. Introducing mixed speed profiles for different trains in the network leads to new traffic management problems. In addition to avoiding dead-locks, the ordering of trains on a given route as well as the chosen routes will impact the punctuality of the trains. This round is very close to the real-world problem faced in daily operations by any railway company. Given the complexity of this last round compared with the previous rounds, it outweighs them in the final scoring. Therefore only this round is considered for the final scoring.

We clarified this in the updated rules, as there were two conflicting statements in the previous version of the rules.

We wish you a lot of fun with the challenge and encourage you to openly discuss ideas on how to tackle this real-world problem with novel approaches.



As you can see on the leaderboard already, avoiding conflicts and reaching destinations within the maximum allowed time steps is rather easy in Round 1 (meaning all 1000 secret cases can be solved perfectly from this perspective). The only interesting part remaining in Round 1, in my opinion, is trying to maximize the mean reward. This is a non-trivial task and I personally have many ideas that I would have liked to try. However, given that Round 1 will not count towards the final standings, and given that I don’t know many details about the rules and test sizes for Round 2, I am now reluctant to spend any more time improving the mean reward for Round 1, since it’s possible that any techniques I use or develop for this will be unusable in Round 2.

My personal preference is to start Round 2 as soon as possible, in order to start solving the interesting problems :slight_smile: Is the timeline for Round 2 still the one mentioned in the Overview section (from mid-August to December 1st)?


Hi @mugurelionut

This was an expected outcome and one of the reasons we chose to do the challenge setup with these distinct rounds. We highly encourage the use of both classical OR algorithms as well as novel RL solutions and thus set round 0 to be well suited for OR algorithms, whereas round 2 with introduced stochasticity is better suited for learning algorithms.

Round 1 instances can be solved using classical planning algorithms, because there is no stochasticity involved in the environment and the complexity is rather low due to the speed of all trains being equal. This round serves as a kind of benchmark for RL approaches to see if they can achieve results similar to those of classical approaches.

Round 2 will differ from round 1 in that there are stochastic events occurring during an episode, so pre-planned trajectories will not be feasible anymore. The introduction of different speed profiles for different agents will also increase complexity.

I understand that tuning the optimality for round 1 is not the most interesting challenge. I therefore encourage you to prepare your code for stochastic events such as:

  • blocked cells forcing agents to reroute
  • delayed departures causing agents to start later than expected
  • unexecuted actions causing agents to replan their actions
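
One way to prepare for events like these is to keep a planned path per agent and recompute it only when it is no longer valid. The sketch below does this on a toy 4-connected grid with breadth-first search. The grid representation and the names `shortest_path` and `replan_if_needed` are assumptions made for illustration; they are not Flatland functions, and real rail cells restrict transitions far more than a plain grid does.

```python
from collections import deque


def shortest_path(free_cells, start, goal, blocked=frozenset()):
    """Breadth-first search on a 4-connected grid; returns a list of cells or None."""
    if start in blocked or goal in blocked:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt in free_cells and nxt not in blocked and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None


def replan_if_needed(free_cells, position, goal, plan, blocked):
    """Keep the current plan unless the agent is off it or it crosses a blocked cell."""
    if plan and plan[0] == position and not blocked.intersection(plan):
        return plan
    return shortest_path(free_cells, position, goal, blocked)


# Toy 2x3 grid: two parallel rows connected at every column.
grid = {(r, c) for r in range(2) for c in range(3)}
plan = shortest_path(grid, (0, 0), (0, 2))
detour = replan_if_needed(grid, (0, 0), (0, 2), plan, {(0, 1)})
```

In this sketch a delayed departure leaves the plan untouched, since the agent is still on its first cell. A blocked cell on the path triggers a fresh search, as does an agent finding itself somewhere other than where its plan expects.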

We are still aiming for a mid-August release and will update you with more details about how stochastic events will be represented and with what probability they will occur.

We also encourage you to discuss with other participants how you solved the task and to share your code publicly (there is also a community prize being awarded for contributions to the whole community).

If you have code that can be integrated into FLATLAND to help everybody obtain better results, don’t hesitate to contact us or open an issue on gitlab.

Best regards


My team and I are ready to start working on a solution for Round 2, but since the rules have changed a bit we would like to get more information about Round 2 before we start.

We would like to know:

  • what test sizes should we expect
  • if you are thinking about adding more stochastic events or changing some of them (initially only different agent speeds were mentioned)
  • will the time restriction for testing/evaluating remain the same as it is in Round 1?

I believe everyone would appreciate a post from you where we could find details for Round 1 and/or Round 2. This post should ideally also include information about what is set in stone, what may change, etc. There is important information in some threads in this forum, but I would love to find everything in one place. I think newcomers would appreciate it even more, since the overview of the challenge contains quite old information.


Hi @kalinja6

Thank you for your message. We agree that more information would be great for the whole community.
We have had many discussions with the people involved in planning and scheduling here at SBB in order to formulate the challenge as close to real-life scenarios as possible.

We are currently working on these modifications. Tomorrow I will update the landing page, add a FAQ and make an announcement here in the forums about round two and what the new difficulties will be.

Thank you for your patience and your participation in this challenge.

Feel free to reach out to us if there are any further questions or suggestions.

Best regards,

The Flatland Team