Thank you everyone for your feedback on Round 1! Here’s a summary of the problems encountered so far, and how we plan to address them.
TL;DR: Round 2 will be similar to Round 1 but with many more environments. The 8-hour overall time limit won’t cause submissions to fail anymore. Prizes will be announced soon. Reported bugs are being fixed. Round 2 is pushed back by one week while we address all the feedback.
EDIT: We are still hard at work addressing issues from Round 1 and preparing Round 2. To make sure everything goes well when we start the next round, we are pushing Round 2 back by an extra week (to August 14th).
The 8-hour overall time limit is too strict!
This is the most common problem: it’s very hard to get an RL solution to finish in time.
To fix this, we will make this time limit a “soft timeout”: if your submission takes more than 8 hours, it won’t be cancelled anymore. Instead, all the remaining episodes that it didn’t have time to solve will receive a score of -1.0.
To make this process fair, the order of the evaluation environments will be fixed. The environments will also be ordered in increasing order of size.
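As a rough illustration of the rule described above, here is a minimal sketch in Python. It is not the actual evaluator code: the function name, its parameters and the `run_episode` callback are hypothetical, and it assumes that an episode already running when the limit is reached is allowed to finish, which this post does not specify.

```python
import time
from typing import Callable, List, Sequence

# Score given to episodes the submission did not reach in time (value from the post).
TIMEOUT_SCORE = -1.0


def evaluate_with_soft_timeout(
    env_sizes: Sequence[int],
    run_episode: Callable[[int], float],
    time_limit_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Return one score per environment, visited in increasing order of size.

    Hypothetical helper: `run_episode` stands in for whatever runs a single
    evaluation episode and returns its score.
    """
    ordered = sorted(env_sizes)  # fixed order, smallest environments first
    start = clock()
    scores = []
    for size in ordered:
        if clock() - start >= time_limit_s:
            # Soft timeout: the submission stays valid, the episode scores -1.0.
            scores.append(TIMEOUT_SCORE)
        else:
            # Assumption: an episode that has started is allowed to finish.
            scores.append(run_episode(size))
    return scores
```

Because the environments are visited smallest first, a slow submission still gets real scores on the small environments and only receives -1.0 on the larger ones it never reached.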
The environment is too slow
The Flatland environment does get slow when running larger environments!
This is a problem in two situations. First, for submissions: in this case it could push solutions over the 8-hour overall time limit. Now that this time limit will be “soft”, this won’t be such a big problem anymore. Yes, the environment will still take a large chunk of the time during the evaluation process. But your submission will be valid even if it takes too long, and the environment takes the same amount of time for all participants, so things are fair.
Still, the speed of the environment limits how fast you can train new agents and experiment with new ideas. We will release a new version that includes a number of performance improvements to alleviate this issue for Round 2.
I don’t want people to see videos of my submissions
Some participants have expressed the wish to hide their submission videos.
This is not something we plan to provide. Our goal is to foster open and transparent competition, and showing videos is part of the game: participants can glean some information from them to get new ideas.
One strategy would be to wait for the last minute to “hide your hand”. This is possible, but can be risky, as the number of submissions per day is limited, so it is generally better to secure a good position on the leaderboard as soon as possible!
We still don’t know what the prizes will be!
The original prizes were travel grants to NeurIPS - but sadly the conference will be fully virtual this year.
This forced us to look again for new sponsors for the prizes. While we can’t announce anything yet, things are progressing, and we’re hoping to announce exciting prizes by the time Round 2 starts.
The margin of progression for OR is too small 💇
OR solutions reached a 100% completion rate in a matter of days in Round 1, and are now fighting over thousandths of a point. Since the overall time limit is now “soft”, we will simply add many more evaluation episodes, including much larger environments, to allow a larger margin of progression for all solutions.
Documentation is still lacking
Flatland is a complex project that has been developed by dozens of people over the last few years. We have invested a lot of energy to gather all the relevant information at flatland.aicrowd.com, but we realise there is still a lot of work ahead.
We will keep working on this, but this is a large task where your contribution is more than welcome. Contributing to the documentation would make you an official Flatland Contributor! Check out https://flatland.aicrowd.com/misc/contributing.html to see how you can help.
Various bugs are making our lives harder
Here’s a list of known bugs we plan to squash before Round 2 starts:
Debug submissions count the same as full submissions
When a submission is done, the percentages and other metrics reported in the Gitlab issues are nonsensical (“-11.36% of agents done”)
Rendering bug showing agents in places where they shouldn’t be
We’re hard at work to address all these issues. We have moved the starting date of Round 2 one week back to give us time to implement and deploy all the necessary changes.
We’re still open to comments, complaints and requests! Please fill out the survey if you haven’t done so: