Round 2 is starting!
Many thanks to the participants who have experimented with Round 2 for the past weeks and helped us iron bugs out. Flatland is an active research project, and as such it is always a challenge to come up with a good problem definition and a stable evaluation setup.
Problem Statement in Round 2
In Round 1, your submissions had to solve a fixed number of environments within an 8-hour time limit.
Timeouts were a major challenge in Round 1: submissions would fail if the agents didn’t act fast enough, and it was hard to complete all the evaluation episodes within the time limit (especially for RL!).
In Round 2, we have done everything we could to make the experience smoother. Your submission now needs to solve as many environments as possible in 8 hours. There are enough environments that even the fastest solutions can’t solve them all in that time.
This removes a lot of problems:
- If your submission is slow, it will solve fewer episodes, but will still show up on the leaderboard.
- If your submission crashes at some point, you will still receive the points accumulated until the crash.
- If your submission takes too long for a given timestep, you won’t receive any reward for that episode but the evaluation will keep going (this was already the case at the end of Round 1).
The environments start very small and grow progressively larger. The evaluation stops if the percentage of agents reaching their targets drops below 25% (averaged over the last 10 episodes), or after 8 hours, whichever comes first. Each solved environment awards you points, and the goal is to collect as many points as possible.
Read the doc for more details about the rewards: https://flatland.aicrowd.com/getting-started/prize-and-metrics.html
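As a rough illustration of the stopping rule described above (not the evaluator’s actual code), checking whether the mean completion rate over the last 10 episodes has dropped below 25% could look like this; `should_stop` is a hypothetical helper:

```python
def should_stop(completions, window=10, threshold=0.25):
    """Stop evaluation when the mean completion rate over the last
    `window` episodes falls below `threshold` (here: 25% over 10).

    `completions` is a list of per-episode completion rates in [0, 1].
    """
    if len(completions) < window:
        return False  # not enough episodes yet to apply the rule
    recent = completions[-window:]
    return sum(recent) / window < threshold
```

For example, ten consecutive episodes at 10% completion would trigger the stop, while ten episodes at 90% would not.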
This means that the challenge is not only to find the best solutions possible, but also to find solutions quickly. This is consistent with the business requirements of railway companies: it’s important for them to be able to re-route trains as fast as possible when a malfunction occurs.
As in Round 1, the environment specifications are publicly accessible.
Note: Now that the environments are evaluated in order (from small to large), you should test your submissions locally under the same conditions. You can set the --shuffle flag to False when calling the evaluator to get consistent behavior:
flatland-evaluator --shuffle False
Here’s what changed from Round 1:
- Submissions now have to solve as many environments as possible in 8 hours (see above).
- The time limits are now 10 seconds per timestep and 10 minutes for pre-planning (double the Round 1 limits).
- Evaluations will be interrupted after 10 consecutive timeouts (same as Round 1).
- The submission limits are now: 10 debug & 5 non-debug submissions per day (24h sliding window).
- Since Round 2 is starting late, we have moved its end date to November 6th.
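For illustration, the 24h sliding window mentioned above can be sketched in a few lines of Python; `SlidingWindowLimit` is a hypothetical helper, not AIcrowd’s actual implementation:

```python
from collections import deque

class SlidingWindowLimit:
    """Allow an action only if fewer than `limit` actions happened
    in the past `window_s` seconds (a sliding window, not a fixed
    daily reset)."""

    def __init__(self, limit, window_s=24 * 3600):
        self.limit = limit
        self.window_s = window_s
        self.times = deque()  # timestamps of recent accepted actions

    def allow(self, now):
        # drop timestamps that have fallen out of the window
        while self.times and now - self.times[0] >= self.window_s:
            self.times.popleft()
        if len(self.times) < self.limit:
            self.times.append(now)
            return True
        return False
```

With `limit=5`, a sixth submission inside the same 24h window is rejected, but becomes possible again as soon as the oldest one ages past 24 hours.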
New Starter Kit - Submittable out of the box!
Writing your first submission can be a bit of a challenge: you need to get used to the AIcrowd submission system, list the correct software dependencies, make sure your code respects the time limits…
We have updated the starter kit: instead of a random agent, it now contains a fully functional PyTorch DQN RL agent that you can submit right away!
New Flatland Release
We have published a new release of Flatland.
It includes the improvements we have mentioned in previous posts:
- Better performance, especially for smaller environments (https://discourse.aicrowd.com/t/round-1-has-finished-round-2-is-starting-soon/3465)
- Train Close Following, which helps RL algorithms to learn (https://discourse.aicrowd.com/t/train-close-following/3464)
- Many small bugfixes and improvements
Thanks to our partner NVIDIA, we are happy to announce some prizes for this challenge!
- 1st prize: GeForce RTX 2080 Graphics Card
- 2nd prize: NVIDIA Jetson Nano Developer Kit
- 1st prize: NVIDIA Jetson Nano Developer Kit
- 2nd prize: NVIDIA Jetson Nano Developer Kit
The original 4 travel grants to NeurIPS are replaced by travel grants to visit us at EPFL (Lausanne, Switzerland).
We are still open to accepting additional sponsors; if interested, please contact us at firstname.lastname@example.org
- For the past few days, many previously working submissions have been failing with build problems. This seems to be caused by an update to an external dependency. See this thread: Build problem with the current `environment.yml` file. The new starter kit doesn’t have this issue.
- One known problem with the new evaluation setting: if your submission crashes and never calls remote_client.submit(), your score won’t appear on the leaderboard. We are investigating this. Tag @aicrowd-bot on your submission if this happens to you and we’ll requeue it when the problem is fixed (requeues don’t count as additional submissions).
- More generally, if you made a Round 2 submission before today and you think it failed due to an evaluation bug, tag @aicrowd-bot and we will investigate/requeue it (if we haven’t already commented on it that it’ll be requeued).
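Until the submit() issue above is fixed, one defensive pattern on the client side is to guarantee that remote_client.submit() runs even if your episode loop raises. A minimal sketch, using a stub in place of the real FlatlandRemoteClient:

```python
class StubClient:
    """Stand-in for FlatlandRemoteClient, for illustration only."""

    def __init__(self):
        self.submitted = False

    def submit(self):
        # the real client reports the accumulated score to the evaluator
        self.submitted = True

def run_submission(client, episodes):
    """Run all episodes, but always call submit() so the score
    accumulated before a crash still reaches the leaderboard."""
    try:
        for run_episode in episodes:
            run_episode()  # your per-episode rollout; may raise
    finally:
        client.submit()
```

Using try/finally rather than swallowing the exception means the crash is still visible in your logs, while the points earned up to that episode are still reported.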