Computation budget

Hi, is there some kind of computational budget for submissions? Or can we take as long as we want to compute the actions for each agent at each step / at each episode?

Hi @Lbliek

Thank you for your question. Please consult the FAQ section for more information about this.

Best regards,


Thank you, I missed that part apparently.

Hello @mlerik ,

What hardware are you using for submission evaluation (CPU cores, max allowed RAM etc.)?

I just noticed that the FAQ says that the attribute “next_malfunction” will be removed, “as it serves no purpose anymore”. It’s sad to make such changes when some solutions may be based on having this attribute present. It actually provides some useful information, allowing the agent to know exactly when its next malfunction will occur.

I also see an upper limit of 250 for the number of agents. In a separate thread (a while ago), this limit was mentioned to be 200. Which upper limit is correct?

Also, what’s the currently recommended way to generate local tests which resemble the ones used for scoring our submissions (in terms of parameter distributions)? For Round 1 I was able to use the baselines repository, but parts of it haven’t been updated in a long time (and, in particular, I’m not sure if anything from there generates any kind of tests with stochastic malfunction data).


Hi @mugurelionut

We are sorry for the inconvenience caused by the removal of next_malfunction. This was a leftover from early implementations of malfunctions and was mainly introduced to avoid divergence of environments on the service side from the client side. In the old implementation this value could also be used for cheating, as you could omit any agent with a next_malfunction value and only enter working agents into the environment.

In order to generate more realistic problems, where malfunctions arise stochastically and you cannot predict them, we removed the next_malfunction parameter altogether. We still allow the malfunction duration to be read; this is also realistic, as we usually have malfunction duration predictions.

Sorry for the confusion on the number of agents. Currently the upper limit is set to 200. This should remain the same for the whole challenge and only be updated for future challenges.

You can use the envs provided as explained in the starter kit for testing. I will also update the baselines repository to make use of the new parameters and put a good parameter set in the FAQ. Hope this helps.

Best regards,

Regarding cheating: Can’t malfunction_rate be used for the same purpose? It seems to be set to 0 for agents who never suffer any malfunction (so a plausible strategy, though not necessarily the one maximizing the fraction of done agents, would be to just enter these agents into the environment). Or will this parameter also go away? Or will it have a different meaning so that it’s non-zero also for agents who never suffer a malfunction?

Anyway, can I assume that by updating to the latest Flatland version (I am still using 2.1.8) I will see the latest changes? (i.e. at least I will stop getting the next_malfunction parameter).

Are the submissions made to Round 2 so far being reevaluated? I am guessing it’s possible that some of them relied on the presence of next_malfunction, so they should now stop working.

And maybe one last question about the malfunction duration. Can we assume that malfunctions are disjoint? (meaning that once the malfunction value is non-zero, the next malfunction can start only after the current malfunction ends)

Hi @mugurelionut

Your observation about the malfunction rate was correct. However, we introduced a different method to generate malfunctions from malfunction_generators. Thus, just like next_malfunction, malfunction_rate is no longer used for malfunctions.

In this new version, any agent can break so there is no point in leaving agents behind anymore.

For fairness, all submissions in the leaderboard will be re-evaluated with these changes. This is also the reason why some variables are still present: to keep code from crashing.
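Since some fields are kept only so that older code does not crash, a controller can avoid depending on them. Below is a minimal sketch, assuming the per-agent malfunction information is available as a plain dict whose `malfunction` key holds the remaining malfunction duration. The key names and the helper `read_malfunction_state` are assumptions for illustration, not a confirmed Flatland API.

```python
def read_malfunction_state(malfunction_data):
    """Return only the malfunction fields that are still meaningful.

    `malfunction_data` is assumed to be a plain dict (hypothetical layout).
    Deprecated keys such as "next_malfunction" or "malfunction_rate" are
    ignored on purpose, whether or not they are present.
    """
    # Fall back to 0 (not broken) if the key is missing.
    remaining = malfunction_data.get("malfunction", 0)
    return {
        "is_broken": remaining > 0,
        "remaining_steps": remaining,
    }
```

Written this way, the code behaves the same whether the deprecated fields are present, set to dummy values, or removed entirely.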

Yes, when you update to the newest version of Flatland the new malfunction generators are in use. The FAQ has also been updated to include these changes, as well as to explain the range of parameters and how to generate test files locally.

Your last assumption is wrong. They are not disjoint: the rate of malfunctions is low, but it can happen that multiple malfunctions occur at the same time. The malfunctions are independent Poisson processes.
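To get a feel for what independent Poisson processes imply, here is a small stand-alone sketch. It does not use Flatland; the function names, the uniform duration range and the reading that a new malfunction may start while an earlier one on the same agent is still running are assumptions made for illustration.

```python
import random

def sample_malfunctions(rate, min_duration, max_duration, horizon, rng):
    """Sample (start_step, duration) pairs for one agent.

    Start times follow a Poisson process with `rate` events per step:
    gaps between consecutive starts are exponentially distributed, and a
    new event may begin while an earlier one is still running.
    """
    events = []
    t = rng.expovariate(rate)
    while t < horizon:
        events.append((int(t), rng.randint(min_duration, max_duration)))
        t += rng.expovariate(rate)
    return events

def count_overlaps(events):
    """Count events that start before all earlier events have ended."""
    overlaps = 0
    busy_until = -1
    for start, duration in events:  # events are sorted by start time
        if start < busy_until:
            overlaps += 1
        busy_until = max(busy_until, start + duration)
    return overlaps

def simulate_agents(num_agents, rate, min_duration, max_duration, horizon, seed=0):
    """Draw one independent malfunction process per agent."""
    rng = random.Random(seed)
    return [
        sample_malfunctions(rate, min_duration, max_duration, horizon, rng)
        for _ in range(num_agents)
    ]
```

With a low rate, overlaps are rare but not impossible, so a planner should not assume that a malfunction counter reaching zero rules out an immediate new malfunction.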

We are aware that this is a challenging task to solve, but that is unfortunately the reality of railway traffic, where such problems have to be solved in real time and where we can’t foresee when the next disturbance will appear or how many we will have at the same time. Thus the objective is to keep traffic as stable as possible even in the presence of disturbances.

Hope this clarifies your issues.

Best regards,


There seems to be another change regarding malfunctions. In the previous Flatland version, the 1st malfunction only started once the agent entered the environment (otherwise the malfunction duration was not updated). This seems to not be the case anymore (meaning the malfunction duration, as well as new malfunctions, are updated also when the agent is still outside the environment). This also makes a big difference in terms of behavior.

Hi @mugurelionut

Thank you for pointing this out to us. We are sorry for the confusion that we caused and for the lack of clear communication about the changes to malfunctions.

I will try to shed some light on our thought process to explain the current malfunction behavior.

Agents malfunctioning before they enter the environment is equivalent to delayed departures of trains, which are quite common in real life, and thus we need to model them in Flatland as well. Updating the malfunction variables before the trains enter the environment is done to simplify the task. If we were to update the malfunction data only when the train enters the environment, there would be no room for alternative solutions. By giving the controller access to delayed departures we hoped to allow for more flexibility.

In the old malfunction model this was not necessary as we already gave information about when the next malfunction would occur, and thus planning was easier in advance of the events.

This environment is growing with important input from the community such as yours. We apologize for the inconvenience that some of our changes cause along the way. But we truly believe that with all your great feedback and effort we have come quite far with Flatland.

We hope you all enjoy this challenging task as much as we do and help us develop the necessary tools to make public transport even better in the future.

Best regards,

The Flatland Team

Is the 8-hour limit enforced? My submission 7 took 42 hours.

Do I understand correctly that we have roughly 2.4 minutes (= 8 h / 200) per test?

Hi @alarih,

We were lenient with the 8-hour limit due to earlier performance issues in the flatland-rl library. The limit has been re-enforced since 14-Nov (yesterday).
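For anyone budgeting compute time: dividing the 8-hour limit evenly over 200 tests (the test count is taken from the question above and is an assumption) gives 8 × 3600 / 200 = 144 seconds, i.e. 2.4 minutes per test. The sketch below computes that figure and adds a small wall-clock guard; `Deadline` and `per_test_budget_seconds` are hypothetical helpers, not part of flatland-rl.

```python
import time

def per_test_budget_seconds(total_hours, num_tests):
    """Even split of the total evaluation time across all tests, in seconds."""
    return total_hours * 3600 / num_tests

class Deadline:
    """Tracks whether a wall-clock budget has been used up."""

    def __init__(self, budget_seconds, clock=time.monotonic):
        # `clock` is injectable so the class can be tested without waiting.
        self._clock = clock
        self._end = clock() + budget_seconds

    def remaining(self):
        """Seconds left before the budget runs out (never negative)."""
        return max(0.0, self._end - self._clock())

    def expired(self):
        return self.remaining() == 0.0
```

A controller could create one `Deadline` per episode, using somewhat less than the full per-test budget as a safety margin, and fall back to a cheap default action once `expired()` returns true.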