We are excited to announce that Round 2 is open for submissions!
This new round includes quite a few changes. Some features were already announced with the release of Flatland 2.0; the main new ones are listed below (a small construction sketch follows the list):
Agents start outside the environment and have to actively enter to be on time
Agents leave the environment when their target is reached
Networks are much sparser and fewer paths are possible for each agent
Stochastic events stop agents and cause disruptions in the traffic network
Agents travel at different speeds
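For orientation, here is a minimal sketch of how an environment with these features might be constructed using the Flatland 2.1-era API (sparse_rail_generator and sparse_schedule_generator; treat the exact parameter names and values below as assumptions to verify against your installed version):

```python
from flatland.envs.rail_env import RailEnv
from flatland.envs.rail_generators import sparse_rail_generator
from flatland.envs.schedule_generators import sparse_schedule_generator

# Stochastic events that stop agents (key names as in the 2.1-era examples)
stochastic_data = {'prop_malfunction': 0.3,  # fraction of agents that can malfunction
                   'malfunction_rate': 30,   # average number of steps between malfunctions
                   'min_duration': 3,        # minimal malfunction duration
                   'max_duration': 20}       # maximal malfunction duration

# Agents travel at different speeds: map of speed -> fraction of agents
speed_ration_map = {1.0: 0.25,        # fast passenger train
                    1.0 / 2.0: 0.25,  # fast freight train
                    1.0 / 3.0: 0.25,  # slow commuter train
                    1.0 / 4.0: 0.25}  # slow freight train

env = RailEnv(width=35, height=35,
              # sparse networks with few alternative paths per agent
              rail_generator=sparse_rail_generator(max_num_cities=3,
                                                   grid_mode=False,
                                                   max_rails_between_cities=2,
                                                   max_rails_in_city=3),
              schedule_generator=sparse_schedule_generator(speed_ration_map),
              number_of_agents=10,
              stochastic_data=stochastic_data)
obs, info = env.reset()  # in 2.1, reset also returns an info dict
```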
The baselines repository has been updated to incorporate these new changes and highlights how training can be implemented.
To make your submissions, head over to the updated starter kit.
There is also a new example file introducing the concepts of Flatland 2.1.
We are actively working to update all the documentation to reflect these changes, so keep checking back, or reach out to us if anything is unclear.
We wish you all lots of fun with this new challenging round.
Please don't forget to update your Flatland version to the newest release before submitting. Older versions of Flatland will lead to divergences between the client and the server.
We are currently working on a few more bug fixes and on performance- and stability-related issues with Flatland. These fixes will be made available as another patch on top of the current release, 2.1.6.
Much of the delay in reliably accepting submissions has been due to tuning the complexity of the test environments on which your submissions will be evaluated. Please expect a further update from us soon.
But please be assured that no key changes to the features or the environment interfaces will be introduced at this stage, so you can reliably continue experimenting with the Flatland library on your end before we start accepting submissions again.
And our apologies for not being more communicative about the updates and announcements related to the competition. Rest assured, the whole team is working really hard to ensure you all have a great experience taking part in the competition.
Thanks,
Mohanty
(on behalf of the organizing team)
I finally got a chance to look at the provided example and I have a few questions:
Can we use env.agents in our code to get the current agents' positions, directions and targets (like the example does)? This seems much easier than somehow extracting them from the observations (where they are encoded in some format).
Do we indeed have access to so much malfunction information (e.g. whether an agent will ever malfunction, and when the next malfunction will occur)? This information is definitely useful and I'd like to use it for making decisions, but I want to make sure we are indeed allowed to use it.
If an agent is already malfunctioning, malfunction_data['next_malfunction'] seems to indicate how many steps after the end of the current malfunction the next malfunction will occur - this is not obvious from its name (I initially expected it to always be relative to the current time step, but that's not the case); is this intended?
If an agent is malfunctioning from the start and doesn't enter the environment (i.e. it remains in the READY_TO_DEPART state), the malfunction duration is not decreased - is this intended? Given that the agent is penalized for every time step it remains outside the environment (before entering), it seems unexpected not to let its malfunction duration also 'expire' while the agent is still outside.
And thanks for all the work put into preparing Round 2. It indeed looks much more interesting than Round 1.
We are happy to hear that you like the improved version of Flatland. To answer your questions in short:
Yes, you can access all the information of the env and don't have to reconstruct it from the observation. This is intended, as we want participants to build their own observation builders and come up with clever ways to utilize the environment data.
Yes, we allow accessing this information. We did not want to make too big of a leap from earlier rounds.
Yes, next_malfunction is the duration after the agent has been repaired. Sorry for the unclear naming and documentation; we will work on this.
This was actually intended, but I do see the problem that arises from it. We are discussing how to address this issue so that it satisfies our requirements without introducing confusing behavior. We will update everyone if we change the behavior.
Hope these answers are helpful.
We are thankful for your in-depth feedback and we will update our documentation to better clarify the addressed details.
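To make the first two answers concrete, here is a minimal inspection sketch using the Flatland 2.1-era agent attributes (agent.status, agent.speed_data, agent.malfunction_data); the exact field names are assumptions to check against your installed version:

```python
def inspect_agents(env):
    """Print the per-agent state that is directly accessible via env.agents."""
    for handle, agent in enumerate(env.agents):
        print("agent", handle,
              "| status:", agent.status,      # e.g. READY_TO_DEPART, ACTIVE, DONE_REMOVED
              "| position:", agent.position,  # None while the agent is outside the environment
              "| direction:", agent.direction,
              "| target:", agent.target,
              "| speed:", agent.speed_data['speed'],
              "| position_fraction:", agent.speed_data['position_fraction'],
              "| malfunction:", agent.malfunction_data['malfunction'],            # remaining blocked steps
              "| next_malfunction:", agent.malfunction_data['next_malfunction'])  # steps after the repair
```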
I have one more question: Let's assume there is an agent with speed less than 1 that is in the middle of performing a move (e.g. the agent has speed 0.25 and its position fraction is currently 0.5), and a malfunction occurs for this agent at this moment. What will happen to the agent once the malfunction ends?
Will the agent continue the move it started before the malfunction occurred?
Or will the agent be 'reset' (for lack of a better word) and be able to start a new move as soon as the malfunction ends?
I was expecting case 1, but I encountered a case where the reported position_fraction is reset to 0 when a malfunction starts, and I don't know if it's just a reporting issue (i.e. the position_fraction is wrongly reported during malfunctions) or if it's intended.
You are right in assuming that case 1 should occur. If you experienced case 2, this would be a bug. It would be great if you could report it here: Issue Tracker
OK. Here's a concrete example I am encountering in a local test:
An agent with speed=0.333333 started moving at a previous time step. I am reading its data from env.agents and it says: position_fraction=0.333333 malfunction=0 next_malfunction=1
I call env.step(…). Obviously, this agent has no new action to perform because it's already involved in an ongoing move.
I read again the data from env.agents for this agent. It shows: position_fraction=0.333333 malfunction=10 next_malfunction=40
My expectation was that after this step the position_fraction should be 0.666666. Or am I just interpreting the next_malfunction value incorrectly? My interpretation is that as long as malfunction=0 and next_malfunction=1, that agent still has one more time step of 'useful' moving before being blocked by the malfunction (so the next env.step(…) should still do something useful for that agent; in other words, the malfunction begins at the end of the next env.step(…) call, i.e. after one more useful move). This seems not to be the case.
Everything seems to behave as expected in the other cases (malfunction >= 1, or malfunction=0 and (next_malfunction>=2 or next_malfunction=0)), meaning that the position_fractions are advanced correctly.
I simulated further until the agent's malfunction ends, and it seems that the agent 'exits' from the malfunction with the position_fraction that I was expecting it to have before the malfunction started (in this case: 0.666666). To give some concrete data for the same agent as before:
I read from env.agents the following data: position_fraction=0.333333 malfunction=1 next_malfunction=40
I call env.step(…)
I read from env.agents the following data: position_fraction=0.666666 malfunction=0 next_malfunction=40
So it seems that the move from position_fraction 0.333333 to 0.666666 is not 'lost', but rather delayed. I guess it's all caused by a different expectation of when malfunction is updated. From these examples, I gather that malfunction is updated at the beginning of the env.step(…) call, while to me it seems more natural to have it updated at the end of env.step(…), so that:
malfunction >= 1 means the agent is blocked for that many more env.step(…) calls (currently it doesn't mean that)
next_malfunction >= 1 means that there are that many env.step(…) calls left before the agent is blocked by the next malfunction (currently it doesn't mean that)
Is there any reason for the current behavior compared to the one I'm expecting? Of course, now that I've sort of reverse-engineered the issue, I can work around it, but it still seems a bit unnatural to me.
Yes, we do update the malfunction at the beginning of the step; see here.
I will update the issue accordingly and discuss within the team whether we want to change the malfunction behavior or just make the documentation clearer on this point.
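Given that the counters are updated at the beginning of env.step(…), here is a hedged sketch of the resulting interpretation (a hypothetical helper; the malfunction_data keys are the 2.1-era names and should be verified against your version):

```python
def will_advance_this_step(agent):
    """Hypothetical helper: predict whether the NEXT env.step(...) call can
    advance this agent, under the behavior described above (counters are
    decremented at the beginning of the step, not at its end)."""
    malfunction = agent.malfunction_data['malfunction']
    next_malfunction = agent.malfunction_data['next_malfunction']
    if malfunction > 0:
        # Still broken: the step will (at most) decrement the counter.
        return False
    if next_malfunction == 1:
        # The new malfunction starts at the beginning of the next step,
        # so the agent is blocked before it gets a chance to move.
        return False
    # next_malfunction == 0 (no scheduled malfunction) or >= 2: the agent
    # still gets one useful move in the next step.
    return True
```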
Hi. I downloaded the (small) set of tests mentioned in the starter kit and used them to test my solution with the setup from the starter kit (redis server + flatland evaluator + the sample run.py in which I integrated my solution). But it seems that the agents are not leaving the environment once they reach their destinations (I see their reported status is DONE instead of DONE_REMOVED). Do I need to set any extra parameters when creating the local/remote environments? Or are these arguments part of the test data, and it's just that the test data was generated without the option to have agents leave the environment?
What's the status for the official test cases? Are the agents leaving the environment (as mentioned in this thread) or not?
This seems odd. Are you using the latest version of Flatland? Since the latest version, we have the default set to DONE_REMOVED. This is also the behavior that we have for the submissions.
If it still does not work, please set remove_agents_at_target=True in the env constructor in both the client.py and service.py files.
But I think updating to Flatland version 2.1.8 should solve your problem.
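For completeness, a minimal sketch of the suggested constructor change (all arguments other than remove_agents_at_target are placeholders for whatever client.py and service.py already pass):

```python
from flatland.envs.rail_env import RailEnv

# Sketch: have agents removed from the grid once they reach their target,
# so their status ends as DONE_REMOVED instead of DONE.
env = RailEnv(width=30,
              height=30,
              number_of_agents=5,
              remove_agents_at_target=True)
```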