Observations for Flatland

#1

One of the main difficulties in this multi-agent problem is defining a suitable observation that lets the algorithms scale well and generalize across many problem instances.

We have provided you with three basic observations: Global, Local Grid and Tree Observation. We encourage you to build on these or invent novel observations that help solve this problem.

This thread is for discussing the different aspects, benefits and drawbacks of the provided observations, so that we can collaboratively come up with better solutions.

Here are the base observations provided by us: Observation builders

#2

Feel free to check out our baseline training code to get an understanding of how to use the different observations: Baselines

#3

Asked by Farahd:

I need some guidance from you regarding the observation part.

I’m trying to understand how the observation is constructed by “flatland.observation.get(handle)”, but I have difficulty understanding it from the comments. I’m not sure whether more details or examples are available elsewhere.

The comments need updating to reflect the change from 5 features to 8 features.

Moreover, I think a little clarification of the terminology would be really helpful for readers and challenge participants.

For example, #2: “distance to a target of another agent is detected between the previous node and the current one.” Is the calculated value the distance to another agent’s target? What exactly does “previous node” mean?

Is the current node where the agent is located, or is it the node that we are getting the features for?

#4

Thank you for your feedback. I will try to clarify the comments on the observation builder to make them easier to understand.

The distance is calculated from the current agent’s position. Thus, it tells the agent how far the observed target of the other agent is from the observing agent’s current location.

The current node is the one being constructed in the tree observation. It is the next branching point along the agent’s path. I will update the comments soon and also implement an example of how to use it.
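
To make the two points above concrete, here is a minimal Python sketch. It does not use the Flatland API: `TreeNode`, `explore_branch` and both feature names are hypothetical, and it assumes that only the first (closest) target of another agent seen on a branch is recorded. It shows the "current node" as the branching point reached at the end of a walked segment, with the distance counted in cells from the observing agent's position rather than from the previous node.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Set, Tuple

Cell = Tuple[int, int]


@dataclass
class TreeNode:
    # Hypothetical feature names, not the ones used in the Flatland code.
    dist_to_other_target: float  # cells from the observing agent, inf if none seen
    dist_to_node: int            # cells from the observing agent to this node


def explore_branch(path_cells: Iterable[Cell],
                   other_targets: Set[Cell],
                   cells_travelled: int = 0) -> TreeNode:
    """Walk the cells between the previous node (exclusive) and the
    current node (inclusive) and build the current node's features.

    cells_travelled is the distance already covered from the observing
    agent's position to the previous node (0 at the root of the tree).
    """
    dist_to_other_target = math.inf
    for cell in path_cells:
        cells_travelled += 1
        # Assumption: keep only the first target seen on this segment.
        if cell in other_targets and math.isinf(dist_to_other_target):
            dist_to_other_target = cells_travelled
    return TreeNode(dist_to_other_target, cells_travelled)


# Agent stands at (0, 0); the first branching point is at (0, 3).
first = explore_branch([(0, 1), (0, 2), (0, 3)], other_targets={(0, 2)})

# Next segment starts at the previous node and ends at (0, 5).
second = explore_branch([(0, 4), (0, 5)], other_targets={(0, 5)},
                        cells_travelled=first.dist_to_node)
```

In this toy case `second.dist_to_other_target` is 5, not 2: the value is accumulated from the agent's own cell, even though the target was detected on the segment between the previous node and the current one.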

I will also make the baselines repo publicly available so you can see how to use the observation in a training setup.