Hi everyone, I would like to know how the mean reward and the mean normalized rewards are calculated for the evaluations.
You can check out the information in the flatland-rl documentation here: https://flatlandrl-docs.aicrowd.com/09_faq.html#how-is-the-score-of-a-submission-computed
The scores of your submission are computed as follows:
- Mean number of agents done, in other words how many agents reached their target in time.
- Mean reward is simply the mean of the cumulative reward.
- If multiple participants have the same number of done agents, we compute a “normalized” reward as follows:
    normalized_reward = cumulative_reward / (self.env._max_episode_steps + self.env.get_num_agents())
The mean number of agents done is the primary score value; only when it is tied do we use the “normalized” reward to determine the position on the leaderboard.
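
To make the arithmetic concrete, here is a rough sketch of how these three numbers and the tie-break could be computed. It is not the evaluator's actual code: `EpisodeResult`, `summarize` and `rank` are hypothetical names made up for illustration, the averaging is assumed to run over evaluation episodes, and a higher normalized reward is assumed to rank better. Only the normalization formula itself is taken from the FAQ quoted above.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    """Hypothetical container for one evaluation episode (not a Flatland class)."""
    cumulative_reward: float   # sum of all rewards collected in the episode
    max_episode_steps: int     # stands in for env._max_episode_steps
    num_agents: int            # stands in for env.get_num_agents()
    agents_done: int           # agents that reached their target in time


def normalized_reward(ep: EpisodeResult) -> float:
    # Formula as quoted from the FAQ.
    return ep.cumulative_reward / (ep.max_episode_steps + ep.num_agents)


def summarize(episodes: list[EpisodeResult]) -> dict[str, float]:
    # Assumption: each score is a plain average over the episodes.
    n = len(episodes)
    return {
        "mean_done": sum(e.agents_done for e in episodes) / n,
        "mean_reward": sum(e.cumulative_reward for e in episodes) / n,
        "mean_normalized_reward": sum(normalized_reward(e) for e in episodes) / n,
    }


def rank(submissions: dict[str, dict[str, float]]) -> list[str]:
    # Primary key: mean agents done. The normalized reward only matters on ties.
    # Assumption: a higher (less negative) normalized reward is better.
    ordered = sorted(
        submissions.items(),
        key=lambda kv: (kv[1]["mean_done"], kv[1]["mean_normalized_reward"]),
        reverse=True,
    )
    return [name for name, _ in ordered]


if __name__ == "__main__":
    episodes = [
        EpisodeResult(cumulative_reward=-100.0, max_episode_steps=190, num_agents=10, agents_done=8),
        EpisodeResult(cumulative_reward=-50.0, max_episode_steps=190, num_agents=10, agents_done=10),
    ]
    print(summarize(episodes))
```

Sorting on the tuple `(mean_done, mean_normalized_reward)` means the second value is only ever compared when the first one is equal, which matches the tie-break described above.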