We mentioned in the competition’s Overview that, as the competition progressed through its phases, the previous phase's Average Score would be weighted. Specifically:
- By Phase II, the leaderboard will reflect the ranking of participants' submissions based on an unseen 5/17 buildings validation dataset as well as the seen 5/17 buildings dataset. The train and validation dataset scores will carry 40% and 60% weights, respectively, in the Phase 2 score.
- Finally, in Phase III, participants' submissions will be evaluated on the 5/17 buildings training, 5/17 validation and remaining 7/17 test datasets. The train, validation and test dataset scores will carry 20%, 30% and 50% weights, respectively, in the Phase 3 score.
However, this weighting has not been applied yet.
How was average score weighted in Phase I?
In Phase I, the schema contained only the 5-building train dataset, so Average Score was calculated from a single simulation on those buildings, with no weighting.
How is average score weighted in Phase II?
According to the original description in the Overview page, Average Score in Phase II was to be calculated as a 40%/60% weighted sum of two simulation scores. This meant that 2 simulations were to be run for each submission: one using the 5-building train dataset schema and another using the 5-building train + 5-building validation dataset schema, with their Average Scores weighted 40%/60%. However, no weights are currently applied, and only 1 simulation, which uses the 5-building train + 5-building validation dataset schema, is being run. Hence, Average Score in Phase II is currently the unweighted Average Score from that single 10-building simulation.
This was an error by the organizers and was not intentional. However, to remain fair, we will not make any changes while Phase II is under way.
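As a minimal sketch of the difference, the snippet below compares the originally described Phase II calculation with the one currently in use. The function names and the example scores are hypothetical and chosen only for illustration; they are not taken from the competition's evaluation code.

```python
def phase_2_intended(train_score: float, train_val_score: float) -> float:
    """Originally described Phase II score (hypothetical helper).

    train_score: Average Score from the 5-building train simulation.
    train_val_score: Average Score from the 5-building train +
        5-building validation simulation.
    """
    return 0.4 * train_score + 0.6 * train_val_score


def phase_2_current(train_val_score: float) -> float:
    """Score currently shown in Phase II: one unweighted simulation."""
    return train_val_score


# Illustrative values only, not real leaderboard scores.
example_train = 0.80
example_train_val = 0.90

intended = phase_2_intended(example_train, example_train_val)
current = phase_2_current(example_train_val)
print(f"intended: {intended:.3f}, current: {current:.3f}")
```

With these made-up inputs the two calculations give different values, which is why switching mid-phase would change standings.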
When will average score weighting begin?
Average Score weighting will be applied from the beginning of Phase III. Please see the next section for more information on Phase III weighting.
How will average score be weighted in Phase III?
In Phase III, the weighting of Average Score will begin. There will be 2 leaderboards: a private and a public leaderboard.
Public leaderboard
The public leaderboard is what will be displayed here, and its Average Score will be calculated as a 40%/60% weighted sum of two simulation scores. The highlight here is that it will only include buildings from the train and validation datasets. 2 simulations will be run and their Average Scores will be weighted 40%/60%. The first simulation will use the 5-building train dataset schema and the second will use the 5-building validation dataset schema. The 5-building train dataset is excluded from the second simulation to avoid biasing the score toward the train dataset, which is public and can be overfitted to.
Private leaderboard
The private leaderboard will be visible only to the organizers and will be used to decide the competition’s winners. It will only be made public at the time of announcing the winners. The Average Score in the private leaderboard will be calculated as a 20%/30%/50% weighted sum of three simulation scores. The highlight here is that it will include buildings from the train, validation and test datasets. 3 simulations will be run and their Average Scores weighted 20%/30%/50%. The first simulation will use the 5-building train dataset schema, the second will use the 5-building validation dataset schema and the third will use the 7-building test dataset schema.
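The two Phase III calculations described above can be sketched as one weighted sum with different weight sets. This is only an illustration: `weighted_average_score`, the dictionary keys and the example scores are assumptions made here, not the organizers' evaluation code.

```python
from typing import Mapping

# Weights as stated for Phase III; the key names are hypothetical labels.
PUBLIC_WEIGHTS = {"train": 0.4, "validation": 0.6}
PRIVATE_WEIGHTS = {"train": 0.2, "validation": 0.3, "test": 0.5}


def weighted_average_score(
    scores: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Combine per-dataset Average Scores using the given weights."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Only datasets named in `weights` contribute to the result.
    return sum(weights[name] * scores[name] for name in weights)


# Illustrative per-simulation scores, not real results.
example_scores = {"train": 0.80, "validation": 0.90, "test": 1.00}

public_score = weighted_average_score(example_scores, PUBLIC_WEIGHTS)
private_score = weighted_average_score(example_scores, PRIVATE_WEIGHTS)
print(f"public: {public_score:.3f}, private: {private_score:.3f}")
```

Because the private leaderboard gives half of the weight to the unseen test simulation, a submission's public and private scores can differ even when its train and validation scores are identical.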