Thanks for setting up this competition! I’ve just gotten set up and it looks interesting. A few quick questions:
Are the testing and validation buildings in phases 2 & 3 from the same location and the same year as the buildings in the training set? If so, we have access to a full year of weather data for the test and validation sets as part of the training data. Can we use this information? Perhaps this is somewhat moot, as the provided predictions appear to be “perfect” (Weather Data "Predictions"), but a full year of weather data still contains more information than the predictions.
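For concreteness, here is a minimal sketch of what I mean, assuming the full-year weather file ships with the training data (the file name, path, and indexing are my guesses, not the actual schema):

import pandas as pd

# Full-year weather from the training data (assumed path and layout).
weather = pd.read_csv("data/weather.csv")

def perfect_forecast(time_step, horizon=24):
    # Look the "forecast" up directly in the full-year file instead of
    # using the provided prediction columns.
    return weather.iloc[time_step + 1 : time_step + 1 + horizon]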
Can I ask what the reasoning is for including the training dataset in the evaluation criteria for phases 2 & 3? In theory, one could write code to “recognize” which buildings belong to the training set and deploy a strategy optimized offline for those buildings. Even if this is not done explicitly in the code, it might happen implicitly through learned parameter weights. It seems like it would be simpler to exclude the training set from the evaluation.
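To make the concern concrete, here is a purely hypothetical sketch (the fingerprint table, digests, and policy names are all invented) of how a submission could recognize training buildings:

import hashlib

# Placeholder table: fingerprints of the training buildings, precomputed
# offline, mapped to policies tuned for those exact buildings.
KNOWN_TRAINING_BUILDINGS = {"<digest of building 1>": "policy_tuned_for_building_1"}

def fingerprint(first_observations):
    # Hash a building's opening observations to identify it exactly.
    raw = ",".join(f"{x:.6f}" for x in first_observations)
    return hashlib.sha256(raw.encode()).hexdigest()

def select_policy(first_observations):
    # Deploy the offline-optimized policy for a recognized training
    # building; fall back to a general policy for unseen buildings.
    return KNOWN_TRAINING_BUILDINGS.get(fingerprint(first_observations), "general_policy")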
I’ll just add that I agree: it’s not standard practice to evaluate on the training data. Overfitting to the first five buildings should be avoided, not encouraged.
@kingsley_nweye What I’m not clear about is how the scoring weights are being applied. On the Overview tab it says
The train and validation dataset scores will carry 40% and 60% weights, respectively in the Phase 2 score.
Is this the calculation?
score1 = ...  # score on the training dataset
score2 = ...  # score on the validation dataset
final_score = 0.4 * score1 + 0.6 * score2
If so, then I agree that overfitting the training data is a problem. However, if final_score is calculated in a way that rewards coordinating the buildings in the training and validation sets together, then I can see how there is still some value in including the training data in the evaluation. Naively overfitting the training data is probably not the best strategy in that case.
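For example, assuming the grid score is something like ramping computed on the summed district net load (toy numbers below, not the actual metric), what gets rewarded is coordination across all buildings, training and validation alike:

import numpy as np

# Toy example: two training buildings whose loads offset each other,
# plus one flat validation building.
train_loads = np.array([[1.0, 3.0, 1.0],
                        [3.0, 1.0, 3.0]])
valid_loads = np.array([[2.0, 2.0, 2.0]])

district = np.vstack([train_loads, valid_loads]).sum(axis=0)  # [6, 6, 6]
ramping = np.abs(np.diff(district)).sum()
print(ramping)  # 0.0: the buildings cancel each other's peaks at the district level

In that setup, a policy that overfits each training building in isolation could still score poorly on the district-level term.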
Thanks for your question @noam_finkelstein. The location, weather, and year remain the same across phases. The only thing that changes in each phase is the collection of building files (Building_n.csv), since new buildings are added.
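If you want to verify this yourself, a quick check (assuming a per-phase directory layout like the one below; adjust the paths to wherever each phase's data lives) is to compare digests of the weather files:

import hashlib
from pathlib import Path

phases = ["data/phase_1", "data/phase_2", "data/phase_3"]  # assumed layout
digests = {p: hashlib.md5((Path(p) / "weather.csv").read_bytes()).hexdigest() for p in phases}
print(digests)  # identical digests => the weather file is unchanged across phases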
Thanks for your question @mt1. Please take a look at this post regarding the addition of a coordination (grid) score and this post regarding the weighting of scores.