I came across a notebook that used the validation dataset along with the validation_ground_truth file for early stopping while training its model. Are we allowed to use the validation dataset for training?
And since this might lead to overfitting, will our final scores (after the competition is over) be evaluated on a third dataset?
Yes, you can use the validation data for training. The final leaderboard will be based on the 40% of the hidden test set whose scores are not visible during the competition, so the chances of overfitting are quite low.
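For anyone wondering what that early-stopping pattern looks like in practice, here is a minimal sketch using XGBoost with synthetic stand-in data; in the actual competition you would load the provided training files and the validation set plus validation_ground_truth instead of the random arrays below.

```python
# Minimal sketch of validation-based early stopping (assumes XGBoost is installed).
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)

# Stand-in data: replace with the competition's training and validation sets.
X_train = rng.normal(size=(1000, 20))
y_train = (X_train[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
X_val = rng.normal(size=(300, 20))
y_val = (X_val[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)

dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)

params = {"objective": "binary:logistic", "eval_metric": "logloss"}

# Training stops once the validation log-loss has not improved for 20 rounds.
booster = xgb.train(
    params,
    dtrain,
    num_boost_round=500,
    evals=[(dval, "validation")],
    early_stopping_rounds=20,
    verbose_eval=False,
)

print("Best iteration:", booster.best_iteration)
```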
In other discussions, it was said that the final score is based on the current 60% plus the hidden 40%, and now you are saying it is based only on the hidden 40%!
So which one is correct?
Thanks