I came across a notebook that used the validation dataset along with the validation_ground_truth file for early stopping while training its model. Are we allowed to use the validation dataset for training?
And since this might lead to overfitting, will our final scores (after the competition is over) be evaluated on a third dataset?
Yes, you can use the validation data for training. The final leaderboard will be based on the 40% of the hidden test set whose scores are not visible during the competition, so the chances of overfitting are quite low.
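For anyone wondering what that early-stopping pattern looks like in practice, here is a minimal sketch using XGBoost with synthetic stand-in data; in the actual competition you would load the provided training files and the validation set plus validation_ground_truth instead of the random arrays below.

```python
# Minimal sketch of validation-based early stopping (assumes XGBoost is installed).
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)

# Stand-in data: replace with the competition's training and validation sets.
X_train = rng.normal(size=(1000, 20))
y_train = (X_train[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
X_val = rng.normal(size=(300, 20))
y_val = (X_val[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)

dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)

params = {"objective": "binary:logistic", "eval_metric": "logloss"}

# Training stops once the validation log-loss has not improved for 20 rounds.
booster = xgb.train(
    params,
    dtrain,
    num_boost_round=500,
    evals=[(dval, "validation")],
    early_stopping_rounds=20,
    verbose_eval=False,
)

print("Best iteration:", booster.best_iteration)
```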
In other discussions, it was said that the final score is based on the current 60% plus the hidden 40%, and now you are saying it is based only on the hidden 40%!
So which one is correct?
Thanks