Hi, if training does not finish within 2 hours, is the submission considered failed? Or will the evaluation use the last checkpoint saved within those 2 hours?
Right now, a submission is considered failed once it hits the 2-hour timeout.
But I think it is a fair request to be able to use the last checkpoint in many scenarios. Let us check with the team and get back to you with a decision.
Thanks for the quick reply! I added the time limit to the stop conditions and it works now, e.g., time_total_s: 7200
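For anyone who finds this thread later, here is a minimal sketch of how a time-based stop condition like this behaves. It assumes the stop conditions follow the Ray Tune convention, where a run ends as soon as any listed metric reaches or exceeds its threshold. Only time_total_s: 7200 comes from the post above; the timesteps_total entry and the should_stop helper are hypothetical and only illustrate the idea, they are not the competition's actual code.

```python
# Stop conditions in the shape of a Ray Tune-style `stop` mapping (assumed).
# Only "time_total_s": 7200 is from the thread; "timesteps_total" is an
# illustrative extra condition.
STOP_CONDITIONS = {
    "time_total_s": 7200,          # wall-clock training time, in seconds
    "timesteps_total": 8_000_000,  # hypothetical environment-step budget
}


def should_stop(result, stop=STOP_CONDITIONS):
    """Return True once any reported metric reaches or exceeds its threshold.

    `result` stands in for the per-iteration metrics dict a trainer reports.
    Metrics missing from `result` are simply ignored.
    """
    return any(
        key in result and result[key] >= limit
        for key, limit in stop.items()
    )


# Halfway through the time budget: keep training.
print(should_stop({"time_total_s": 3600.0, "timesteps_total": 1_000_000}))
# Past the 2-hour mark: stop, even though the step budget is not used up.
print(should_stop({"time_total_s": 7201.5, "timesteps_total": 2_500_000}))
```

Because the check only runs when a training iteration reports results, a run can overshoot the threshold by up to one iteration, so a limit slightly below the hard timeout may be safer if iterations are long.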