#210059 failed with “Evaluation timed out”, but the “inference speed” was 0.926x during 2/2 of the validation dataset and 0.913x during 9/27 of the test dataset. The minimal allowed inference speed is 0.637x, right?
After I submitted #210079, my #210078 got stuck.
@dipam, could you please look into this?
It happened to me as well: it timed out after 21/27 at 4.028x speed, for the exact same model as my previous successful submissions. But after waiting a few hours, I submitted the same model again and it was successful.
Today my new submissions were evaluated successfully, but unexpected failures waste attempts and could hinder participants on the last day of the challenge.
I think a separate counter for unsuccessful submissions should be enabled.
February 16, 2023, 4:57am
I’ll investigate it. Can I please have the respective submission IDs?
Take a look at failed #210059 and #210078 (there is no need to restart them). Failed #210078 and successful #210117 are identical.
#209954. No need to restart it either, but it would be nice to know what happened.
I seem to be having a similar issue. Submission #210149 (music demixing leaderboard C) has been stuck at demixing 3/27 for 2 hours now. I also used up a submission, since debug was set to false in the config.
My submission also failed after 2.5 hrs despite having an inference speed of 2.39x… It previously succeeded twice in a row; the only difference was that debug was set to false instead of true in the config.
#210197 failed at 3/27, speed 0.769x
February 17, 2023, 6:34pm
@alina_porechina, unfortunately I haven’t found what the issue is on our side. I’m looking into it and it should be resolved soon.
Can you let me know whether you expect your model’s speed to be consistent for every song, or whether it can vary? The evaluator checks that every prediction is above the speed constraint.
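For context, the per-song check described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual evaluator code: the function name, the field names, and the exact failure messages are assumptions; the 0.637x threshold is the minimal inference speed quoted earlier in this thread.

```python
# Hypothetical sketch of a per-song speed check plus an overall timeout,
# as described in this thread. Names and messages are illustrative only.

MIN_SPEED = 0.637  # minimal allowed inference speed (multiple of real time)


def check_submission(per_song_speeds, timeout_seconds, elapsed_seconds):
    """Fail the run if the whole evaluation exceeded its timeout, or if
    any single song was demixed slower than the speed constraint."""
    if elapsed_seconds > timeout_seconds:
        return "Evaluation timed out"
    for i, speed in enumerate(per_song_speeds, start=1):
        if speed < MIN_SPEED:
            return f"Failed: song {i} ran at {speed:.3f}x, below {MIN_SPEED}x"
    return "Success"


# A run like #210059 (0.926x, 0.913x) passes the speed check itself,
# so its failure would have to come from the overall timeout instead.
print(check_submission([0.926, 0.913], timeout_seconds=3600, elapsed_seconds=900))
```

Under this model, a submission hovering just above the threshold (like #210197 at 0.769x) could fail if even one song dips below it, while a comfortably fast model fails only via the global timeout.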
My understanding is that the speed of my model is constant.
#210197 (2/2 - 0.771x, 3/27 - 0.769x) may have failed due to low speed.
It is less likely that #210059 failed due to low speed (2/2 - 0.926x, 9/27 - 0.913x).
#210078 and #210117 are identical. #210078 returned “Evaluation timed out” during the validation phase. #210117 was evaluated quickly and successfully (2/2 - 3.369x, 27/27 - 3.479x).
I think it’s better to enable a second counter than to hunt for the causes of rare crashes.
Today I successfully resubmitted the code from failed #210197 (2/2 - 0.771x, 3/27 - 0.769x) as #210246 (2/2 - 0.757x, 27/27 - 0.749x).
February 18, 2023, 7:48am
I’ve made some changes to how the cloud instances are provisioned for inference; I think the failures shouldn’t occur now. However, if you do observe one again, please let me know.
After demixing I got the message: “Scoring failed. Please contact the admins.”
After successful demixing (2/2 - 1.73x, 27/27 - 1.749x) I got the message: “Evaluation timed out”
Demixing got stuck (2/2 - 0.689x, 6/27 - 0.692x), then I got “Evaluation timed out”
Today 3/5 of my submissions failed unexpectedly.
February 22, 2023, 3:20pm
I’ve rerun the submissions after adding an extra buffer to the timeout. They’re scored now.