Occurrence of "Evaluation timed out 😢" in validation phases

Since timeouts occur frequently during validation, I would like to ask the following two questions.

The first question is whether different tracks have different validation timeouts.

The following three are submissions of the same code to Tracks 1, 2, and 3. The Track 1 and Track 2 submissions ran to completion. The Track 3 submission, on the other hand, fails with a validation timeout error every time, even after multiple attempts.

Is this because different tracks have different time limits allowed for validation?

The second question is whether the timeouts during validation occur during model loading, or because inference on the sample data is too slow.

If it occurs during model loading, I don’t understand why it succeeds or fails depending on the track.

If the timeout occurs because inference is too slow, we would like to know how many seconds of processing per sample are enough to trigger the error.

  1. No. All tracks have the same validation timeouts.

  2. In validation, we basically do the following:

  • (S1) Download the model from Git LFS (90 MB/s).
  • (S2) Prepare the GPU server for your submission.
  • (S3) Run the validation set.

Of these three steps, S3 takes only a few minutes, so it won’t be the bottleneck. As far as we can see, there may be two main reasons for ‘validation evaluation timeout’.

  1. Your repo is too large. If your repo is larger than 100 GB, it will often time out during model downloading.
  2. Our server fails to communicate with agents (i.e. your submission). This has been happening a lot recently, and the AIcrowd folks are looking into it.
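
To get a rough feel for the first reason, here is a minimal sketch that estimates how long step S1 would take for a given repo size. It assumes the 90 MB/s figure above is sustained for the whole download and that 1 GB = 1000 MB. The function name `estimated_download_seconds` is made up for illustration, and the actual timeout value is not stated in this thread, so this only gives the download time, not a pass/fail answer.

```python
def estimated_download_seconds(repo_size_gb: float,
                               throughput_mb_per_s: float = 90.0) -> float:
    """Estimate the download time in seconds for a repo of the given size.

    Assumptions (not confirmed by the organizers):
    - the throughput is sustained for the entire download,
    - 1 GB is counted as 1000 MB.
    """
    if repo_size_gb < 0:
        raise ValueError("repo_size_gb must be non-negative")
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    return repo_size_gb * 1000.0 / throughput_mb_per_s


if __name__ == "__main__":
    # Compare a few hypothetical repo sizes at the default 90 MB/s.
    for size_gb in (10, 50, 100):
        seconds = estimated_download_seconds(size_gb)
        print(f"{size_gb} GB -> {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under these assumptions a 100 GB repo needs about 18.5 minutes for the download alone, so trimming unused checkpoints from the repo is probably the easiest thing to try first.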

Thank you for the detailed explanation! I now have a better understanding of the process flow.

I’ve been trying to follow the process you described, and now I’m getting successful submissions that I couldn’t get through before.