Submission failed during Inference

For my recent submissions, inference has been failing at random, although the validation step passes without a problem. @dipam, could you please take a look at submission 237937 and let me know what might have gone wrong? I don’t suspect a timeout, because my model’s runtime is around 900-1200 ms, and I’ve submitted it multiple times (all of which failed).

@hca97 Have you tried returning a default class and bbox from try/except blocks in the extract_predicted_mosquito_class and extract_predicted_mosquito_bbox functions? After adding these, I no longer get these failures. The errors probably don’t show up on the validation data because the model has already seen that data.
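A minimal sketch of that fallback idea, not the starter kit's actual code. The two function names come from the post above; their signatures, the shape of `predictions`, and the values of `DEFAULT_CLASS` and `DEFAULT_BBOX` are assumptions made up for illustration, so substitute whatever is valid for your own pipeline.

```python
import functools

# Hypothetical fallback values; replace with ones valid for your label set.
DEFAULT_CLASS = "default-class"
DEFAULT_BBOX = (0, 0, 1, 1)


def with_default(default):
    """Build a decorator that returns `default` if the wrapped call raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Log the failure so it is visible in the debug logs.
                print(f"{func.__name__} failed ({exc!r}); using default")
                return default
        return wrapper
    return decorator


@with_default(DEFAULT_CLASS)
def extract_predicted_mosquito_class(predictions):
    # Assumed input: a list of dicts with "label", "score" and "bbox" keys.
    best = max(predictions, key=lambda p: p["score"])
    return best["label"]


@with_default(DEFAULT_BBOX)
def extract_predicted_mosquito_bbox(predictions):
    best = max(predictions, key=lambda p: p["score"])
    return tuple(best["bbox"])
```

With this in place, an image on which the model detects nothing (an empty `predictions` list here) yields the default values instead of an exception. Note that this only guards against exceptions; it does nothing about slow predictions.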

Thanks for the suggestion. Indeed, I didn’t have any error handling, but after adding some, my submissions still fail.

@dipam could you have a look at the following submissions? Submission IDs: 238030, 238029

Hi @dipam, can you have a quick look at this? We really are stuck since Mosquitoalert Validation runs through but Mosquitoalert Prediction does not, and we do not have any informative error logs.

@hca97

The time taken on these submissions does indeed exceed 2 seconds for one instance. I believe this line comes from your print statements?

Totla Time  2018.3565616607666 ms

Unfortunately, we can’t relax the timing criteria. If possible, try to reduce the time taken by your model.
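Since a single slow instance is enough to fail the run, it may help to track the worst case locally rather than the average. Here is a rough sketch using only the standard library; `LatencyTracker` and its methods are hypothetical names, and the 2000 ms limit is taken from the 2 seconds mentioned above.

```python
import time

TIME_LIMIT_MS = 2000.0  # per-instance limit mentioned in this thread


class LatencyTracker:
    """Record per-call wall-clock time and report calls over the limit."""

    def __init__(self, limit_ms=TIME_LIMIT_MS):
        self.limit_ms = limit_ms
        self.samples_ms = []

    def timed(self, func, *args, **kwargs):
        # Run func, store its duration in milliseconds, return its result.
        start = time.perf_counter()
        result = func(*args, **kwargs)
        self.samples_ms.append((time.perf_counter() - start) * 1000.0)
        return result

    def worst_ms(self):
        # Slowest call seen so far (0.0 if nothing was timed yet).
        return max(self.samples_ms, default=0.0)

    def over_limit(self):
        # Indices of the calls that exceeded the limit.
        return [i for i, ms in enumerate(self.samples_ms) if ms > self.limit_ms]
```

Wrapping each prediction as `tracker.timed(predict, image)` (where `predict` stands for your own inference function) and printing `tracker.worst_ms()` at the end shows how close the slowest image comes to the limit. Timings on your own machine will differ from the evaluation hardware, so leave some headroom.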

@fkemeth Can you give me the submission IDs where you get the failure? I’ll check whether it’s due to a timeout or something else.

@dipam the submission codes are

#238089 #238088 #238030 #238029

In the validation phase, we see inference times of around 1 s. So I assume the reason is that the test data contains higher-resolution images. Could that be?

I don’t think so, because we resize the images before passing them to the model. @dipam after how many iterations do we get a timeout? Is it at the first prediction step?

@hca97 No, it is not the first prediction. I see many print statements before this one, with varying time values.

@dipam Could you please check the submission with ID 238815? It fails for no apparent reason. Everything seems normal in the [debug-logs] files, with no exceptions or timeouts.

@dipam can you also check submission #238838? It failed in the prediction stage without any errors.

@saidinesh_pola @gavinwangshuo, both of these failed due to timeout.