For my recent submissions, inference has been failing randomly, even though the validation step passes without a problem. @dipam, could you please take a look at submission 237937 and let me know what might have gone wrong? I don’t suspect a timeout, because my model’s runtime is around 900-1200 ms, and I’ve submitted it multiple times (all of which failed).
@hca97 Have you tried adding a default class and bbox with try/except blocks in the extract_predicted_mosquito_class and extract_predicted_mosquito_bbox functions? After adding these, I no longer get these failures. The errors probably don’t occur on the validation data because the model has already seen that data.
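For anyone who wants to try this, here is a minimal sketch of the fallback idea. Only the two function names come from the starter kit as mentioned above; the `with_default` helper, the `DEFAULT_CLASS` and `DEFAULT_BBOX` values, and the shape of `predictions` (a list of dicts with `label`, `score` and `bbox` keys) are assumptions made for illustration, so adapt them to your own code.

```python
import functools


def with_default(default):
    """Build a decorator that returns `default` if the wrapped call raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Log the problem instead of letting it abort the whole run.
                print(f"{func.__name__} failed ({exc!r}); using default")
                return default
        return wrapper
    return decorator


# Placeholder fallbacks (assumptions): replace with a valid label and a
# box in whatever format your submission is expected to return.
DEFAULT_CLASS = "placeholder-class"
DEFAULT_BBOX = [0, 0, 1, 1]


@with_default(DEFAULT_CLASS)
def extract_predicted_mosquito_class(predictions):
    # Assumed input: list of dicts with "label", "score" and "bbox" keys.
    best = max(predictions, key=lambda p: p["score"])
    return best["label"]


@with_default(DEFAULT_BBOX)
def extract_predicted_mosquito_bbox(predictions):
    best = max(predictions, key=lambda p: p["score"])
    return best["bbox"]
```

With this in place, an empty or malformed prediction (for example, no detection on an image) yields the default instead of an exception. Note that this only guards against exceptions; it does nothing about slow predictions.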
Thanks for the suggestion. Indeed, I didn’t have any error handling, but even after adding some, my submissions still fail.
@dipam could you have a look at the following submissions? Submission IDs:
Hi @dipam, can you have a quick look at this? We really are stuck since Mosquitoalert Validation runs through but Mosquitoalert Prediction does not, and we do not have any informative error logs.
The runtime on these submissions does indeed exceed 2 seconds for one instance. I believe these are your print statements?
Totla Time 2018.3565616607666 ms
Unfortunately, we can’t relax the timing criteria. If possible, try to reduce the time taken by your model.
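Since a single slow instance is enough to fail the run, it may help to log the worst case rather than the typical time. Below is a rough sketch of a per-image timer, assuming the limit is the 2 seconds mentioned above; `timed_predict`, `summarize` and `predict_fn` are hypothetical names, not part of the evaluator or the starter kit.

```python
import time

# Assumption: the per-prediction limit is the 2 seconds mentioned in this thread.
TIME_LIMIT_MS = 2000.0


def timed_predict(predict_fn, image, limit_ms=TIME_LIMIT_MS):
    """Run predict_fn(image) and return (result, elapsed_ms, over_limit)."""
    start = time.perf_counter()
    result = predict_fn(image)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms, elapsed_ms > limit_ms


def summarize(timings_ms, limit_ms=TIME_LIMIT_MS):
    """Return the slowest time and how many predictions exceeded the limit."""
    if not timings_ms:
        return {"max_ms": 0.0, "over_limit": 0}
    return {
        "max_ms": max(timings_ms),
        "over_limit": sum(1 for t in timings_ms if t > limit_ms),
    }
```

Collecting `elapsed_ms` for every image and printing `summarize(...)` at the end shows whether an average of around 1 s is hiding occasional outliers above the limit. Timings measured locally will not necessarily match the evaluation hardware.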
@fkemeth Can you give me the submission ids where you get the failure and I’ll check if it’s due to timeout or something else.
@dipam the submission codes are
#238089 #238088 #238030 #238029
In the validation phase, we have inference times of around 1s. So I assume the reason is that there are higher-res images in the test data, could that be?
I don’t think so because we resize the images before passing them to the model. @dipam after how many iterations do we get a timeout? Is it the first prediction step?
@hca97 No, it is not the first prediction. I see many print statements before this one, with varying time values.
@dipam Could you please check the submission with id 238815? It fails for no apparent reason. Everything seems normal in the [debug-logs] files, with no exceptions or timeouts.
@dipam can you also check submission #238838? It failed in the prediction stage without any errors.