Hey, I’ve noticed submissions seem stuck again. It would be nice to get feedback soon, considering that there are only a few days left. Can someone look into this?
Is it normal for my submission status to still be ‘submitted’ after 12 hours?
Are submissions currently stuck? My submission has been waiting in queue for evaluation for over an hour (usually it’s a couple of minutes).
EDIT: the submission went through eventually.
EDIT 2: new submissions seem to be stuck again, this time for over two hours so far.
15 hours and the submissions are still waiting in queue for evaluation.
@shivam @mohanty - is there a server-side issue that’s causing the stuck submissions?
Hi @simon_mezgec,
We had an issue in the submission queue that caused submissions to get stuck.
We have manually cleaned up the ongoing submissions that got stuck and re-queued them (to be exact: 65632, 65262, 65404, 65411).
Please let us know in case any other submission ID is stuck for you.
Thanks @shivam!
My two submissions (65636 and 65637) got unstuck and finished successfully.
However, my new submission (65790) appears to also be stuck, so if you could get it unstuck as well, I would appreciate it.
Hi @simon_mezgec,
Sorry for the trouble. Submission 65790 is on its way to evaluation now too.
I will keep a close eye on new submissions to make sure this doesn’t happen again.
@shivam Submission 65948 seems to have failed but I don’t think it should have (similarly to my two submissions yesterday). Can you check it out?
Thanks!
Hi @simon_mezgec, your submission has been processed properly now, and I have made a post about the error here.
Fantastic - I figured it was some kind of system-wide error related to Docker. Thanks for sorting it out!
@shivam - encountered a new error (submission 66390) and I think it’s the server again.
By the way, sorry for pinging you here as well - didn’t know where you prefer it (here vs. GitLab).
Hi,
No worries. You can ping me at either place.
It isn’t a server-side issue this time.
The issue occurs when the `Dockerfile` tries to install the `mmdetection` package. I think it is due to a new release of a package it depends on (or something similar). I am trying to debug it on my side and will let you know as soon as I find a fix for your `Dockerfile`.
https://gitlab.aicrowd.com/simon_mezgec/food-recognition-challenge-starter-kit/snippets/20588#L1854
Hi @simon_mezgec,
The issue is fixed now and you should be able to make a submission. Please remember to pull the latest commit from the mmdetection starter kit.
Explanation:
This basically happened because `mmcv` had a new release, `0.5.2`, about 7 hours ago, and `mmdetection` depends on the latest release of `mmcv`. Because of this, the `mmdetection` installation started failing. I have pinned the `mmcv` version to `0.5.1` in the starter kit now. https://gitlab.aicrowd.com/nikhil_rayaprolu/food-pytorch-baseline/commit/84eadc1ca353b5741423e0e1ea9f8db5d4bfd49f
Following this, submissions using this starter kit will go through as usual.
Thanks for reporting the issue to us!
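For anyone maintaining their own dependency list, here is a minimal sketch of what pinning looks like in practice. It is only an illustration: the `pin_requirement` helper and the example requirement list are hypothetical and are not taken from the starter kit commit linked above.

```python
import re


def pin_requirement(lines, package, version):
    """Return requirement lines with `package` pinned to exactly `version`.

    If the package is not listed at all, the pin is appended at the end.
    """
    # Match the package name on its own line, optionally followed by a
    # version specifier such as ">=0.4" (hypothetical input format).
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*([<>=!~].*)?$", re.IGNORECASE
    )
    pinned = f"{package}=={version}"
    result, found = [], False
    for line in lines:
        if pattern.match(line):
            result.append(pinned)  # replace the loose requirement with the pin
            found = True
        else:
            result.append(line)
    if not found:
        result.append(pinned)
    return result


# Hypothetical requirement list, for illustration only.
requirements = ["torch", "mmcv", "mmdetection"]
print(pin_requirement(requirements, "mmcv", "0.5.1"))
```

With an exact `==` pin, a new upstream release no longer changes what gets installed when the image is rebuilt, which is the behaviour the fix above relies on.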
Ah, interesting - good catch!
Uploaded another submission (66451) and the image was built successfully without a hitch upon adding the `mmcv` version requirement like you suggested. Will re-upload the same model from submission 66390 later today.
Thanks a lot - really appreciate the quick fix for this!
@shivam - I think the submissions might be stuck again. My submission (67214) has been waiting in queue for evaluation for almost an hour now.
Evaluation for my submission is taking a long time. I had set debug=true. Can someone tell me if there is a problem with the inference script on my end? @shivam @nikhil_rayaprolu
My new submission (67274) is very slow as well - over one hour in the evaluation queue. There seems to be an issue with the submissions today.
Hi @simon_mezgec,
Your submission 67274 went through without any problem as far as I can see, while 67214 took longer because the existing VMs were already busy evaluating other submissions. We didn’t account for the surge in submissions just before the end of the Round, so I have increased the number of submissions evaluated in parallel (from 4 to 8), which should keep the queue clear.
I hope it helps.
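To give a rough feel for why going from 4 to 8 parallel evaluations helps during a surge, here is a small simulation sketch. It is not the actual evaluator logic: the `queue_waits` function, the first-come, first-served assumption, and the job durations are all hypothetical.

```python
import heapq


def queue_waits(durations, workers):
    """Return how long each job waits before starting, in submission order.

    Assumes all jobs arrive at time 0 and are served first-come, first-served.
    """
    # Min-heap of the times at which each worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        waits.append(start)
        heapq.heappush(free_at, start + duration)
    return waits


# Hypothetical surge: 16 submissions, each taking 30 minutes to evaluate.
jobs = [30.0] * 16
print(max(queue_waits(jobs, 4)))
print(max(queue_waits(jobs, 8)))
```

Under these made-up numbers, the longest wait in the queue drops from 90 minutes with 4 workers to 30 minutes with 8.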