Hi @animath3099, can you share the link to the issue page? We are looking into it.
AIcrowd Submission Failed (#1) · Issues · wac81 / Product Substitute Identification Starter Kit · GitLab
Here it is. Thanks in advance.
Hi @animath3099, the issue has been resolved. You can try to make a submission now.
Note: Please remove debug: true from your aicrowd.json. That setting is what causes the "no slots remaining" error.
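As a rough sketch of that change, assuming aicrowd.json sits in the repository root and debug is a top-level key, something like the following would drop the flag and leave everything else untouched. The helper names strip_debug and rewrite_config are made up for illustration. Editing the file by hand works just as well.

```python
import json
from pathlib import Path


def strip_debug(config: dict) -> dict:
    """Return a copy of the config without the top-level "debug" key."""
    return {key: value for key, value in config.items() if key != "debug"}


def rewrite_config(path: str = "aicrowd.json") -> None:
    """Rewrite the config file in place without the debug flag.

    Assumption: the file is plain JSON with "debug" at the top level.
    """
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    cleaned = strip_debug(config)
    config_path.write_text(json.dumps(cleaned, indent=2) + "\n", encoding="utf-8")


# Usage (run from the repository root, then commit and push as usual):
# rewrite_config("aicrowd.json")
```

The other keys in the file are passed through unchanged, so only the debug entry disappears.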
Hi @mohanty and @shivam
Were you guys able to take a look at this error I got? This time I had used a 7GB repo and had the same error.
hash: 588a2ed07f72ad2825a04da006b744aa735c1aa4
Hi, @shivam and @mohanty
My last two submissions failed and their debug logs have disappeared (i.e. the link to the log is not displayed on the page), although the debug logs for earlier submissions are still available. Could you take a look at it?
submission_hash : 03f98c0ca06db9537caefe022523b76ddcb32326
submission_hash : 56abf6cff5b1bb6c3402f0d660389fed5accdc78
Submission failed : No participant could be found for this username
I pushed without the debug line. How can I resolve this?
Hi @wac81,
The issue is resolved and I noticed you were able to make the submission just now.
Please let us know in case you face any other issue.
The above problem has been solved, thank you @mohanty @shivam !
Now I have another problem. When my code was running at 92%, the submission suddenly failed. And no errors are shown in the debug log. Could you take a look?
submission_hash : 3c4a3a2939733c08fcadf8088e4928834f1e27df
Hi @zhichao_feng, your submissions failed because of the 90-minute timeout that applies to each of the public and private phase runs (the run reached 92% in 90 minutes).
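To get a feel for how far over the limit a run like this is, here is a back-of-the-envelope sketch. It assumes throughput stays constant for the whole run, which may not hold in practice, and the function names are hypothetical. Only the 90-minute limit and the 92% figure come from this thread.

```python
TIMEOUT_MINUTES = 90.0  # per-phase limit quoted in this thread


def projected_total_minutes(elapsed_minutes: float, fraction_done: float) -> float:
    """Estimate the full run time from partial progress.

    Assumption: progress is linear in time (constant throughput).
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_minutes / fraction_done


def required_speedup(elapsed_minutes: float, fraction_done: float,
                     timeout_minutes: float = TIMEOUT_MINUTES) -> float:
    """Factor by which the run must get faster to finish exactly at the limit."""
    return projected_total_minutes(elapsed_minutes, fraction_done) / timeout_minutes


# Example with the numbers above: 92% done after 90 minutes.
total = projected_total_minutes(90.0, 0.92)
speedup = required_speedup(90.0, 0.92)
```

Under that assumption the full run would need roughly 98 minutes, so it would have to be about 9% faster to finish just inside the limit, and more than that to leave a safety margin.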
Hi @vitor_amancio_jerony @qinpersevere & all,
The image building code has been updated on our side, and the repository size is no longer a restriction.
Please keep in mind that the resources available for inference are the same for everyone, i.e. a Tesla V100-SXM2-16GB.
@shivam thanks. My submission has been stuck in the init process for a few hours and I can't see any error log. Could you help me check it?
Here is the submission:
AIcrowd Submission Received #193588 - 7
submission_hash : 3b8cea9ae65fb94418f0ec8b2a141b49971795c5
@shivam My submission took more than 90 minutes, but it was successful. Will it be counted as the final result? In addition, I found that the submission time for the same amount of computation can differ by as much as 2000 seconds. When two submissions run at the same time, the run time increases. Is one GPU physically shared by multiple submissions when the same user has several in the queue?
Hi @LYZD-fintech,
Each submission has dedicated compute resources.
The elapsed time reported on the issue page is currently wrong and will be fixed soon: it shows the total time from submission to completion, instead of the time from the start of execution of your code to the end. The timeout, however, is implemented properly and only considers the running time.
We provision a new machine dynamically for each submission, so the elapsed time may have been higher when there were a lot of submissions in the evaluation queue (multiple machines were being provisioned).
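If you want your own measurement of the running time in the meantime, one option is to time the code yourself and print the result so it shows up in the logs. This is only a sketch: run_timer is a made-up helper, and it assumes you can wrap your prediction code in a with block and that printed output reaches the debug log.

```python
import time
from contextlib import contextmanager


@contextmanager
def run_timer(label, clock=time.perf_counter, log=print):
    """Measure how long the wrapped block runs and log it.

    The clock and log parameters exist only so the helper is easy to test.
    """
    start = clock()
    result = {"seconds": None}
    try:
        yield result
    finally:
        # Recorded even if the block raises, so partial runs are timed too.
        result["seconds"] = clock() - start
        log(f"{label}: {result['seconds']:.1f}s of running time")


# Usage:
# with run_timer("inference"):
#     ...  # your prediction code goes here
```

This covers only the time spent inside the block, so it excludes queueing and machine provisioning.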
I hope that clears any confusion.
Best,
Shivam
Thank you, but how do I know if my submission will be considered in the final ranking?
Hi @LYZD-fintech,
All the successful submissions would be considered for the final ranking in this challenge.
Best,
Shivam
Thanks, I have no problem.
@shivam @mohanty I'm getting a CUDA out of memory error when loading my model in PyTorch, even though run.py works on my own T4 and V100. I tested it inside the same container that the Dockerfile builds. I don't know what else to do at this point.
Hash: af5e3e9d5a515b6917e2d39340da51e23b23d878