Thanks, I have no problem.
@shivam @mohanty I'm getting a CUDA out-of-memory error when loading my model in PyTorch, even though run.py works on my own T4 and V100. I tested it inside the same container that the Dockerfile builds. I don't know what else to do at this point.
Hash: af5e3e9d5a515b6917e2d39340da51e23b23d878
@mohanty @shivam
We have run into a problem: for 2 hours the environment has not been set up correctly.
Could you take a look at it?
submit hash: 00d6a5fb492d8648d5cc1724ce7efcd79b3f532d
@shivam
I have no idea what went wrong in this case. The error log doesn't show any error, but the submission failed.
AIcrowd Submission Received #193832 - initial-15
submission_hash : 77fd92f1686f89bb2a0a4a09ab2cb83cce5f3e0c
If this issue is not updated within a reasonable amount of time, please send an email to help@aicrowd.com.
Could you also check my submission? I believe some of the hosting services are behaving unusually. The code passed the public test set and soon afterwards failed on the private set. I also looked at other participants' submissions from around the same time, and all of them failed.
submission_hash : 540adaa2989b1c62dffc48659400db2cc0a13989
@shivam
Is there a timeout for the public test too?
My run has 2 stages:
- data processing takes 160 s:
  240263/277044 [02:38<00:26, 1370.87it/s]
- prediction takes 27 min:
  [Predict] 2/271 […] - ETA: 27:15
I have no idea why it times out.
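To see which stage is eating the evaluation time, each stage can be timed against the overall limit. The following is only a minimal sketch: `StageTimer`, the stage names, and the budget value are hypothetical and are not part of the starter kit or the evaluator.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Record wall-clock time per stage and compare the total to a budget."""

    def __init__(self, budget_seconds):
        # Assumption: set this to whatever timeout is currently in force.
        self.budget_seconds = budget_seconds
        self.durations = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Store the elapsed time even if the stage raises.
            self.durations[name] = time.perf_counter() - start

    def total(self):
        return sum(self.durations.values())

    def remaining(self):
        return self.budget_seconds - self.total()


if __name__ == "__main__":
    timer = StageTimer(budget_seconds=120 * 60)  # placeholder budget
    with timer.stage("data_processing"):
        time.sleep(0.01)  # stand-in for the real preprocessing
    with timer.stage("predict"):
        time.sleep(0.01)  # stand-in for the real prediction loop
    for name, seconds in timer.durations.items():
        print(f"{name}: {seconds:.2f}s")
    print(f"remaining budget: {timer.remaining():.2f}s")
```

Printing these numbers in the submission logs makes it easier to compare the online run with a local one, stage by stage.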
@wac81 : Yes, the timeouts apply to both the Public and Private Test Phases. Also, we have increased the timeout to 120 mins.
- Please refer to this post: Deadline Extension to 20th July && ⏳ Increased Timeout of 120 mins
Best,
Mohanty
@wufanyou : The evaluation failed due to a timeout. The increased timeout of 120 minutes should fix this issue. We have re-queued your submission for re-evaluation.
Best,
Mohanty
Hi all, in case you feel your submission is running much slower online than in your local setup, it might be a good idea to verify that torch or other relevant packages are installed properly.
Here is an example for torch:
If you are unsure how to verify this for your package, please let us know and we can release relevant FAQs.
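A check of this kind could look like the sketch below. It is not the organizers' original snippet; the function name `describe_torch_install` is hypothetical, and it relies only on standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name()`).

```python
import importlib.util


def describe_torch_install():
    """Report whether torch is importable and whether it can see a GPU."""
    report = {
        "installed": False,
        "version": None,
        "cuda_build": None,
        "cuda_available": False,
        "device_name": None,
    }
    # Avoid an ImportError if torch is missing from the environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["installed"] = True
    report["version"] = torch.__version__
    # None here means a CPU-only build was installed.
    report["cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If `cuda_build` is `None` or `cuda_available` is `False` in the online logs, the model is most likely running on the CPU, which would explain a large slowdown compared with a local GPU run.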
Hi @shivam and @mohanty. To debug this, I did the following: I'm printing the nvidia-smi output in my prediction_setup method, right before loading my model, and it seems the method gets executed 2 times, which is why I'm getting this CUDA out-of-memory error.
The first time the model loads correctly, but it can't load a second time without releasing the GPU's RAM.
Any idea why it's loading 2x?
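One defensive pattern against a repeated setup call is to make the setup idempotent, so that a second call reuses the model that is already loaded instead of allocating it again. This is only a sketch: `prediction_setup` is the method name mentioned above, while `LazyModelHolder` and the `loader` callable are hypothetical and stand in for the real predictor class and model-loading code.

```python
class LazyModelHolder:
    """Load a model at most once, however many times setup is called."""

    def __init__(self, loader):
        # loader: a zero-argument callable that builds and returns the model.
        self._loader = loader
        self._model = None
        self.load_count = 0

    def prediction_setup(self):
        # Only load on the first call; later calls return the cached model.
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model


if __name__ == "__main__":
    holder = LazyModelHolder(loader=lambda: {"weights": [0.1, 0.2]})
    first = holder.prediction_setup()
    second = holder.prediction_setup()
    print(first is second, holder.load_count)
```

This only helps when both calls happen in the same process; if the evaluator starts two separate processes, each one would still load its own copy.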
Hi @vitor_amancio_jerony, thanks for the logs.
We have identified a bug in the evaluation phase that caused models to be loaded twice, and it is now fixed.
We have also restarted your latest submission and are monitoring it in case a similar error occurs.
My submission passed every stage for Task 2,
but I don't see my result on the leaderboard. Is there an issue?
Here is the submission:
AIcrowd Submission Received #194023 - initial-19
submission_hash : 2cbf126dab5610c3c6b108cfb7e40028e59e5106
Hi @wac81,
We do see this submission on the leaderboard of Task-2. Maybe you were checking the leaderboard of a different task?
Best,
Mohanty
Yeah, it is my second submission that shows up, and I can't see my first submission.
But it's OK.
Hi @wac81, the leaderboard contains only the best submission from each participant (not all submissions). I assume that may have caused the confusion.
You can see all your submissions here (they are merged across all the tasks in this challenge, but you can identify them by the relevant repository name):
https://www.aicrowd.com/challenges/esci-challenge-for-improving-product-search/submissions?my_submissions=true
Best,
Shivam
Hi, @shivam and @mohanty
My last two submissions failed and their debug logs have disappeared (i.e. the link to the log is not displayed on the page), but the debug logs of earlier submissions are still available. Could you take a look at it?
submission_hash : bbf64209f663e4dac1dbddde443a9a7495a836f1 (caimanjing / task_1_query-product_ranking_code_starter_kit, GitLab, aicrowd.com)
submission_hash : efe0e19058c344b2b29e76387fd51497233d89d7 (caimanjing / task_1_query-product_ranking_code_starter_kit, GitLab, aicrowd.com)
I just made a submission, but I don't know why it has the same code as the previous one; their submission hashes are identical. Could you please help me cancel the second submission with this hash? I don't want it to use up one of my submission attempts. The name is #194735 - 7.16.2.
submission_hash : 75fcd642d1b8659786cc66b7a77fa621d1fda1a0
Hi @shivam and @mohanty, my submissions failed and again no logs were provided. All of them were stopped at 90 mins.
69047cec6f7f9fc2fc18a265895418972c8ae86b
527acda520e4d5221a8aa6b18574b0a50c6430ed
f18d54966cd711a22032d32ab09923264e235832