@amapic: If you built the conda env from the initial environment.yml file, then
conda env export --no-build will export the updated state of the environment.
@mohanty I did so and I can’t find matching versions for those packages:
How should I deal with it?
@amapic This is happening because these packages are only available for Linux distributions, which is why installing them on Windows (I assume you are using Windows) fails. This is unfortunately a current limitation of conda.
For example, https://anaconda.org/anaconda/ncurses has only osx & linux builds, not windows.
In that scenario, I recommend removing the above packages from
environment.yml and continuing with your conda env creation. These packages are often included as dependencies of the “main” dependencies; conda should resolve equivalent packages for your system automatically.
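If you have several such packages, a small script can prune them before recreating the env. A minimal sketch, assuming the package names in `LINUX_ONLY` (they are only examples; substitute whichever packages conda complains about):

```python
# Hypothetical list of linux-only packages to drop; adjust for your environment.yml.
LINUX_ONLY = {"ncurses", "readline", "libgcc-ng"}

def strip_platform_packages(yaml_text: str) -> str:
    """Remove conda dependency entries whose package has no Windows build."""
    kept = []
    for line in yaml_text.splitlines():
        entry = line.strip()
        if entry.startswith("- "):
            # conda pins look like "name=version=build"; the name is the first field
            name = entry[2:].split("=")[0].strip()
            if name in LINUX_ONLY:
                continue  # skip the linux-only package
        kept.append(line)
    return "\n".join(kept)
```

Run it over the exported file, write the result back, then `conda env create -f environment.yml` as usual.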
@devops @shivam what does the timeout mean? Does anyone know where I can find this information? I have asked this question numerous times after @devops commented on my failed subs, but it was ignored, so I am bringing it up here.
How am I supposed to debug a timeout? Some of my successful subs took longer to execute than most of those that failed because of a timeout. I can’t come up with a reasonable explanation for this behaviour. I hope you can help me understand it.
Submissions should ideally take a few hours to run, but we have set a hard timeout of 8 hours. If your solution crosses 8 hours, it is marked failed.
Roughly how long do you expect your code to run? Is it way off locally vs. during the evaluation phase?
Otherwise, you can use a GPU (if you are not doing so already) to speed up computation and finish the evaluation under 8 hours.
Please let us know if you need more help debugging your submission. We can try to see which step/part of the code is taking the most time if required.
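To locate the slow step yourself before asking us, a minimal timing sketch (the stage labels below are illustrative, not part of the evaluation harness):

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock time of one pipeline stage under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

# Usage sketch: wrap each stage of your submission and print the totals.
with timed("load_images"):
    pass  # e.g. read the test images from disk
with timed("inference"):
    pass  # e.g. run the model

for label, seconds in timings.items():
    print(f"{label}: {seconds:.2f}s")
```

Printing these totals to the submission logs makes it obvious whether data loading or inference dominates the 8-hour budget.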
I can’t manage to submit and I don’t have time left for this competition at the moment. Can you keep the evaluation running after the 17th? I would like to add a line about this competition to my resume.
Hi @amapic, let me get back to you on this after confirming with the organisers.
Meanwhile, please create new questions instead of following up on this thread; it will make QnA search simpler in the future.
How come some of my subs took 14h and didn’t fail, if the limit is 8h? Then again, how am I supposed to know the timeout is set to 8h? Where is it written? I also thought for a moment that you keep changing the timeout limit; can you confirm that this is not true?
The inference time is way off. Locally my model takes ~10 minutes to execute on a 1080 Ti, so it obviously runs on the CPU when submitted.
@amapic stay tuned for stage 4
@ValAn No, I can confirm the timeouts haven’t been changed between your previous and current runs. The only issue is that the timeout wasn’t implemented properly in the past, which may be why your previous (week-old) submission was missed by the timeout.
We can absolutely check why it is taking >8 hours instead of the ~10 minutes you see locally. Can you help me with the following:
- Is the local run using a GPU? I can check whether your code is utilising the GPU (when allocated) or running only on the CPU for whatever reason.
- How many images are you running on locally? The server/test dataset has 32428 images to be exact, which may be causing the higher time.
In case there is a significant difference from your local environment, the specs for the online environment are: 4 vCPUs, 16 GB memory, K80 GPU (when enabled).
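A quick way to confirm from inside the submitted code whether a GPU is visible at all is to probe `nvidia-smi`. This is a hedged sketch: it only checks driver visibility, not whether your framework actually uses the GPU (for that you would use the framework’s own check, e.g. `torch.cuda.is_available()` in PyTorch):

```python
import shutil
import subprocess

def gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # no NVIDIA driver tooling on this machine
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L lists the detected GPUs
            capture_output=True, text=True, timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print("GPU visible:", gpu_visible())
```

Logging this at the start of a submission immediately distinguishes “no GPU allocated” from “GPU allocated but unused by the framework”.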
Also, it seems you are constantly improving your platform, which is great, but as a user I don’t know when you do it or what you update, and when I hit an inconsistency with my previous experience I go nuts trying to figure out which point I’m missing.
That being said, I think the timing issue is related more to the time spent accessing the images on disk than to the GPU time, which is kind of… sad.
I’ll definitely write more about that when this round is over.
Thanks for the suggestions.
I completely agree that we need to improve our communication and the way we surface information, to provide a seamless experience for participants.
We would be glad to hear from you after the competition and look forward to your input.
I checked all the submissions, and unfortunately multiple participants are facing the same issue, i.e. the GPU is being allocated but not used by the submission, due to a cuda version mismatch.
To make the GPU work out of the box, we have introduced a forced installation in our snakes challenge evaluation process, as below:
conda install cudatoolkit=10.0
This should fix the timing issues, and we will continue monitoring all submissions closely.
@ignasimg I have verified disk performance and it was good. Unfortunately, on debugging I found your submission faced the same issue, i.e. cudatoolkit=10.1, which may have given the impression that the disk was the bottleneck (when actually the GPU wasn’t being utilised). The current submission should finish much sooner after the cudatoolkit version pinning.
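The mismatch above comes down to one rule: the cudatoolkit a framework was built against must not be newer than the highest CUDA version the installed driver supports. A tiny illustrative helper (the version strings are examples matching this thread, not values read from any system):

```python
def cuda_compatible(toolkit: str, driver_max: str) -> bool:
    """True if the runtime toolkit version does not exceed the
    highest CUDA version the installed driver supports."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(toolkit) <= parse(driver_max)

# With a driver supporting up to CUDA 10.0, cudatoolkit=10.1 breaks:
print(cuda_compatible("10.1", "10.0"))  # False -> GPU silently unused
print(cuda_compatible("10.0", "10.0"))  # True  -> the pinned version works
```

This is why pinning cudatoolkit=10.0 in the evaluation environment makes the GPU usable again.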
@shivam Thanks for your explanation. Do you know on which day you started the forced installation of cuda 10.0? It could explain some problems I had.
Hi @amapic, I started the forced cudatoolkit=10.0 installation at the same time the above announcement was made, i.e. 14 hours ago.
Edit: I remember the conda environment issue you were facing; this isn’t related to it.
Congrats to all. It’s been 3 days since the competition finished and we haven’t got any info on how we should proceed from here. What happens next, @devops?
Hi @ValAn, participants,
Congratulations to all for your participation.
There is no update right now. The organisers will be reaching out to participants shortly with details about travel grants, etc. and post-challenge follow-up.
Thank you, looking forward to it.
Best of luck to all and see you on the next challenge
@ValAn Congrats on the win
thanks, kudos to you too. See you at the next stage, I assume.
@ValAn Yeah, congrats !