OK, I will submit later.
@Erica: Can you please point to the submission hash (and the corresponding issue) where you faced this problem?
Thanks,
Mohanty
Hi, @mohanty
Here is my submission failure using dockerfile: 8a849b958dd0879b2b9f0b2e918708aae322738b
Here is another failure using requirement.txt file:
submission_hash : 2d7e4f3a8823501883a5229f4cd9a24a86442baf
Thanks,
Hi,
I have one more failed submission; the error came more than 90 minutes after submission: AIcrowd
submission_hash : be83058000cb609461713b44db84aa5e30cde867
Please help take a look, thanks! @mohanty
I have the same error
@shivam @mohanty Hi, I have a question about the image size or repo size. When I upload several models, the image build seems to time out and gets stuck at the copy step. What are the specific limits on this? Thanks.
@shivam @mohanty Hi, I have a question about the number of submissions. Does a failed submission (e.g., a timeout or an image build error) count towards the daily submission limit?
Thanks.
Hi @shivam and @mohanty. I'm getting errors on submission when it clones my repo. I made certain that the image has both git and git lfs.
hash: 13895a4c74fcb52b5ecd7102d5d994977fe69754
remote: Compressing objects: 100% (29/29)
remote: Compressing objects: 100% (29/29), done.
Receiving objects: 3% (1/31)
Traceback (most recent call last):
File "/builder/image_builder/utils/git_utils.py", line 89, in clone_repo
repo_dir = Path(git_clone(git_uri, git_revision))
File "/builder/image_builder/utils/git_utils.py", line 65, in git_clone
Repo.clone_from(
File "/home/user/.local/lib/python3.9/site-packages/git/repo/base.py", line 1148, in clone_from
return cls._clone(git, url, to_path, GitCmdObjectDB, progress, multi_options, **kwargs)
File "/home/user/.local/lib/python3.9/site-packages/git/repo/base.py", line 1078, in _clone
handle_process_output(proc, None, to_progress_instance(progress).new_message_handler(),
File "/home/user/.local/lib/python3.9/site-packages/git/cmd.py", line 176, in handle_process_output
return finalizer(process)
File "/home/user/.local/lib/python3.9/site-packages/git/util.py", line 386, in finalize_process
proc.wait(**kwargs)
File "/home/user/.local/lib/python3.9/site-packages/git/cmd.py", line 502, in wait
raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(-9)
cmdline: git clone -v --branch=submission-5-version --depth=1 --progress git@gitlab.aicrowd.com:vitor_amancio_jeronymo/task_1_query-product_ranking_code_starter_kit.git /home/user/.local/tmp/tmpjzrrupy_
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/builder/image_builder/__main__.py", line 8, in <module>
main()
File "/home/user/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/home/user/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/home/user/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/builder/image_builder/builder.py", line 59, in main
repo_dir = clone_repo(git_uri, git_revision)
File "/builder/image_builder/utils/git_utils.py", line 91, in clone_repo
raise ImageBuilderException(f"Couldn't clone repository.\n{e}")
image_builder.exceptions.ImageBuilderException: Image Builder Error: Couldn't clone repository.
Cmd('git') failed due to: exit code(-9)
cmdline: git clone -v --branch=submission-5-version --depth=1 --progress git@gitlab.aicrowd.com:vitor_amancio_jeronymo/task_1_query-product_ranking_code_starter_kit.git /home/user/.local/tmp/tmpjzrrupy_
Receiving objects: 6% (2/31)
Receiving objects: 9% (3/31)
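A note for anyone reading the log above: the `exit code(-9)` reported for `git clone` follows Python's subprocess convention, where a negative return code `-N` means the child process was terminated by signal `N`. Signal 9 is SIGKILL on Linux, so the clone was killed from outside rather than failing on its own. The log does not say what sent the signal; a memory limit or a timeout on the builder would be a plausible guess, but that is an assumption. A minimal sketch for decoding such return codes (the helper name `describe_returncode` is made up for illustration):

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a return code as reported by Python's subprocess module (POSIX)."""
    if returncode < 0:
        # A negative value -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    if returncode == 0:
        return "exited successfully"
    return f"exited with status {returncode}"


print(describe_returncode(-9))
```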
Hi @vitor_amancio_jerony, your submission has been re-queued, and it shouldn't encounter this error.
Edit: it failed again due to model size; I'll update shortly with a recommendation.
Hi @K1-O,
We have added a failed submissions count of 10 submissions/day (all tracks).
This means that up to 10 failed submissions per day won't be counted towards your daily limit of 5 submissions/day/track. We hope this helps in setting up your submission pipeline.
Best,
Shivam
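One possible reading of the rule above, as a rough sketch: the first 10 failed submissions in a day (pooled across all tracks) are free, and everything else counts towards the 5/day limit of its own track. The exact accounting on the platform is not described in this thread, so the ordering logic and the names below (`counted_per_track`, the `(track, succeeded)` tuples) are assumptions for illustration only.

```python
from collections import Counter

FAILED_ALLOWANCE_PER_DAY = 10  # from the post: shared across all tracks
DAILY_LIMIT_PER_TRACK = 5      # from the post: per track


def counted_per_track(submissions):
    """Count the submissions that use up each track's daily limit.

    `submissions` is one day's list of (track, succeeded) tuples in
    submission order. Assumption: the first 10 failures of the day are
    not counted; later failures count like normal submissions.
    """
    counted = Counter()
    free_failures_left = FAILED_ALLOWANCE_PER_DAY
    for track, succeeded in submissions:
        if not succeeded and free_failures_left > 0:
            free_failures_left -= 1  # covered by the failure allowance
            continue
        counted[track] += 1
    return counted


day = [("task_1", False)] * 3 + [("task_1", True)] * 2
used = counted_per_track(day)
print(used["task_1"], "of", DAILY_LIMIT_PER_TRACK, "used on task_1")
```

Under this reading, three failed builds followed by two successful runs would use 2 of the 5 daily slots on that track.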
Hi @qinpersevere,
We don't want you to be restricted because of the submission limit.
Due to this, the submissions can have a maximum size of 10GB [to prevent abuse].
Also, the limit applies only to the files present in your current tag (HEAD, i.e. the part that makes up the submission). You only need to remove any unwanted [or old] models from your HEAD, without worrying about deleting them from git history, etc.
In case your use case requires more than 10GB, please let us know.
Best,
Shivam
cc: @xuange_cui, @TransiEnt, @ystkin, @masatoh
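For anyone who wants to check their size before tagging, here is a rough local sketch. It sums the files in a checked-out working tree while skipping the `.git` directory, which approximates "files present in HEAD" only if the checkout is clean and has no untracked files. How the platform actually measures size is not stated here, and reading "10GB" as 10 GiB is an assumption; the function name `checkout_size_bytes` is made up for this example.

```python
import os

SIZE_LIMIT_BYTES = 10 * 1024**3  # assumption: "10GB" read as 10 GiB


def checkout_size_bytes(repo_root: str) -> int:
    """Sum the sizes of all regular files under repo_root, ignoring .git."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune .git in place so git history does not count towards the total.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    size = checkout_size_bytes(".")
    status = "over" if size > SIZE_LIMIT_BYTES else "within"
    print(f"{size / 1024**3:.2f} GiB, {status} the assumed 10 GiB limit")
```

Run it from the root of the repository after checking out the tag you plan to submit.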
Me and @guilherme_rosa are using a 14GB model. Can we apply for more then?
Btw, I still got the same error when cloning the repo
hash: 13895a4c74fcb52b5ecd7102d5d994977fe69754 (same hash)
Can you check if weights can be pruned in your approach?
We are checking internally to increase the allowed submission/model size meanwhile, and will update you soon.
How are debug mode submissions counted?
I'm thinking of another solution:
Am I allowed to run inference on a TPU that I provide myself? Not only would it be faster, but the model would also sit in cloud storage and wouldn't even need to be loaded by the evaluation machine.
@vitor_amancio_jerony : this is a valid use case. We are trying to come up with a solution for this. You will be able to submit your model for evaluation.
I will keep you posted.
Best,
Mohanty
I don't think this is fair. If there is a limit, then all participants should adhere to it. Allowing some participants to break the limit would be unfair to the rest, since we all know that bigger (and more) models perform better, to a certain extent.
I don't think that my git clone error is related to the max submission size, though. I just tried to submit a 7GB repo and the same error happened. Here I assume it only clones whatever is in HEAD.
hash: 588a2ed07f72ad2825a04da006b744aa735c1aa4
@TransiEnt : All the submissions will still get the same V100 GPU. At this point, the issue with the large model is not during the actual evaluations, but because our git servers throw a tantrum when a single large binary file is checked into a repository (via GIT-LFS). We are working on a fix for this, and that will equally affect all participants, and not change any of the resource constraints already announced for the round.
Best,
Mohanty
Submission failed : No participant could be found for this username
My submission failed for the reason above. I don't recall there being anything specific that we have to do to participate in this. Does anybody know why?