Thank you very much for exploring this and sharing it. I have some comments that may be helpful to others.
Exactly, I find that the 450 driver can support all CUDA 11.x versions. And if you only use PyTorch, you don't need to install CUDA yourself, since the PyTorch package bundles its own CUDA runtime (that is why the PyTorch whl file is so large).
Yes, it is stated that the 450 drivers support all CUDA 11.x releases, such as 11.4. But I first tested 11.4, 11.3, and 11.2 locally with the 450 drivers, and only 11.0.3 worked. That is why I suggest 11.0.3 as the highest usable version.
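For anyone who wants to check this on their own machine, here is a minimal sketch in Python. It assumes PyTorch is the framework in use (the function name cuda_report is just something I made up for illustration). It prints the CUDA runtime version bundled with the installed PyTorch build and whether the local driver can actually use it.

```python
def cuda_report():
    """Return a short description of the CUDA setup that PyTorch sees."""
    try:
        import torch  # assumption: PyTorch is the framework being tested
    except ImportError:
        return "PyTorch is not installed in this environment"
    # torch.version.cuda is the CUDA runtime the wheel was built with
    # (it is None for CPU-only builds).
    bundled = torch.version.cuda
    # is_available() is False when no GPU is visible or the driver cannot
    # run the bundled runtime.
    usable = torch.cuda.is_available()
    return f"bundled CUDA runtime: {bundled}, GPU usable with this driver: {usable}"


print(cuda_report())
```

Running this inside the image you plan to submit is probably more reliable than trusting the compatibility table alone.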
FROM nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
RUN apt-get update && apt-get upgrade -y && apt-get install -y python3 python3-pip && apt-get clean && rm -rf /var/lib/apt/lists/*
RUN pip install pandas
RUN ln -s /usr/bin/python3 /usr/bin/python
COPY models/* /models/
COPY utils /usr/local/lib/python3.6/dist-packages/starter_kit/
The online submission environment requires running as a non-root user named aicrowd. If the container runs as root, the submission can pass the public phase but will fail in the private phase. My guess is that the system tries to delete the files generated during the public phase but fails because it lacks permission.
ENV USER aicrowd
ENV HOME /home/aicrowd
RUN groupadd --gid 1001 aicrowd
RUN useradd --comment "Default user" --create-home --gid 1001 --no-log-init --shell /bin/bash --uid 1001 aicrowd
USER aicrowd
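To verify the user setup above before submitting, something like the following could be run inside the container. This is only a sketch: the helper names running_as_root and can_write are hypothetical, and which directories the evaluator actually writes to is an assumption on my side, so pass in whatever path your code uses.

```python
import os
import tempfile


def running_as_root():
    """True when the current process has effective UID 0 (POSIX only)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0


def can_write(directory):
    """True when the current user can create a file in the given directory."""
    try:
        # The temporary file is removed again when the with-block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


print("running as root:", running_as_root())
print("temp dir writable:", can_write(tempfile.gettempdir()))
```

If the first line prints True inside the image, the USER instruction has probably not taken effect.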
My repository structure is the same as yours, but the build fails. Can you help me find the reason?
I tested your Dockerfile locally. It reported that pip was not found. This is an OS-specific issue: on Ubuntu 18.04, the python3-pip package provides the pip3 command by default, not pip. After switching to pip3 to install the packages, the problem is solved. I suggest testing the build locally before submitting.
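One way to avoid depending on whether the command is called pip or pip3 is to invoke pip as a module of the interpreter you are already running. A small sketch (the function names are mine, not from any starter kit):

```python
import shutil
import sys


def pip_install_command(package):
    """Build an install command that does not depend on the pip/pip3 name."""
    # "python -m pip" runs pip for exactly this interpreter.
    return [sys.executable, "-m", "pip", "install", package]


def available_pip_names():
    """Report which of the two command names exist on PATH."""
    return {name: shutil.which(name) is not None for name in ("pip", "pip3")}


print(pip_install_command("pandas"))
print(available_pip_names())
```

In a Dockerfile the equivalent would be calling python3 -m pip install instead of pip install.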
Hi,
When you test locally, does this command work?
COPY utils /usr/local/lib/python3.6/dist-packages/starter_kit/
It says:
COPY failed: file not found in build context or excluded by .dockerignore: stat usr/local/lib/python3.7/dist-packages/starter_kit/: file does not exist
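Since that error is about a source path missing from the build context, a rough pre-check like the one below may help narrow it down. It is a simplified sketch, not a real Dockerfile parser: it assumes one COPY per line, ignores .dockerignore and multi-stage --from copies, and the function name missing_copy_sources is hypothetical.

```python
import glob
import os


def missing_copy_sources(dockerfile_text, context_dir):
    """List COPY source patterns that match nothing inside context_dir."""
    missing = []
    for line in dockerfile_text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[0].upper() != "COPY":
            continue
        # Drop flags such as --chown=...; the last argument is the destination.
        args = [t for t in tokens[1:] if not t.startswith("--")]
        for src in args[:-1]:
            pattern = os.path.join(context_dir, src.lstrip("/"))
            if not glob.glob(pattern):
                missing.append(src)
    return missing


dockerfile = """COPY models/* /models/
COPY utils /usr/local/lib/python3.6/dist-packages/starter_kit/
"""
# "." is the directory you pass to docker build.
print(missing_copy_sources(dockerfile, "."))
```

Anything it prints is a source that docker build would not find in that directory.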
Now I guess it is because some packages are required, but I do not know exactly which ones. I suspect some Jupyter-notebook-related library is needed here, since the original Dockerfile is based on repo2docker.
I am using the same one you used in your examples. If I comment out COPY utils /usr/local/lib/python3.6/dist-packages/starter_kit/ and everything below it, I can run it successfully locally.
No, I used Docker to build the images. But I installed many packages with pip that I guessed might meet the AIcrowd requirements, and I am not sure which ones are actually necessary.
There are multiple ways to add the repo path to the environment. I made utils a package containing __init__.py and run.py.
utils: __init__.py other.py run.py
When I copy the whole utils directory into starter_kit, it keeps the same structure. Also, note that the docker COPY command I used here does not keep the directory structure, so if you have a hierarchical file layout, it will be flattened. You also need to use relative imports in this case.
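To illustrate the relative import part, here is a small self-contained sketch. It builds a throwaway package with the same three files in a temporary directory and imports it. The file contents (a helper function in other.py that run.py calls) are made up for the demo, since the real contents are not shown in this thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create starter_kit/ with __init__.py, other.py and run.py under root."""
    pkg = Path(root) / "starter_kit"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "other.py").write_text("def helper():\n    return 'helper called'\n")
    # run.py uses a relative import, so it works wherever the package is copied.
    (pkg / "run.py").write_text(
        "from .other import helper\n\ndef main():\n    return helper()\n"
    )


def run_demo():
    """Import starter_kit.run from a temporary directory and call main()."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(tmp)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            run = importlib.import_module("starter_kit.run")
            return run.main()
        finally:
            # Clean up so the demo leaves no trace in the interpreter state.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "starter_kit"]:
                del sys.modules[name]


print(run_demo())
```

With an absolute import such as "from other import helper" the same run.py would only work if the package directory itself were on the path, which is why the relative form is safer after copying into dist-packages.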