Error when building Docker image in active submission

Hello,
I’m facing an issue with active submissions. The process fails at the Build Packages And Env step with the following output:

#8 [ 3/18] RUN apt-get -qq update &&     apt-get -qq install --yes --no-install-recommends         locales         git         wget         curl         libcurl4-openssl-dev         libssl-dev         xz-utils         bzip2     > /dev/null &&     apt-get -qq purge &&     apt-get -qq clean &&     rm -rf /var/lib/apt/lists/*
#8 3.381 W: GPG error: https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F60F4B3D7FA2AF80
#8 3.381 E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release' is not signed.
#8 ERROR: process "/bin/sh -c apt-get -qq update &&     apt-get -qq install --yes --no-install-recommends         locales         git         wget         curl         libcurl4-openssl-dev         libssl-dev         xz-utils         bzip2     > /dev/null &&     apt-get -qq purge &&     apt-get -qq clean &&     rm -rf /var/lib/apt/lists/*" did not complete successfully: exit code: 100
------
 > [ 3/18] RUN apt-get -qq update &&     apt-get -qq install --yes --no-install-recommends         locales         git         wget         curl         libcurl4-openssl-dev         libssl-dev         xz-utils         bzip2     > /dev/null &&     apt-get -qq purge &&     apt-get -qq clean &&     rm -rf /var/lib/apt/lists/*:
#8 3.381 W: GPG error: https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F60F4B3D7FA2AF80
#8 3.381 E: The repository 'https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release' is not signed.
------
Dockerfile:11
--------------------
  10 |     # Set up locales properly
  11 | >>> RUN apt-get -qq update && \
  12 | >>>     apt-get -qq install --yes --no-install-recommends \
  13 | >>>         locales \
  14 | >>>         git \
  15 | >>>         wget \
  16 | >>>         curl \
  17 | >>>         libcurl4-openssl-dev \
  18 | >>>         libssl-dev \
  19 | >>>         xz-utils \
  20 | >>>         bzip2 \
  21 | >>>     > /dev/null && \
  22 | >>>     apt-get -qq purge && \
  23 | >>>     apt-get -qq clean && \
  24 | >>>     rm -rf /var/lib/apt/lists/*
  25 |     
--------------------

Is anyone else having this problem? I’m not adding any new dependency, so I guess that the problem is not on my side, but I’m not sure.

Best regards and thank you in advance!

Yes, I get the same error when I run the evaluation with the debug flag set to false in aicrowd.json.

Since 29th April I don't see any active submissions happening :sweat_smile: @shivam @jyotish

Hi everyone, @unnikrishnan.r, @jesusmolrdv,

The active submission pipeline is back, and we have re-evaluated all submissions that failed during this period. :rocket:

What was the issue?
We were caught off guard when NVIDIA rotated its repository signing keys. The upstream packages are now signed with a different key, which caused the error you shared above.

Why did it impact your submissions?
It impacted the pipeline for a couple of reasons:

  1. Image build failures
    We use the NVIDIA Docker image as the base image, and it has not been rebuilt recently. As a result it still contains the older keys, which causes apt-get -qq update to fail (and with it your image build). We have fixed this in our internal Dockerfile so that all future builds are covered.
    Meanwhile, if you are using your own Dockerfile, please add the following command before apt-get -qq update (or similar):
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
  2. GPU infrastructure (causing the "Preparing the cluster for you" error)
    We internally use GPU machines on Kubernetes during the evaluation phase, and these were impacted as well. The Docker images have been updated and the pipeline is back now.
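
For anyone maintaining their own Dockerfile, here is a rough sketch of where the key-fetch command from point 1 would sit relative to the failing step. It is only an illustration: BASE_IMAGE is a placeholder build argument that does not come from this thread, and the sketch assumes gnupg is already present in the base image, since apt-key depends on it. The package list is copied from the build log above.

```dockerfile
# Sketch only. BASE_IMAGE is a placeholder, not a value from this thread;
# pass the NVIDIA Ubuntu 18.04 base image you actually build from.
ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# Fetch the new NVIDIA repository key before the first apt-get update.
# Assumption: gnupg is available in the base image (apt-key relies on it).
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub

# Set up locales properly (the step that failed in the log, unchanged)
RUN apt-get -qq update && \
    apt-get -qq install --yes --no-install-recommends \
        locales \
        git \
        wget \
        curl \
        libcurl4-openssl-dev \
        libssl-dev \
        xz-utils \
        bzip2 \
    > /dev/null && \
    apt-get -qq purge && \
    apt-get -qq clean && \
    rm -rf /var/lib/apt/lists/*
```

With this layout the base image would be supplied at build time through docker build --build-arg BASE_IMAGE=... so the same file works whichever NVIDIA image you start from.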

Wishing everyone all the best! :wave:
Please let us know if you face any issues with your future submissions, or if a past submission hasn’t started again yet.

Thanks for the fix, @shivam. Another issue I am facing is that on an active submission I get the message "Submission failed : The participant has no submission slots remaining for today. Please wait until 2022-05-03 06:17:25 UTC to make your next submission." However, I can see that I have 6 submissions remaining today.

I assume you might be trying to make debug submissions.
Please use debug: false in your aicrowd.json.

Thanks @shivam, it works now!
