How to use a Conda environment for your submission

Conda

Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux.

Conda’s benefits include:

  • Providing prebuilt packages, which avoid the need to deal with compilers or to figure out how to set up a specific tool.
  • Managing one-step installation of tools that are more challenging to install (such as TensorFlow or IRAF).
  • Allowing you to provide your environment to other people across different platforms, which supports the reproducibility of research workflows.
  • Allowing the use of other package management tools, such as pip, inside conda environments when a library or tool is not already packaged for conda.
  • Providing commonly used data science libraries and tools, such as R, NumPy, SciPy, and TensorFlow. These are built using optimized, hardware-specific libraries (such as Intel’s MKL or NVIDIA’s CUDA) which speed up performance without code changes.

Learning Conda

Syncing environment with remote

Create environment on your machine

The starter kits generally include an environment.yml file against which the sample submission has been tested. You can use the same file to create an environment for your submission:

conda env create -f environment.yml --name <environment-name>
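Note that --name overrides the name declared inside the file. To see which name the file itself declares, here is a minimal sketch, assuming the file has a top-level name: key as conda exports normally do. The helper name env_name_from_yml is hypothetical and not part of conda:

```shell
# Hypothetical helper: print the environment name declared in an environment.yml.
# Assumes a top-level "name:" key; prints nothing if the key is absent.
env_name_from_yml() {
  sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Example: env_name_from_yml environment.yml
```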

If you are feeling adventurous, you can instead start from a clean state, with an empty environment into which you install packages yourself:

conda create --name <environment-name>

Export your environment to your submission

Once you have run your code and are comfortable with your submission, export your conda environment to an environment.yml file in the root of your repository. With the environment activated, run the following from the repository root:

conda env export --no-builds | grep -v "prefix" > environment.yml
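The grep -v "prefix" filter above removes every line containing the word prefix, so it would also drop a dependency whose name happens to include that string. A slightly stricter sketch anchors the match to the top-level prefix: key. The function name strip_prefix_line is hypothetical:

```shell
# Hypothetical helper: remove only the top-level "prefix:" line from an
# exported environment read on stdin, leaving all dependency lines intact.
strip_prefix_line() {
  grep -v '^prefix:'
}

# Intended use (requires conda, so it is left commented out here):
# conda env export --no-builds | strip_prefix_line > environment.yml
```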

NOTES:

  1. If you are using conda on a different platform, such as macOS or Windows, your exported environment.yml may contain packages that are neither required nor available on Linux, for example clangxx_osx and gfortran_osx. This can cause your submission to fail, but you will have access to the build logs and can remove such packages manually from the file above.
  2. Pinning your package versions is a good idea for more reproducible results.
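
Both notes can be partly checked with standard shell tools. This is a rough sketch, not a complete solution. It assumes the export was made with --no-builds, so that each conda dependency is a line of the form "- name=version", and that macOS-only packages can be recognized by "_osx" in the package name, which will not catch every platform-specific package. Both function names are hypothetical:

```shell
# Hypothetical helper: print FILE without dependency lines whose package
# name contains "_osx" (for example clangxx_osx or gfortran_osx).
drop_osx_packages() {
  grep -v -E '^ *- [^=]*_osx' "$1"
}

# Hypothetical helper: list dependencies in FILE that carry no version pin.
list_unpinned() {
  awk '
    /^dependencies:/ { in_deps = 1; next }   # start of the dependency list
    /^[A-Za-z]/      { in_deps = 0 }         # any other top-level key ends it
    in_deps && /^ *- [^=<>:]+$/ {            # "- name" with no =, <, > or ":"
      sub(/^ *- /, ""); print
    }
  ' "$1"
}
```

For example, drop_osx_packages environment.yml > environment.linux.yml writes a filtered copy (the output filename is only an illustration), and list_unpinned environment.yml prints the names you may still want to pin.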