Conda


Conda is a popular package management system. Here are some tips and tricks for using that system on the Nevis particle-physics systems.

Before you go to conda

One thing I've noticed is that users are using conda to install packages that are already available in the Nevis environment modules. Before you use conda to get packages like numpy or matplotlib, try this command to load the Python version and libraries that I've installed on our applications server:

module load python

This will load the latest version of Python (3.10 as of May-2022) from the Linux cluster software libraries (use module avail python to see all the available versions). You can see which packages I've installed as part of our Python distribution with:

pip freeze
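If you only want to know whether one or two particular packages are already there, you can filter that output instead of reading the whole list. The following is a minimal sketch; the helper name has_package is my own invention, and it assumes you've already run module load python so that pip refers to the Nevis distribution.

```shell
# Hypothetical helper: succeed if the named package appears in
# `pip freeze`-style output read from standard input.
# The match is case-insensitive and looks for "name==version" or "name @ location".
has_package() {
    grep -i -E "^$1(==| @ )" >/dev/null
}

# Check a couple of example packages, but only if pip is on the PATH.
if command -v pip >/dev/null 2>&1; then
    for pkg in numpy matplotlib; do
        if pip freeze 2>/dev/null | has_package "$pkg"; then
            echo "$pkg: already installed"
        else
            echo "$pkg: not found"
        fi
    done
fi
```

The package name is used as part of a regular expression, so a name containing a dot may match more loosely than you expect.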

Take a look. You might be surprised by what I've included. As of May-2022, the list includes:

jupyter jupyterlab iminuit numpy scipy matplotlib pandas sympy terminado urllib3 pycurl tables rootpy rootkernel uproot tensorflow keras torch scikit-hep h5py astropy gammapy

Note that all the packages you see on that list are also available via our notebook server.

If you want a package that's not on the list, you can install it in your home directory with:

pip install --upgrade <package-name>

If a package is of sufficient interest that other users may want it, let WilliamSeligman know. He'll install it on both the library server and the notebook server.

Note that this approach has some advantages:

  • It saves disk space. If every user starts installing entire software suites in their home directories, there's less space available for other work.

  • The library server directory /usr/nevis is available to every system in the Linux cluster, including the batch nodes. It might simplify the task of making necessary software packages available for your condor jobs.

Setting up conda

If you need a particular version of a package, or the software you need is not a Python-based package, then it makes sense to use conda.

You don't have to install conda (or miniconda or anaconda). The package is already installed on every system in the Nevis particle-physics Linux cluster.

What you may have to add for typical scientific research packages is the conda-forge channel:

conda config --add channels conda-forge
conda config --set channel_priority strict

Your shell’s prompt will be changed by conda. Even when you’re not using conda, the text (base) will appear at the beginning of the prompt. If this doesn’t bother you, then ignore it. If it does, you can try:

conda config --set auto_activate_base false

You’ll have to log off then log in again to see the change. If you don’t want conda to alter your prompt even when you’re using an environment, this command will suppress conda’s prompt changes:

conda config --set changeps1 False

Conda environments by name

I strongly advise you to read the Conda documentation on managing environments. What follows are excerpts from that page that are the most relevant to our work at Nevis.

You can create a conda environment in your home directory with a command like (note that the name jupyter-pyroot in the following examples is arbitrary):

conda create --name jupyter-pyroot jupyter python root

and add packages to it later, e.g.:

conda install --name jupyter-pyroot jupyterlab numpy scipy matplotlib 

Note: The conda example packages above are not necessary if you're using the Nevis Python distribution as described at the top of this page. All the above-named packages are there.

To activate an environment and make its packages available to you:

conda activate jupyter-pyroot

Conda environments by location

Note that the approach in the above section is good enough if you're working on your laptop, but it's probably not the best approach when working on the Nevis cluster:

  • It uses up space in your home directory. Even a small list of conda packages takes up a few gigabytes. If everyone in a group does this, the /home partition on your login server may be filled up with multiple copies of the exact same environment for each user.

  • If you want an environment accessible in a shared directory for all your condor jobs, you'll want a different method.

That method is to define an environment by location:

# Find some place in your file-server directory hierarchy with enough space to suit your needs.
# Note that there is no machine named 'olga' at Nevis
cd /nevis/olga/data/

# Create an appropriate directory if one doesn't already exist 
# Note that the name 'myenv' is arbitrary.
mkdir myenv

# Create the conda environment within that directory.
# Again, this list of packages is an example (and unnecessary!)
conda create --prefix ./myenv jupyter python root

From that point forward, you can activate the environment by including the directory:

conda activate /nevis/olga/data/myenv
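If you activate a location-based environment from a script (a condor job, for example), it may help to check that the directory really is a conda environment first. Here is a minimal sketch. The helper name is_conda_env is hypothetical, the path is the same made-up example as above, and I'm assuming that a non-interactive shell may need conda's shell setup file loaded before conda activate will work.

```shell
# Hypothetical helper: succeed only if the directory looks like a conda
# environment, i.e. it contains the conda-meta subdirectory that conda
# creates inside every environment.
is_conda_env() {
    [ -d "$1/conda-meta" ]
}

# Example path from this page; substitute your own location.
env_dir=/nevis/olga/data/myenv

if is_conda_env "$env_dir" && command -v conda >/dev/null 2>&1; then
    # Assumption: a non-interactive shell may not have the 'conda activate'
    # shell function defined, so load it from the conda installation first.
    . "$(conda info --base)/etc/profile.d/conda.sh"
    conda activate "$env_dir"
else
    echo "$env_dir is not a conda environment (or conda is not available)" >&2
fi
```

In a real job script you'd probably want to stop with an error in the else branch rather than just print a message.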

Copying a conda environment

I'm going to assume a "use case": You've got a conda environment in your home directory, and you now want to move it to a location-based environment instead of a name-based one.

# This is the default location of conda environments created by name
ls ~/.conda/envs 

# Create the new location of the environment if it's not already there.
# Again, all these names and locations are examples.
mkdir -p /nevis/olga/data/environments

# Assume the environment you want to copy is 'myenv'
(cd ~/.conda/envs; tar -cf - ./myenv) | (cd /nevis/olga/data/environments; tar -xvf -)
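Before deleting anything, it may be worth confirming that the copy is complete. The sketch below runs the same kind of tar pipeline on scratch directories (created with mktemp, so nothing real is touched) and then compares the two trees with diff -r. The directory and file names are placeholders; to check a real copy, run the diff -r line on your own source and destination.

```shell
# Scratch directories standing in for ~/.conda/envs and the new location.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/myenv/bin"
echo "placeholder" > "$src/myenv/bin/tool"

# The same kind of tar pipeline as above, without -v to keep the output short.
(cd "$src" && tar -cf - ./myenv) | (cd "$dst" && tar -xf -)

# diff -r exits with status 0 only if the two trees have identical contents.
if diff -r "$src/myenv" "$dst/myenv" >/dev/null; then
    copy_status="identical"
else
    copy_status="different"
fi
echo "copy is $copy_status"
```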

After copying the environment, you can delete the old one:

conda remove --name myenv --all

If you used pip to install a Python package within the conda environment, then it will not be copied over by this procedure.

Conda and pip

Both conda and pip are package managers. It's important to recognize that they are different package managers.

It's common for python, as a package, to be included in a conda environment. If you're going to use pip within a conda environment, then the python package must be included in the conda environment. If you don't do this, the native CentOS 7 version of pip will be used, which works with an old version of python, and anything it installs will not affect the conda environment.
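One way to tell which pip you're about to run is to compare its location with the active environment's directory. This is a minimal sketch: the helper name pip_is_from_env is hypothetical, and it assumes the usual Linux layout in which an environment's programs live in its bin subdirectory. CONDA_PREFIX is the variable that conda activate sets to the active environment's location.

```shell
# Hypothetical helper: succeed if the given pip path lies inside the
# bin directory of the given environment prefix.
#   $1: path to pip (e.g. from `command -v pip`)
#   $2: environment prefix (e.g. "$CONDA_PREFIX")
pip_is_from_env() {
    case "$1" in
        "$2"/bin/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only meaningful when an environment is active and pip is on the PATH.
if [ -n "${CONDA_PREFIX:-}" ] && command -v pip >/dev/null 2>&1; then
    if pip_is_from_env "$(command -v pip)" "$CONDA_PREFIX"; then
        echo "pip belongs to the active environment"
    else
        echo "pip is outside the environment; add python to the environment first"
    fi
fi
```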

It's best to stick with conda to install any packages. If a given package (e.g., biopython) is available via conda, it's better to use conda install biopython than pip install biopython. Once you use pip to install a package, only use pip to update that package or anything that depends on that package.

As a rule of thumb, once you use pip to modify a conda environment, stick with pip from then on.

Topic revision: r2 - 2022-05-24 - WilliamSeligman