This lesson is in the early stages of development (Alpha version)

Accessing software via Modules

Overview

Teaching: 30 min
Exercises: 15 min
Questions
  • How do we load and unload software packages?

Objectives
  • Understand how to load and use a software package.

On a high-performance computing system, it is seldom the case that the software we want to use is available when we log in. It is installed, but we will need to “load” it before it can run.

Before we start using individual software packages, however, we should understand the reasoning behind this approach. The three biggest factors are:

Software incompatibility is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two of the most famous examples are Python 2 and 3 and C compiler versions. Python 3 famously provides a python command that conflicts with that provided by Python 2. Software compiled against a newer version of the C++ standard library and then run where that version is not present will fail with an error such as 'GLIBCXX_3.4.20' not found.

Software versioning is another common issue. A team might depend on a certain package version for their research project; if the software version were to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.

Dependencies arise when a particular software package (or even a particular version) needs access to another software package (or even a particular version of another software package). For example, the VASP materials science software may depend on having a particular version of the FFTW (Fastest Fourier Transform in the West) software library available for it to work.

Environment Modules

Environment modules are the solution to these problems. A module is a self-contained description of a software package — it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.

There are a number of different environment module implementations commonly used on HPC systems: the two most common are TCL modules and Lmod. Both of these use similar syntax and the concepts are the same so learning to use one will allow you to use whichever is installed on the system you are using. In both implementations the module command is used to interact with environment modules. An additional subcommand is usually added to the command to specify what you want to do. For a list of subcommands you can use module -h or module help. As for all commands, you can access the full help on the man pages with man module.

On login you may start out with a default set of modules loaded or you may start out with an empty environment; this depends on the setup of the system you are using.

Listing Available Modules

To see available software modules, use module avail:

[yourUsername@vortex1[2] ~]$ module avail
------------------------------------------------- /util/common/modulefiles/Core -------------------------------------------------
   0.3.0                                gatk/4.0.3.0                      pybedtools/0.8.0                  (D)
   7zip/9.20.1                          gatk/4.0.9.0                      pylevy/1.1
   AlignGraph/0.0.0                     gatk/4.1.6.0                      pymc3/py37
   DosageConverter/1.1.0                gatk/2017-03-30-g34bd8a3   (D)    pyprocar/5.1.8
   EPACTS/3.2.6                         gcc/7.3.0-common                  pyrpipe/0.0.4
   FuSeq/1.1.1                          gcc/8.3.0-avx                     python/anaconda_no_mkl
   HLAreporter/103                      gcc/8.3.0                         python/anaconda-common
   HTSeq/0.6.1                          gcc/9.3.0                         python/anaconda-common-4.3.1
   HTSeq/0.11.1                  (D)    gcc/10.1.0-sse                    python/anaconda                   (D)
   MACS/1.4.2                           gcc/10.2.0-sse                    python/anaconda-4.4.0
   MACS/2.2.7.1                  (D)    gcc/10.2.0                 (D)    python/anaconda-5.0.0
   MetAMOS/v1.5rc3                      gcta/1.93.2                       python/anaconda-5.0.1-common
   MuMmer/3.23                          genometools/1.5.9                 python/anaconda-5.1.0-gpu

[removed most of the output here for brevity]

------------------------------------------------ /util/academic/modulefiles/Core ------------------------------------------------
   8.7-py36                                     gimsan                                       namd/2.12-ibverbs-smp-CUDA
   MACS/2-2.0.10                                glpk/4.55                                    namd/2.12-ibverbs-smp
   MACS2/2.1.0                                  gmap/2015.09.29                              namd/2.12-ibverbs
   MaCS/0.4                                     gmp/6.1.2                                    namd/2.12-multicore-CUDA
   R/3.0.0                                      gnu-parallel/2015.06.22                      namd/2.12-multicore
   R/3.1.2                                      gnuid/04.04.2013                             nast3dgp/2.08
   R/3.2.0                                      golang/1.4rc2                                nbo/6
   R/3.3.0                                      google-api/09.24.2014                        ncbi/blast-2.2.29
   R/3.3.2                                      google-api/11.06.2014                 (D)    ncbi/2.5.0-1                   (D)
   R/3.4.1                                      gpaw/1.1.0                                   ncl/ncarg-6.3.0
   R/3.5.1-mpi                                  grass/7.0.0                                  nco/4.6.1
   R/3.5.1-nonstandard-gcc                      grass/7.0.4                                  netcdf/4.3.3.1
   R/3.5.1                                      grass/7.2.0                                  netgen/6.0
   R/3.5.2                                      grass/7.2.2                           (D)    nexmd/intel-mkl
   R/3.5.3                                      gromacs/4.5.5                                nibabel/2.2.1


[removed most of the output here for brevity]

   D:        Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

Listing Currently Loaded Modules

You can use the module list command to see which modules you currently have loaded in your environment. If you have no modules loaded, you will see a message telling you so:

[yourUsername@vortex1[2] ~]$ module list
No Modulefiles Currently Loaded.

Loading and Unloading Software

To load a software module, use module load. In this example we will use Python.

Initially, we have not loaded a Python module. On CCR’s systems there is a version of Python 2 and Python 3 installed on most machines; however, the versions installed on the head nodes may differ from those installed on the compute nodes, or may not be the version you need. We can check what is available by default with the which command and the --version option of the python3 command. which looks for programs the same way that Bash does, so we can use it to tell us where a particular piece of software is stored. The --version option tells us which version is the default.

[yourUsername@vortex1[2] ~]$ which python3
/usr/bin/python3
[yourUsername@vortex1[2] ~]$ python3 --version
Python 3.6.8

Let’s see what other versions of Python are available:

[yourUsername@vortex1[2] ~]$ module avail python
------------------------------------------------- /util/common/modulefiles/Core -------------------------------------------------
   bioconda/py37-biopython               python/py27-anaconda-5.3.1      python/py37-anaconda-2019.07
   python/anaconda_no_mkl                python/py27-anaconda-2018.12    python/py37-anaconda-2019.10
   python/anaconda-common                python/py27-anaconda-2019.03    python/py37-anaconda-2020.02
   python/anaconda-common-4.3.1          python/py27-anaconda-2019.10    python/py38-anaconda-2020.07
   python/anaconda              (L,D)    python/py36-anaconda-5.2.0      python/py38-anaconda-2020.11
   python/anaconda-4.4.0                 python/py36-anaconda-5.3.1      python/py38-anaconda-2021.05
   python/anaconda-5.0.0                 python/py37-anaconda-5.3.1      vhub/python
   python/anaconda-5.0.1-common          python/py37-anaconda-2018.12
   python/anaconda-5.1.0-gpu             python/py37-anaconda-2019.03

------------------------------------------------ /util/academic/modulefiles/Core ------------------------------------------------
   biopython/1.70          python/anaconda-4.3.1      python/anaconda-5.2.0     python/anaconda2-4.2.0
   issm/4.15-python        python/anaconda-5.0.0.1    python/anaconda2-2.4.1    python/anaconda2-4.3.0
   openslide/python-ocv    python/anaconda-5.0.1      python/anaconda2-4.1.1

  Where:
   L:  Module is loaded
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

If we need a different version of Python, we can load it with module load:

[yourUsername@vortex1[2] ~]$ module load python/py36-anaconda-5.2.0
[yourUsername@vortex1[2] ~]$ which python3
/util/common/python/py36/anaconda-5.2.0/bin/python3

So, what just happened?

To understand the output, first we need to understand the nature of the $PATH environment variable. $PATH is a special environment variable that controls where a UNIX system looks for software. Specifically $PATH is a list of directories (separated by :) that the OS searches through for a command before giving up and telling us it can’t find it. As with all environment variables we can print it out using echo.

[yourUsername@vortex1[2] ~]$ echo $PATH
/util/common/python/py36/anaconda-5.2.0/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lpp/mmfs/bin:/opt/dell/srvadmin/bin:/user/yourUsername/bin
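The search through $PATH described above can be sketched in a few lines of shell. This is only an illustration: find_in_path is a hypothetical helper name, not part of any module system, and it covers only the basic case of walking the directories in order and printing the first executable match, which is roughly what which reports.

```shell
# find_in_path is a hypothetical helper for illustration only.
# It walks the directories listed in $PATH, in order, and prints the
# first executable file whose name matches its argument.
# The body runs in a subshell ( ... ), so changing IFS does not leak out.
find_in_path() (
    cmd=$1
    IFS=:      # split $PATH on colons
    set -f     # disable filename globbing while splitting
    for dir in $PATH; do
        # An empty entry in $PATH means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            exit 0
        fi
    done
    exit 1     # not found in any directory
)

find_in_path ls || echo "ls was not found in PATH"
```

Because the first match wins, whichever directory appears earliest in $PATH decides which copy of a program runs.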

The first directory in this $PATH is the same directory that which python3 reported after we loaded the module. When we ran the module load command, it added that directory to the beginning of our $PATH. Let’s examine what’s there:

[yourUsername@vortex1[2] ~]$ ls /util/common/python/py36/anaconda-5.2.0/bin
2to3                                freetype-config           libtool               qmltestrunner
2to3-3.6                            gapplication              libtoolize            qscxmlc
activate                            gdbus                     linguist              qtattributionsscanner
anaconda                            gdbus-codegen             list_instances        qt.conf
anaconda-navigator                  genbrk                    lrelease              qtdiag
anaconda-project                    gencfu                    lss3                  qtpaths
asadmin                             gencnval                  lupdate               qtplugininfo
assistant                           gendict                   lzcat                 qwebengine_convert_dict
binstar                             genrb                     lzcmp                 raw2tiff
blaze-server                        gif2h5                    lzdiff                rcc
bokeh                               gio                       lzegrep               rdjpgcom
bundle_image                        gio-querymodules          lzfgrep               repc
bunzip2                             glacier                   lzgrep                reset
bzcat                               glib-compile-resources    lzless                route53
bzcmp                               glib-compile-schemas      lzma                  rst2html4.py
bzdiff                              glib-genmarshal           lzmadec               rst2html5.py
bzegrep                             glib-gettextize           lzmainfo              rst2html.py
bzfgrep                             glib-mkenums              lzmore                rst2latex.py
bzgrep                              gobject-query             makeconv              rst2man.py
bzip2                               gresource                 moc                   rst2odt_prepstyles.py
bzip2recover                        gsettings                 mturk                 rst2odt.py
bzless                              gst-device-monitor-1.0    navigator-updater     rst2pseudoxml.py
bzmore                              gst-discoverer-1.0        ncursesw6-config      rst2s5.py
cairo-trace                         gst-inspect-1.0           nosetests             rst2xetex.py
canbusutil                          gst-launch-1.0            numba                 rst2xml.py
captoinfo                           gst-play-1.0              odbc_config           rstpep2html.py
cfadmin                             gst-stats-1.0             odbcinst              runxlrd.py
chardetect                          gst-typefind-1.0          odo                   s3put
cjpeg                               gtester                   openssl               samp_hub
clear                               gtester-report            pal2rgb               sdbadmin
conda                               h52gif                    pandoc                showtable
conda-build                         h5c++                     pandoc-citeproc       sip
conda-convert                       h5cc                      pango-view            skivi
conda-develop                       h5clear                   patchelf              slencheck
conda-env                           h5copy                    pcre-config           sphinx-apidoc
conda-index                         h5debug                   pcregrep              sphinx-autogen
conda-inspect                       h5diff                    pcretest              sphinx-build
conda-metapackage                   h5dump                    pep8                  sphinx-quickstart
conda-render                        h5fc                      pip                   spyder
conda-server                        h5format_convert          pixeltool             sqlite3
conda-skeleton                      h5import                  pkgdata               sqlite3_analyzer
conda-verify                        h5jam                     pkginfo               symilar
cq                                  h5ls                      pngfix                syncqt.pl
c_rehash                            h5mkgrp                   png-fix-itxt          tabs
curl                                h5perf_serial             ppm2tiff              taskadmin
curl-config                         h5redeploy                pt2to3                tclsh
curve_keygen                        h5repack                  ptdump                tclsh8.6
cwutil                              h5repart                  ptrepack              tic
cygdb                               h5stat                    pttree                tiff2bw
cython                              h5unjam                   pyami_sendmail        tiff2pdf
cythonize                           h5watch                   pybabel               tiff2ps
dask-mpi                            hb-ot-shape-closure       pycc                  tiff2rgba
dask-remote                         hb-shape                  pycodestyle           tiffcmp
dask-scheduler                      hb-subset                 pydoc                 tiffcp
dask-ssh                            hb-view                   pydoc3                tiffcrop
dask-submit                         icu-config                pydoc3.6              tiffdither
dask-worker                         icuinfo                   pyflakes              tiffdump
dbus-cleanup-sockets                idle3                     pygmentize            tiffinfo
dbus-daemon                         idle3.6                   pylint                tiffmedian
dbus-launch                         infocmp                   pylint-gui            tiffset
dbus-monitor                        infotocap                 pylupdate5            tiffsplit
dbus-run-session                    instance_events           pyrcc5                toe
dbus-send                           iptest                    pyreverse             tput
dbus-test-tool                      iptest3                   pytest                tset
dbus-update-activation-environment  ipython                   py.test               uic
dbus-uuidgen                        ipython3                  python                unlzma
deactivate                          isort                     python3               unxz
derb                                isql                      python3.6             vba_extract.py
designer                            isympy                    python3.6-config      volint
djpeg                               iusql                     python3.6m            wcslint
dltest                              jlpm                      python3.6m-config     wheel
dynamodb_dump                       jpegtran                  python3-config        wish
dynamodb_load                       jsonschema                pyuic5                wish8.6
easy_install                        jupyter                   pyvenv                wrjpgcom
elbadmin                            jupyter-bundlerextension  pyvenv-3.6            xml2-config
epylint                             jupyter-console           qcollectiongenerator  xmlcatalog
f2py                                jupyter-kernel            qdbus                 xmllint
fax2ps                              jupyter-kernelspec        qdbuscpp2xml          xmlpatterns
fax2tiff                            jupyter-lab               qdbusviewer           xmlpatternsvalidator
fc-cache                            jupyter-labextension      qdbusxml2cpp          xmlwf
fc-cat                              jupyter-labhub            qdoc                  xslt-config
fc-list                             jupyter-migrate           qgltf                 xsltproc
fc-match                            jupyter-nbconvert         qhelpconverter        xz
fc-pattern                          jupyter-nbextension       qhelpgenerator        xzcat
fc-query                            jupyter-notebook          qlalr                 xzcmp
fc-scan                             jupyter-qtconsole         qmake                 xzdec
fc-validate                         jupyter-run               qml                   xzdiff
fetch_file                          jupyter-serverextension   qmlcachegen           xzegrep
fits2bitmap                         jupyter-troubleshoot      qmleasing             xzfgrep
fitscheck                           jupyter-trust             qmlimportscanner      xzgrep
fitsdiff                            kill_instance             qmllint               xzless
fitsheader                          launch_instance           qmlmin                xzmore
fitsinfo                            lconvert                  qmlplugindump
fixqt4headers.pl                    libpng16-config           qmlprofiler
flask                               libpng-config             qmlscene

Taking this to its conclusion, module load adds software to your $PATH. It “loads” software. A special note on this: depending on which version of the module program is installed at your site, module load may also load required software dependencies.
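The effect on $PATH can be imitated without a module system at all. The sketch below is an assumption-laden stand-in, not how any particular module implementation works internally: it creates a temporary directory holding a fake program (the names demo_dir and hpc_demo_tool are hypothetical), puts that directory at the front of $PATH, and then restores the old value. A real module may change other settings as well.

```shell
# Sketch only: imitate the PATH effect of "module load" with a fake tool.
# demo_dir and hpc_demo_tool are hypothetical names used for illustration.
demo_dir=$(mktemp -d)

# Create a tiny executable that stands in for a packaged program.
cat > "$demo_dir/hpc_demo_tool" <<'EOF'
#!/bin/sh
echo "hello from the demo tool"
EOF
chmod +x "$demo_dir/hpc_demo_tool"

old_path=$PATH

# "Load": put the directory at the front of PATH so it is searched first.
PATH="$demo_dir:$PATH"
loaded_location=$(command -v hpc_demo_tool || echo "not found")
echo "after load:   $loaded_location"

# "Unload": restore the previous PATH; the tool can no longer be found.
PATH=$old_path
unloaded_location=$(command -v hpc_demo_tool || echo "not found")
echo "after unload: $unloaded_location"

# Clean up the temporary directory.
rm -rf "$demo_dir"
```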

To see dependency loading in action, let’s use module list, which shows all loaded software modules.

[yourUsername@vortex1[2] ~]$ module list
Currently Loaded Modules:
  1) python/py36-anaconda-5.2.0

[yourUsername@vortex1[2] ~]$ module load beast
[yourUsername@vortex1[2] ~]$ module list
Currently Loaded Modules:
  1) python/py36-anaconda-5.2.0   2) java/1.8.0_152   3) beast/1.10.4

So in this case, loading the beast module (a bioinformatics software package) also loaded java/1.8.0_152. Let’s try unloading the beast package.

[yourUsername@vortex1[2] ~]$ module unload beast
[yourUsername@vortex1[2] ~]$ module list
Currently Loaded Modules:
  1) python/py36-anaconda-5.2.0

So module unload “un-loads” a module along with its dependencies. If we wanted to unload everything at once, we could run module purge.

[yourUsername@vortex1[2] ~]$ module purge

Note that module purge is informative. It will let us know if any modules were not unloaded and how to actually unload these if we desire.
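Since loading a module adds a directory to $PATH, unloading has to take it out again. As a rough sketch of that one step (the helper name remove_from_path and the example directories are hypothetical, and a real module system undoes other settings too), removing an entry from a colon-separated list could look like this:

```shell
# remove_from_path is a hypothetical helper for illustration only.
# It prints the colon-separated list given in $2 with every entry
# equal to $1 removed. The body runs in a subshell so IFS stays local.
remove_from_path() (
    target=$1
    IFS=:
    set -f
    result=
    for dir in $2; do
        # Skip the entry we want to drop; keep everything else in order.
        [ "$dir" = "$target" ] && continue
        result=${result:+$result:}$dir
    done
    printf '%s\n' "$result"
)

# Example with made-up directories.
example_path="/opt/demo/bin:/usr/local/bin:/usr/bin"
remove_from_path /opt/demo/bin "$example_path"
```

The order of the remaining directories is preserved, so the search order for everything else is unchanged.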

Software Versioning

So far, we’ve learned how to load and unload software packages. This is very useful. However, we have not yet addressed the issue of software versioning. At some point or other, you will run into issues where only one particular version of some software will be suitable. Perhaps a key bugfix only happened in a certain version, or version X broke compatibility with a file format you use. In either of these example cases, it helps to be very specific about what software is loaded.

Let’s examine the output of module avail more closely.

[yourUsername@vortex1[2] ~]$ module avail
------------------------------------------------- /util/common/modulefiles/Core -------------------------------------------------
   0.3.0                                gatk/4.0.3.0                      pybedtools/0.8.0                  (D)
   7zip/9.20.1                          gatk/4.0.9.0                      pylevy/1.1
   AlignGraph/0.0.0                     gatk/4.1.6.0                      pymc3/py37
   DosageConverter/1.1.0                gatk/2017-03-30-g34bd8a3   (D)    pyprocar/5.1.8
   EPACTS/3.2.6                         gcc/7.3.0-common                  pyrpipe/0.0.4
   FuSeq/1.1.1                          gcc/8.3.0-avx                     python/anaconda_no_mkl
   HLAreporter/103                      gcc/8.3.0                         python/anaconda-common
   HTSeq/0.6.1                          gcc/9.3.0                         python/anaconda-common-4.3.1
   HTSeq/0.11.1                  (D)    gcc/10.1.0-sse                    python/anaconda                   (D)
   MACS/1.4.2                           gcc/10.2.0-sse                    python/anaconda-4.4.0
   MACS/2.2.7.1                  (D)    gcc/10.2.0                 (D)    python/anaconda-5.0.0
   MetAMOS/v1.5rc3                      gcta/1.93.2                       python/anaconda-5.0.1-common
   MuMmer/3.23                          genometools/1.5.9                 python/anaconda-5.1.0-gpu

[removed most of the output here for brevity]

------------------------------------------------ /util/academic/modulefiles/Core ------------------------------------------------
   8.7-py36                                     gimsan                                       namd/2.12-ibverbs-smp-CUDA
   MACS/2-2.0.10                                glpk/4.55                                    namd/2.12-ibverbs-smp
   MACS2/2.1.0                                  gmap/2015.09.29                              namd/2.12-ibverbs
   MaCS/0.4                                     gmp/6.1.2                                    namd/2.12-multicore-CUDA
   R/3.0.0                                      gnu-parallel/2015.06.22                      namd/2.12-multicore
   R/3.1.2                                      gnuid/04.04.2013                             nast3dgp/2.08
   R/3.2.0                                      golang/1.4rc2                                nbo/6
   R/3.3.0                                      google-api/09.24.2014                        ncbi/blast-2.2.29
   R/3.3.2                                      google-api/11.06.2014                 (D)    ncbi/2.5.0-1                   (D)
   R/3.4.1                                      gpaw/1.1.0                                   ncl/ncarg-6.3.0
   R/3.5.1-mpi                                  grass/7.0.0                                  nco/4.6.1
   R/3.5.1-nonstandard-gcc                      grass/7.0.4                                  netcdf/4.3.3.1
   R/3.5.1                                      grass/7.2.0                                  netgen/6.0
   R/3.5.2                                      grass/7.2.2                           (D)    nexmd/intel-mkl
   R/3.5.3                                      gromacs/4.5.5                                nibabel/2.2.1


[removed most of the output here for brevity]

   D:        Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

Let’s take a closer look at the gcc module. GCC is an extremely widely used C/C++/Fortran compiler. A great deal of software depends on the GCC version, and might not compile or run if the wrong version is loaded. In this case, there are many different versions, for example gcc/9.3.0 and gcc/10.2.0. How do we choose which version to load, and which version is the default?

In this case, gcc/10.2.0 has a (D) next to it. This indicates that it is the default — if we type module load gcc, this is the copy that will be loaded.

[yourUsername@vortex1[2] ~]$ module load gcc
[yourUsername@vortex1[2] ~]$ gcc --version
gcc (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

So how do we load a non-default version of a software package? The only change we need to make to our module load command is to include the version number after the /.

[yourUsername@vortex1[2] ~]$ module load gcc/8.3.0

The following have been reloaded with a version change:
  1) gcc/10.2.0 => gcc/8.3.0

We have now switched from GCC 10.2.0 to GCC 8.3.0; running gcc --version again should report the new version.
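When module names are written into scripts, it can help to check whether each request names an explicit version or relies on the site default. The sketch below assumes the name/version form shown in the listings above; module_name and module_version are hypothetical helper names, not module subcommands.

```shell
# Hypothetical helpers: split an identifier such as "gcc/8.3.0"
# into its name (before the first /) and version (after it).
module_name() {
    printf '%s\n' "${1%%/*}"
}

module_version() {
    case $1 in
        */*) printf '%s\n' "${1#*/}" ;;        # explicit version given
        *)   printf '%s\n' "(site default)" ;; # no version: default is used
    esac
}

# Report how each request would be interpreted.
for request in gcc gcc/8.3.0 python/py36-anaconda-5.2.0; do
    printf '%-28s name=%s version=%s\n' \
        "$request" "$(module_name "$request")" "$(module_version "$request")"
done
```

A request without a version may resolve to something different if the site default changes, which is exactly the situation version pinning is meant to avoid.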

Using Software Modules in Scripts

Create a job that runs a version of Anaconda Python 3.6 and prints the output of python3 --version. Remember that running a job is just like logging on to the system: you should not assume that a module loaded on the login node is loaded on a compute node.

Solution

[yourUsername@vortex1[2] ~]$ nano python-module.sh
[yourUsername@vortex1[2] ~]$ cat python-module.sh
#!/usr/bin/env bash

module load python/py36-anaconda-5.2.0

python3 --version
[yourUsername@vortex1[2] ~]$ sbatch python-module.sh
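A slightly more defensive variant of this script is sketched below. It is one possible approach rather than a required pattern: it assumes the module name used in this lesson, starts from a clean environment, and falls back to a message if no module command is defined in the shell running the script.

```shell
#!/usr/bin/env bash
# Sketch of a more defensive job script. The module name is the one used
# in this lesson; replace it with a module that exists on your system.
wanted_module="python/py36-anaconda-5.2.0"

if command -v module >/dev/null 2>&1; then
    module purge                  # start from a clean environment
    module load "$wanted_module"  # load exactly the version we asked for
    module list                   # record what was loaded in the job output
else
    echo "module command not available; using the system default python3" >&2
fi

# Report which python3 will actually run, and its version.
python_path=$(command -v python3 || echo "none")
echo "python3 resolves to: $python_path"
if [ "$python_path" != "none" ]; then
    python3 --version
fi
```

Printing the resolved path and the list of loaded modules into the job output makes it easier to confirm later which software a job actually used.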

Key Points

  • Load software with module load softwareName.

  • Unload software with module unload softwareName, or unload everything with module purge.

  • The module system helps manage software versions, dependencies, and package conflicts for you.