# Singularity recommendations

## 1. Singularity configuration at Finis Terrae II

Several Singularity versions are currently installed at FT2 and available to all users through the module system. All versions share the same settings, detailed in this section, in order to provide transparent access to Singularity containers. It is important to note that Singularity has been installed on a shared device, which requires setting the `localstatedir` configuration argument during installation.

Singularity has been installed in privileged (SUID) mode in order to make all of its features available. Unprivileged containers, which run only in a “user namespace”, have limited features, so that configuration is not used at the moment.

There are no limitations or restrictions on user IDs or on the storage paths where Singularity images may be stored. This means that any user with read and execute permissions can use any Singularity image located in any directory.

Singularity has been configured to allow users to bind custom host directories inside the container. However, since the FT2 kernel does not support the `overlay` feature, the destination directory of a bind mount must already exist inside the container; otherwise, the bind mount will fail.

The Singularity installation is configured to automatically share some common Linux directories and files from the host with the container. These directories are `/proc`, `/sys`, `/dev`, `/tmp`, and `/home`. Some files, such as `/etc/resolv.conf`, `/etc/hosts`, and `/etc/localtime`, are also automatically mounted inside the container to import the host network and time settings.

Finally, the current user and group information is automatically added to the `/etc/passwd` and `/etc/group` files. This means the user and group IDs are exactly the same inside and outside the container.

## 2. Good practices using Singularity at Finis Terrae II

This section describes some instructions and good practices for building Singularity containers. They provide the keys to building valid containers and avoiding common issues, making the containers completely transparent both for users and for the infrastructure, and easing the use of Singularity and containerized software at Finis Terrae II; they can also be applied to other HPC systems.

The first rule is to provide a valid Singularity container. Within the container, an entire Linux distribution or a very lightweight, tuned set of packages can be included, preserving the usual Linux directory hierarchy. It is recommended to set the required environment variables within the container in order to expose a consistent environment.
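As an illustration, a minimal bootstrap definition file following this rule might look like the sketch below. The base image, package list, and application path are placeholders, not FT2 requirements:

```
# minimal-example.def — a hypothetical bootstrap definition sketch
Bootstrap: docker
From: centos:7

%post
    # Install only the packages the application actually needs
    yum -y install epel-release
    yum -y install python36

%environment
    # Expose a consistent environment inside the container
    export LC_ALL=C
    export PATH=/opt/myapp/bin:$PATH
```

The `%environment` section is evaluated at run time, so variables defined there are visible to any command executed inside the container.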

The FT2 kernel version is currently 2.6.32 (September 2018). Newer operating systems such as Debian Buster and Ubuntu Bionic require a more recent kernel and are not compatible with the FT2 kernel.

To get transparent access to the host e-infrastructure storage from the containers (here we use Finis Terrae II as an example), the `/mnt` and `/scratch` directories must exist within the container so that they can be shared with the host. This allows the container to remain consistent with the host configuration, environment variables, etc.

Applications within a Singularity image must not be installed in any of the automatically mounted devices or directories. Otherwise, the host mounts will shadow them and the applications will be hidden from the end user.
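A common convention that avoids this problem is to install containerized applications under a path such as `/opt` or `/usr/local`, which is not automatically mounted. A hypothetical `%post` fragment (the application name and path are illustrative):

```
%post
    # /home, /tmp, /proc, /sys and /dev are auto-mounted from the host,
    # so anything installed there would be shadowed at run time.
    # Install under /opt instead (hypothetical application path):
    mkdir -p /opt/myapp
    ./install-myapp.sh --prefix=/opt/myapp
```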

To run parallel applications across multiple nodes with MPI, the container must have MPI and PMI support installed. Likewise, to take advantage of HPC resources such as InfiniBand networks or GPUs, the container must support them. This means that the container must have the proper libraries installed to communicate with the hardware and to perform inter-process communication.

• In the case of InfiniBand support, there are no known restrictions on the InfiniBand libraries installed inside the container.

• In the particular case of using GPUs from a Singularity container, the container must have `nvidia-smi` installed. Singularity provides GPU container portability through the experimental NVIDIA support option, which allows containers to automatically use the host drivers.

• Regarding MPI, due to Singularity’s hybrid MPI approach, it is mandatory to use the same MPI implementation and version on the host and inside the container to run MPI parallel applications. The corresponding `mpirun` or `mpiexec` launcher must also be used as the process manager, instead of `srun` (the Slurm default), to ensure PMI compatibility. Both OpenMPI and Intel MPI implementations are supported and have been tested.
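For example, if the host module provides OpenMPI, the container can build the same version from source in `%post`. This is only a sketch: the version number is a placeholder and must be replaced by the exact version of the module loaded on the host:

```
%post
    # Build the same OpenMPI version as the host module (placeholder version)
    yum -y install gcc gcc-c++ make wget tar
    wget https://download.open-mpi.org/release/open-mpi/v2.1/openmpi-2.1.1.tar.gz
    tar xzf openmpi-2.1.1.tar.gz
    cd openmpi-2.1.1
    ./configure --prefix=/usr/local
    make -j 4 install
```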

Other alternatives were explored to provide portability of MPI Singularity containers, but they are not recommended due to their complexity or lack of generality. These alternatives are bind-mounting the host MPI inside the container, and using PMIx > 2.1 as the default process manager together with OpenMPI >= 3.0.0.

This set of recommendations is summarized in two publicly available documents: the recommendations document itself [4] and the bootstrap definition templates [5].

## 3. Detailed recommendations for building Singularity containers

### 3.1. Storage devices

#### 3.1.1. Build

In order to get access to the Finis Terrae II storage devices, we encourage you to check for the existence of the `/mnt` directory inside the container. If this directory does not exist, you can create it.

Although `/mnt` is a default directory in the Linux filesystem hierarchy, we recommend adding the command `mkdir -p /mnt` to the `%post` section of your bootstrap definition file.
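In a bootstrap definition file this looks as follows; `mkdir -p` is harmless if the directory already exists:

```
%post
    # Ensure the bind-mount target for FT2 storage exists
    mkdir -p /mnt
```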

#### 3.1.2. Usage

To get transparent access to the Finis Terrae II storage devices while using Singularity, our recommendation is to bind-mount the `/mnt` directory inside the container.

```
$ module purge
$ module load singularity/$VERSION
$ singularity exec -B /mnt $CONTAINER $COMMAND
```

### 3.2. Scratch directory

#### 3.2.1. Build

In order to get access to the `/scratch` directories of the compute nodes, we encourage you to create this directory inside the container.

We recommend adding the command `mkdir -p /scratch` to the `%post` section of your bootstrap definition file.

#### 3.2.2. Usage

It is recommended to use a scratch folder to store temporary data while executing on a Finis Terrae II compute node.

The most transparent strategy is to bind-mount the `/scratch` host directory into the `/scratch` container directory.

```
$ module purge
$ module load singularity/$VERSION
$ singularity exec -B /scratch $CONTAINER $COMMAND
```

Another alternative is to bind-mount the `/scratch` host directory onto another temporary directory, such as `/tmp`.

By default at Finis Terrae II, the `/scratch` directory is the place where the OpenMPI session files are stored. When launching MPI applications, if this directory does not exist or is not writable, you must specify another one using the `TMPDIR` environment variable.
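That fallback can be sketched as a small shell snippet; it assumes nothing about the cluster beyond the two paths involved:

```shell
# Pick a writable session directory: prefer /scratch, fall back to /tmp
if [ -d /scratch ] && [ -w /scratch ]; then
    export TMPDIR=/scratch
else
    export TMPDIR=/tmp
fi
echo "OpenMPI session files will go to: $TMPDIR"
```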
```
$ module purge
$ module load singularity/$VERSION
$ export TMPDIR=/tmp
$ singularity exec -B /scratch:/tmp $CONTAINER $COMMAND
```

### 3.3. Shared-Memory containers

#### 3.3.1. Build

There are no known restrictions for running shared-memory applications from a Singularity container.

#### 3.3.2. Usage

```
$ module purge
$ module load singularity/$VERSION
$ singularity exec -B /scratch -B /mnt $CONTAINER $COMMAND
```

### 3.4. MPI containers

#### 3.4.1. Usage

As described in Section 2, the host and container MPI versions must match, and the application must be launched through the corresponding `mpirun` or `mpiexec`:

```
$ module purge
$ module load $COMPILER $MPI_VERSION
$ module load singularity/$VERSION
$ mpirun $ARGS singularity exec -B /scratch -B /mnt $CONTAINER $COMMAND
```

### 3.5. GPU containers

#### 3.5.1. Build

In the particular case of using GPUs from a container, the contained NVIDIA driver must exactly match the NVIDIA driver installed on the host. There are several alternatives for getting the right NVIDIA driver within the container:

• Install it persistently inside the container.

• Bind-mount the host driver inside the container.

In both cases, `nvidia-smi` must be installed inside the container.

The main drawback of a persistent installation is the lack of portability, as the same container cannot be used on another host with a different NVIDIA driver version.

#### 3.5.2. Usage

Singularity provides the `--nv` option to automatically bind-mount the NVIDIA drivers (experimental NVIDIA support).

Please ensure that you are on a GPU compute node before running your GPU containers.

```
$ module purge
$ module load singularity/$VERSION
$ mpirun singularity exec --nv -B /scratch -B /mnt $CONTAINER $COMMAND
```
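One way to sanity-check the driver match is to query the driver version both on the host and through the container’s `nvidia-smi`; both commands should report the same version. This is a hypothetical session, with `$CONTAINER` standing in for your image:

```
$ nvidia-smi --query-gpu=driver_version --format=csv,noheader
$ singularity exec --nv $CONTAINER nvidia-smi --query-gpu=driver_version --format=csv,noheader
```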

### 3.6. Sample recipes

Some sample bootstrap definition templates are stored in this GitHub repository.
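As a starting point, a template that follows the recommendations above might look like this sketch; the base image and package choices are illustrative only:

```
Bootstrap: docker
From: centos:7

%post
    # Create the bind-mount targets required at Finis Terrae II
    mkdir -p /mnt /scratch
    # Install the application dependencies here (illustrative)
    yum -y install which wget

%environment
    export LC_ALL=C

%runscript
    # Forward all arguments to the contained command
    exec "$@"
```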