
Since 2005, several container solutions have emerged: Docker, Rkt, Singularity, uDocker, Shifter, CharlieCloud, Kata Containers, etc. Among these, Docker has driven the community by providing a complete ecosystem: the Docker daemon, client, Hub and private Registry, Compose, Swarm, Machine, etc. Docker is evolving rapidly and offers a user-friendly environment for deploying microservices. The microservice concept has been quickly adopted by industry and the web community because it gives a simple description of the system architecture. Furthermore, Docker containers can run on many kinds of infrastructure: bare-metal systems, local computers, cloud virtual machines (VMs), and container clusters managed by an orchestrator.

Docker’s paradigm represents a container as a lightweight VM that should host a single service. But this idea of containers as miniature VMs is the wrong approach for the HPC community. Indeed, the microservice model does not fit computing clusters, whose security constraints and working methods differ from the typical Docker use case. For example, on computing clusters, and particularly on supercomputers, containers have to run in unprivileged mode, with permissions similar to those of the user’s home directory on the computing infrastructure. Docker breaks this rule, because processes in a Docker container can run as root. Moreover, software solutions should be ready to deploy on existing infrastructure, or should be easy to deploy without interfering with existing systems. Docker relies on a daemon and a client, which can create overhead and/or installation problems depending on the infrastructure. Furthermore, software solutions must take advantage of the hardware capabilities and low-level libraries (e.g. CUDA/cuDNN for GPUs, MPI, etc.). Finally, the Docker container format is not easily portable, whereas scientists are looking for simple ways to share and publish codes and calculations and to ensure their reproducibility. For all these reasons, Docker is not the optimal solution for executing scientific applications in an HPC environment. Fortunately, other solutions now exist that keep the benefits of containerization while adapting to the scientific environment.

Several criteria have been chosen to compare container solutions for HPC applications:

  • Compatibility with Docker: as Docker is the most established container solution and is already used in a large number of projects, other technologies have to be compatible with it;
  • Security: allowing unprivileged mode;
  • I/O: transparent I/O that is compatible with MPI processes and X11 graphical export;
  • Scheduler: native integration with schedulers, so that container images can be submitted and executed as jobs on a computing cluster;
  • GPU: easy GPU integration in containers;
  • Mobility of computing: ability to define, create and maintain a workflow and be confident that the workflow can be executed on different hosts, operating systems (as long as they are Linux) and service providers. This characteristic can also be managed by workflow management systems;
  • Single filesystem: containers run a single image file which is the complete representation of all the files within the container. This feature, which facilitates mobility, also facilitates the reproducibility of computations.

The evaluation showed that Singularity is the best compromise for the time being. It can be installed easily on a local cluster and in a computing center (such as CC-IN2P3, a High Throughput Computing (HTC) center) and can be used without privileged mode. Singularity is the most widely used container solution in HPC centers and also has the largest user community: on supercomputers (the original use case), on grid infrastructure (LCG), in HTC and HPC computing centers, and on the cloud (in order to build Singularity images for Mac and Windows users). It is also fully compatible with Continuous Integration (CI), an important feature for a large community of researchers. Note that all of these solutions are evolving quickly. In the near future, other technologies may become good alternatives to Singularity (see CharlieCloud for instance), and upcoming solutions look very promising because of the flexibility they will offer.
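
To make some of the criteria above more concrete, here is a minimal Python sketch of how they might map to Singularity command lines: Docker compatibility (`singularity pull` from a `docker://` reference into a single image file), GPU integration (the `--nv` option of `singularity exec`), and MPI (launching the container through `mpirun` on the host). The helper functions, image names and paths are hypothetical and chosen only for illustration; the sketch only builds the command lines and does not run them, and it is not taken from the evaluation itself.

```python
import shlex


def pull_from_docker(image_ref, sif_path):
    """Docker compatibility: convert a Docker image into a single image file."""
    return ["singularity", "pull", sif_path, f"docker://{image_ref}"]


def exec_in_container(sif_path, command, gpu=False, binds=()):
    """Run a command inside the container as the calling (unprivileged) user.

    gpu=True adds --nv, which exposes the host NVIDIA driver and devices.
    binds is a sequence of (host_path, container_path) pairs.
    """
    argv = ["singularity", "exec"]
    if gpu:
        argv.append("--nv")
    for src, dest in binds:
        argv += ["--bind", f"{src}:{dest}"]
    argv.append(sif_path)
    argv += list(command)
    return argv


def mpi_launch(n_ranks, container_argv):
    """MPI: the host mpirun starts one container process per rank."""
    return ["mpirun", "-n", str(n_ranks)] + list(container_argv)


def as_shell(argv):
    """Render an argument list as a shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical image names and paths, for illustration only.
    print(as_shell(pull_from_docker("ubuntu:22.04", "ubuntu.sif")))
    print(as_shell(exec_in_container("tf.sif", ["python3", "train.py"],
                                     gpu=True, binds=[("/scratch", "/data")])))
    print(as_shell(mpi_launch(4, exec_in_container("app.sif", ["./solver"]))))
```

Because each helper returns a plain argument list, the resulting command could be pasted into a batch script for the cluster scheduler or passed to `subprocess.run` on a machine where Singularity is installed.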