What is Shifter?
Shifter is a tool to enable use of Docker containers on a Linux cluster.
What is Docker?
Docker is a platform that employs features of the Linux kernel to run software in a container. The software housed in a Docker container is not just a single program but an entire OS distribution, or at least enough of the OS to enable the program to work.
Docker can be thought of as a software distribution mechanism, somewhat like yum or apt. It can also be thought of as an expanded version of a chroot jail, or a reduced version of a virtual machine.
Read more about Docker on the official web site.
Docker vs Shifter
There are some important differences between Docker and Shifter.
- Docker containers do not run directly via Shifter and must first be converted. This is done with the 'shifterimg' mechanism (see below).
- Docker containers usually run as root. However, as a Melbourne Bioinformatics (formerly VLSCI) user, you have a regular user account which does not have root privileges. Shifter allows for this by removing any elements of Docker containers which can only run as root. The resulting containers are therefore able to run as a regular user.
- Docker containers may be edited whilst they are running, but, because they do not run as root, Shifter containers may not. For instance if you run a Docker container of Ubuntu, you may use apt to install extra packages whilst the container is active. You cannot do this with Shifter, so you need to make sure the container you are using is complete.
Docker Hub is a service run by Docker to distribute and share Docker containers.
A lot of scientific software and languages are available on Docker Hub.
Creating a Docker container
This guide does not cover how to build your own Docker containers. If you wish to do this, you could start with the Build your own image guide from the Docker web site. This guide provides an example of how to extend an existing Ubuntu-based image to add extra software.
You may use such a process to extend a container found on Docker Hub to add your extra requirements, such as Infiniband libraries for MPI (see below). Once your container is complete, you can upload it to Docker Hub in order to access it from Melbourne Bioinformatics.
Shifter at Melbourne Bioinformatics
Shifter has been installed on Melbourne Bioinformatics's Intel-based clusters.
To use it, you will need to load the Shifter module:
module load shifter
Once loaded there are two extra binaries added to your
shifterimg has three modes.
images will list the Docker containers, known as "images", which have already been "pulled", i.e. downloaded.
This will show entries like the following:
$ shifterimg images
VLSCI docker READY 50475a1caf 2016-09-05T18:44:54 perl:latest
VLSCI docker READY 95b04ce633 2016-09-05T19:15:10 r-base:latest
VLSCI docker READY 65e1e9d1a1 2016-08-22T18:03:28 ubuntu:latest
The columns are, left to right:
- Image cache name. Melbourne Bioinformatics has one image cache, called "VLSCI".
- Image source. This shows the origin of the pulled image. In this example all the images listed are from the Docker Hub.
- State. This will say "READY" if the image is ready for use.
- Hash. This shows the abbreviated sha256 hash for the image, which is a unique identifier for the specific image and version.
- Timestamp. The time the image was last updated.
- Image. This shows the name of the image and its version (which can be "latest"), separated by a colon.
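The column layout described above lends itself to simple scripting. The following is a minimal sketch, not part of Shifter itself, that reads a listing in the format shown and reports the state of one image. The function name image_state is hypothetical, and the sample listing is copied from the example above; on the cluster you would pipe the real output of shifterimg images into the function instead.

```shell
#!/bin/bash
# Hypothetical helper: print the State column (3rd field) for the image
# whose name:version (6th field) matches the first argument.
# Reads a listing in the `shifterimg images` format from standard input.
# Exits non-zero if no line matches.
image_state() {
    awk -v want="$1" '$6 == want { print $3; found = 1 } END { exit !found }'
}

# Sample listing copied from the example output above.
listing='VLSCI docker READY 50475a1caf 2016-09-05T18:44:54 perl:latest
VLSCI docker READY 95b04ce633 2016-09-05T19:15:10 r-base:latest
VLSCI docker READY 65e1e9d1a1 2016-08-22T18:03:28 ubuntu:latest'

state=$(printf '%s\n' "$listing" | image_state r-base:latest)
echo "r-base:latest is ${state}"
```

Because the function exits non-zero when the image is not listed, it can also be used in an if statement to decide whether a pull is needed.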
lookup will give the full 64-character sha256 hash for an image, or when used with the -v option, it will give further details about the image.
pull will pull an image from Docker Hub and store it in the VLSCI cache, ready to use with shifter. For example, to pull the latest version of R:
shifterimg pull docker:r-base:latest
The 'docker:' prefix means the image will be pulled from Docker Hub (https://hub.docker.com/), so you can pull any container that has been published there. In future, other sources may be added.
When pulling an image you will see output like the following:
$ shifterimg pull docker:r-base:latest
2016-09-06T13:44:40 Pulling Image: docker:r-base:latest, status: PULLING
The last word will change from
CONVERSION and finally
An advantage of Shifter for Melbourne Bioinformatics users is that you do not need to request the installation of software which has been published on Docker Hub. You are free to pull it yourself.
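Since the status is reported as the last word of the pull output line, a script can pick it out. This is a minimal sketch that assumes the line format shown in the pull example above; pull_status is a hypothetical name, and the sample line is copied from this page rather than produced by a live pull.

```shell
#!/bin/bash
# Hypothetical helper: print the status word from a `shifterimg pull`
# progress line, i.e. whatever follows "status: ".
# Reads the line from standard input; prints nothing if there is no match.
pull_status() {
    sed -n 's/^.*status: \([A-Za-z_]*\).*$/\1/p'
}

# Sample line copied from the example output above.
line='2016-09-06T13:44:40 Pulling Image: docker:r-base:latest, status: PULLING'

status=$(printf '%s\n' "$line" | pull_status)
echo "current status: ${status}"
```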
If you want to run a container for inspection, you may do the following:
shifter --image=<imagename>:<version> <entrypoint>
shifter --image=r-base:latest R
This will run the R binary from inside the image containing the latest version of R. You can then use R as normal:
$ module load shifter
$ shifter --image=r-base:latest R
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> cat("Hello world\n")
Hello world
> quit()
Save workspace image? [y/n/c]: n
$
However, Shifter is not intended for use on the login nodes. Just as you run regular jobs via SLURM, shifter jobs need to be run through SLURM to take advantage of the resources of Melbourne Bioinformatics's clusters.
Shifter and SLURM
Shifter adds another option you may pass to SLURM: --image. This informs SLURM that you want to use a particular image, so SLURM will ensure it is set up on the compute node(s) that execute your job.
Here is an example sbatch script, which we will call
#!/bin/bash
#SBATCH --image=docker:r-base:latest
#SBATCH --nodes=1
#SBATCH --partition=main
module purge
module load shifter
echo 'cat("Hello world\n")' | shifter R --no-save
When submitted, we get:
$ sbatch Rshifter.sbatch
Submitted batch job 729897
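The job number in this reply also determines the name of the job's output file, since SLURM writes to slurm-<jobid>.out by default. The following minimal sketch extracts the number from the reply; job_id is a hypothetical name, and the sample reply is copied from the example above rather than produced by a live submission.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of sbatch's
# "Submitted batch job <id>" reply, read from standard input.
job_id() {
    awk '/^Submitted batch job [0-9]+$/ { print $4 }'
}

# Sample reply copied from the example above; on the cluster this
# would be the output of: sbatch Rshifter.sbatch
reply='Submitted batch job 729897'

id=$(printf '%s\n' "$reply" | job_id)
outfile="slurm-${id}.out"   # SLURM's default output file name
echo "job output will be written to ${outfile}"
```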
Once the job is complete, we can see the output:
$ cat slurm-729897.out
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> cat("Hello world\n")
Hello world
>
So we have just successfully pulled R from Docker Hub and used it inside a SLURM job.
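The same pattern should carry over to other images. As a sketch only, the function below writes an sbatch script modelled on the R example above for a given image and command. The function name write_shifter_job and the file name example.sbatch are hypothetical, and the partition name main is simply taken from the example.

```shell
#!/bin/bash
# Hypothetical helper: write an sbatch script that runs a command inside
# a Shifter image. Arguments: output file, image (name:version), command.
write_shifter_job() {
    local file=$1 image=$2 cmd=$3
    cat > "$file" <<EOF
#!/bin/bash
#SBATCH --image=docker:${image}
#SBATCH --nodes=1
#SBATCH --partition=main

module purge
module load shifter
shifter ${cmd}
EOF
}

# Generate a script for the perl image and show what was written.
write_shifter_job example.sbatch perl:latest "perl -v"
cat example.sbatch
```

The generated file can then be submitted with sbatch in the same way as the R example.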
MPI jobs in Shifter containers
To use MPI inside a Shifter container, the container must have Infiniband libraries installed.
If you are building a Debian or Ubuntu based container you can use the
libmlx4-1 package, which also requires the
For a Red Hat based container you can use the
Interaction with Melbourne Bioinformatics resources
Shifter containers run at Melbourne Bioinformatics are supplied with all Melbourne Bioinformatics user and group accounts. Processes started inside a container will be running as your user and that user will be a member of all your current groups.
Shifter containers mount the three Melbourne Bioinformatics filesystems, /scratch, /vlsci and /hsm, so all are available for any tasks you wish to perform inside a container. When you start an executable inside a container its working directory will be your home directory (/vlsci/
Executables running inside your container are able to read from and write to any files in those three filesystems for which you have read or write permission.
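Because a process inside a container has the same permissions as your user, checking a path from the shell gives an indication of what software in a container could do to it. This is a minimal sketch using standard shell tests; access_summary is a hypothetical name, and the temporary file stands in for a real path under /scratch, /vlsci or /hsm.

```shell
#!/bin/bash
# Hypothetical helper: report whether the current user can read and/or
# write a path. Per the text above, a process in a Shifter container runs
# as the same user, so it would have the same access.
access_summary() {
    local path=$1 perms=""
    [ -r "$path" ] && perms="read"
    [ -w "$path" ] && perms="${perms:+$perms,}write"
    echo "${path}: ${perms:-none}"
}

# Demonstration on a throwaway file; substitute a real data path.
tmp=$(mktemp)
chmod u=r,go= "$tmp"      # remove write permission for the owner
access_summary "$tmp"
rm -f "$tmp"
```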
WARNING: As all containers will mount the /vlsci, /hsm and /scratch directories, software inside any container may be able to destroy any data on these filesystems for which you have write permission. Containers are pulled from Docker Hub and we have no way to tell whether they will destroy your data. A container which does destroy data may be malicious or may simply contain bugs. Proceed with caution.
If you have any questions about shifter, the Melbourne Bioinformatics team is always able to help.