Extending Docker containers for use with Singularity
This document describes how to take an existing Docker image, modify it and import it for running via Singularity on the Melbourne Bioinformatics systems.
Obtaining a container from Docker hub
Use docker pull on your local system to obtain the image.
$ docker pull tensorflow/tensorflow
Using default tag: latest
latest: Pulling from tensorflow/tensorflow
c62795f78da9: Pull complete
d4fceeeb758e: Pull complete
5c9125a401ae: Pull complete
0062f774e994: Pull complete
6b33fd031fac: Pull complete
52e18a0f2ca7: Pull complete
cf26e7f79a1f: Pull complete
f1d0b6192b60: Pull complete
d3cca787fa7c: Pull complete
24b58a5e905f: Pull complete
4ed0083b7815: Pull complete
f181e59dac06: Pull complete
Digest: sha256:51755c628e1a853f91b0574555efa70f327ffdcd7366449f87fed0066c8ef1f3
Status: Downloaded newer image for tensorflow/tensorflow:latest
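Because the latest tag moves over time, a later pull may fetch a different image from the one shown here. As a minimal sketch, the pull can be pinned to the digest reported in the output above. The helper name pull_pinned is hypothetical and only used for illustration; the repo@digest form itself is standard docker pull syntax.

```shell
# pull_pinned REPO DIGEST
# Pull an image by its content digest instead of a moving tag such as "latest".
# (pull_pinned is an illustrative helper name, not part of Docker.)
pull_pinned() {
    docker pull "$1@$2"
}

# Example call, using the digest reported in the pull output above:
# pull_pinned tensorflow/tensorflow \
#     sha256:51755c628e1a853f91b0574555efa70f327ffdcd7366449f87fed0066c8ef1f3
```

Pinning is optional, but it makes it easier to reproduce the same Singularity image later.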
You can list the images available on your system with docker images:
$ docker images
REPOSITORY              TAG      IMAGE ID       CREATED       SIZE
tensorflow/tensorflow   latest   2c520a260ba9   6 days ago    1.13GB
ubuntu                  16.04    0ef2e08ed3fa   7 weeks ago   130MB
Running Docker containers
Using docker run, you can start a container from an image. Here we start an interactive bash shell inside it:
$ docker run -it tensorflow/tensorflow bash
root@e2177a597a86:/notebooks#
docker ps will list the currently running containers.
$ docker ps
CONTAINER ID   IMAGE                   COMMAND   CREATED          STATUS          PORTS                NAMES
e2177a597a86   tensorflow/tensorflow   "bash"    54 seconds ago   Up 52 seconds   6006/tcp, 8888/tcp   sharp_archimedes
Modifying a Docker container
To add software to a Docker image, start from the Dockerfile it was built from.
The Dockerfile for tensorflow is kept in the tensorflow repository, under tensorflow/tools/docker.
You can use the following command to clone the tensorflow repository:
$ git clone https://github.com/tensorflow/tensorflow.git
Cloning into 'tensorflow'...
remote: Counting objects: 176622, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 176622 (delta 0), reused 0 (delta 0), pack-reused 176619
Receiving objects: 100% (176622/176622), 94.13 MiB | 5.64 MiB/s, done.
Resolving deltas: 100% (136073/136073), done.
Checking connectivity... done.
Modify the Dockerfile as needed. In this example, we add the h5py package:
$ cd tensorflow/tensorflow/tools/docker
$ vim Dockerfile
In this case, we ask python3 to install h5py. Comparing the edited file with the original shows the single added line:
$ diff Dockerfile Dockerfile.orig
36d35
< h5py==2.6.0 \
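For a change this small, an alternative to editing the upstream Dockerfile is a short derived Dockerfile that starts from the published image. The sketch below is not the approach used in this document. It assumes pip3 is available inside tensorflow/tensorflow, and the function name and the output directory are placeholders.

```shell
# write_derived_dockerfile DIR
# Write a minimal Dockerfile into DIR that layers h5py on top of the
# published image. Assumption: pip3 exists in the base image.
write_derived_dockerfile() {
    mkdir -p "$1" || return 1
    cat > "$1/Dockerfile" <<'EOF'
FROM tensorflow/tensorflow
RUN pip3 install h5py==2.6.0
EOF
}

# Example:
# write_derived_dockerfile ./tf-h5py
# docker build -t "$USER/tensorflow/tensorflow" ./tf-h5py
```

A derived Dockerfile avoids cloning the whole tensorflow repository, at the cost of not being able to change anything the upstream build does before the final image is produced.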
Once you've made edits to the Dockerfile, you can use docker build to rebuild the image.
$ docker build --pull -t $USER/tensorflow/tensorflow -f Dockerfile .
Once the build is complete, you should be able to see an image called <you>/tensorflow/tensorflow in the docker images listing:
$ docker images
REPOSITORY                    TAG      IMAGE ID       CREATED       SIZE
tensorflow/tensorflow         latest   2c520a260ba9   6 days ago    1.13GB
<you>/tensorflow/tensorflow   latest   9710bc4c0841   10 days ago   1.14GB
ubuntu                        16.04    0ef2e08ed3fa   7 weeks ago   130MB
Next, let's test whether your changes have taken effect. In this example, we added the h5py package in python3, so let's see if it's now present.
$ docker run -it <you>/tensorflow/tensorflow python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import h5py
>>>
This package wasn't present in the originally downloaded image:
$ docker run -it tensorflow/tensorflow python3
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import h5py
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'h5py'
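The same check can be scripted rather than typed into an interactive interpreter. This is a minimal sketch: image_has_module is a hypothetical helper name, and it relies only on python3 being runnable in the image, as it is in the sessions above.

```shell
# image_has_module IMAGE MODULE
# Return success if MODULE can be imported by python3 inside IMAGE.
# --rm removes the temporary container once the check has finished.
image_has_module() {
    docker run --rm "$1" python3 -c "import $2" >/dev/null 2>&1
}

# Example:
# image_has_module "$USER/tensorflow/tensorflow" h5py && echo "h5py present"
```

Because the result is an exit status, the check can be placed in a build script so that a missing package stops the process before the export step.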
Exporting a Docker container to Singularity format
Now that the Docker image is updated, we can export it to Singularity format. First, let's make sure the /hsm, /scratch and /vlsci directories are present in the image. This means those three filesystems, which are present on all Melbourne Bioinformatics clusters, can be mounted inside your Singularity container when it is running.
$ docker run -it <you>/tensorflow/tensorflow bash
root@ddf2c817a9d3:/notebooks# cd /
root@ddf2c817a9d3:/# mkdir hsm scratch vlsci
root@ddf2c817a9d3:/#
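The same directories can be created without an interactive shell. The sketch below assumes mkdir is on the image's PATH (it is used interactively above); the helper name and the container name tf_export are hypothetical. The container exits as soon as mkdir finishes, but its filesystem, including the new directories, is kept until the container is removed, and docker export accepts stopped containers as well as running ones.

```shell
# make_mount_points IMAGE NAME
# Start a container called NAME from IMAGE, create the mount points in it,
# and leave the (stopped) container behind for a later "docker export NAME".
make_mount_points() {
    docker run --name "$2" "$1" mkdir -p /hsm /scratch /vlsci
}

# Example:
# make_mount_points "$USER/tensorflow/tensorflow" tf_export
```

With a name chosen up front there is no need to look the container up with docker ps before exporting it. A stopped container is only listed by docker ps -a.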
Next, find the name of the running container. Leave the shell above open and, in a second terminal, run docker ps:
$ docker ps
CONTAINER ID   IMAGE                         COMMAND   CREATED         STATUS         PORTS                NAMES
ddf2c817a9d3   <you>/tensorflow/tensorflow   "bash"    6 minutes ago   Up 6 minutes   6006/tcp, 8888/tcp   priceless_bell
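When several containers are running, reading the name out of the table by eye is error-prone. As a sketch, docker ps can be asked to print only the names of containers started from a given image. The function name is hypothetical; --filter ancestor=... and --format are standard docker ps options.

```shell
# running_container_names IMAGE
# Print the names of running containers that were started from IMAGE,
# one per line.
running_container_names() {
    docker ps --filter "ancestor=$1" --format '{{.Names}}'
}

# Example:
# running_container_names "$USER/tensorflow/tensorflow"
```

If more than one name is printed, pick the container in which the directories were created.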
In this example the name of the running container is priceless_bell. Export this container's filesystem, including the directories just created, to a tarball:
$ docker export priceless_bell > tensorflow.tar
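Before importing, it can be worth checking that the tarball really contains the three mount points, since they exist only in the exported container and not in the image it was started from. A minimal sketch using plain tar; the function name is hypothetical, and entry names are normalised because tar implementations differ on leading "./" and trailing "/".

```shell
# tar_has_dirs TARBALL DIR...
# Succeed only if every DIR appears as a top-level entry in TARBALL.
tar_has_dirs() {
    tarball=$1
    shift
    [ -r "$tarball" ] || return 1
    # List entries once, dropping any leading "./" and trailing "/".
    entries=$(tar -tf "$tarball" | sed -e 's|^\./||' -e 's|/$||')
    for dir in "$@"; do
        printf '%s\n' "$entries" | grep -Fqx "$dir" || {
            echo "missing from $tarball: /$dir" >&2
            return 1
        }
    done
}

# Example:
# tar_has_dirs tensorflow.tar hsm scratch vlsci
```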
Next, create an empty Singularity image and import the tarball into it. In this instance we'll create a 2048 MiB image. You may need a larger or smaller image depending on what you are putting in it.
$ /path/to/singularity create -s 2048 tensorflow.img
Creating a new image with a maximum size of 2048MiB...
Executing image create helper
Formatting image with ext3 file system
Done.
$ /path/to/singularity import tensorflow.img tensorflow.tar
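Rather than guessing the image size, one option is to derive it from the size of the exported tarball. The sketch below rounds the tarball size up to whole MiB and adds headroom for filesystem overhead. The 25% default is an assumption made here for illustration, not a figure from Singularity or Melbourne Bioinformatics, and the function name is hypothetical.

```shell
# suggest_image_mib TARBALL [HEADROOM_PERCENT]
# Print a size in MiB suitable for "singularity create -s": the tarball size
# rounded up to a whole MiB, plus headroom (default 25%), rounded up again.
suggest_image_mib() {
    bytes=$(wc -c < "$1") || return 1
    headroom=${2:-25}
    mib=$(( (bytes + 1048575) / 1048576 ))
    echo $(( (mib * (100 + headroom) + 99) / 100 ))
}

# Example:
# /path/to/singularity create -s "$(suggest_image_mib tensorflow.tar)" tensorflow.img
```

If the import fails for lack of space, recreate the image with a larger headroom percentage and import again.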
At this stage you should have a Singularity-format container, tensorflow.img, which may be uploaded to a Melbourne Bioinformatics cluster and used for real jobs.
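Before uploading, the image can be given the same import test that was run against the Docker image. This sketch assumes the installed Singularity provides the exec subcommand; the function name and the SINGULARITY variable are hypothetical conveniences for pointing at /path/to/singularity.

```shell
# singularity_has_module IMAGE_FILE MODULE
# Return success if python3 inside the Singularity image can import MODULE.
# SINGULARITY may be set to the full path of the singularity binary.
singularity_has_module() {
    "${SINGULARITY:-singularity}" exec "$1" python3 -c "import $2" >/dev/null 2>&1
}

# Example:
# SINGULARITY=/path/to/singularity singularity_has_module tensorflow.img h5py \
#     && echo "h5py present in tensorflow.img"
```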
If you have any questions about Singularity, the Melbourne Bioinformatics team is always able to help.