Bioconda, simplifying software installation for bioinformaticians and life scientists
25 October 2017
Thanks to the global efforts of over 250 contributors, including our own Simon Gladman, bioinformaticians and life scientists now have access to ‘Bioconda’, a system for building and managing software packages designed for bioinformatics. This work is now documented in a preprint on bioRxiv.
A common problem in computing, and in data science especially, is ‘dependency hell’: the software you want to install is not compatible with your operating system, the versions you have installed, your system set-up and so on. The result is an environment in which the compilation requirements of the underlying systems often compete with one another. In more mature fields of computer programming, packaging systems like Conda have been developed to overcome this problem: someone makes their software available using a ‘Conda recipe’, which describes the software, where to find it, which dependencies it needs to build and to run, and some basic scripting to install it. The ‘recipe’ is then added to the Conda repository system, where it is automatically ‘built’ into installable packages for various operating systems and hardware and stored in a fully supported, global repository.
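As a rough illustration of what such a recipe carries, the sketch below models the four pieces named above (the software, where to find it, its build and run dependencies, and an install script) and prints them as text. The layout loosely follows the `meta.yaml` plus `build.sh` convention used by conda-build; the tool name `mytool`, its URL and its dependencies are made up for the example, and real recipes usually contain more fields (checksums, tests, licence details).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Recipe:
    """The pieces of information a recipe carries (simplified)."""
    name: str                 # what the software is called
    version: str              # which release is being packaged
    source_url: str           # where to find the source code
    build_requirements: List[str] = field(default_factory=list)  # needed to build
    run_requirements: List[str] = field(default_factory=list)    # needed to run
    install_script: str = ""  # basic scripting that installs the tool


def render_meta_yaml(recipe: Recipe) -> str:
    """Render the metadata part of the recipe as YAML-style text."""
    lines = [
        "package:",
        f"  name: {recipe.name}",
        f'  version: "{recipe.version}"',
        "source:",
        f"  url: {recipe.source_url}",
        "requirements:",
        "  build:",
    ]
    lines += [f"    - {dep}" for dep in recipe.build_requirements]
    lines.append("  run:")
    lines += [f"    - {dep}" for dep in recipe.run_requirements]
    return "\n".join(lines) + "\n"


# Hypothetical tool: the name, URL and dependencies are placeholders.
example = Recipe(
    name="mytool",
    version="1.0.0",
    source_url="https://example.org/mytool-1.0.0.tar.gz",
    build_requirements=["make"],
    run_requirements=["perl"],
    # Conventionally saved as build.sh next to meta.yaml.
    install_script="make install PREFIX=$PREFIX\n",
)

print(render_meta_yaml(example))
print(example.install_script)
```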
Bioconda extends Conda into the life sciences. In addition to making bioinformatics software installation much easier, it improves the reproducibility of analyses by allowing users to create isolated environments with fixed software versions, all of which are easily installed and managed without the need for administrative privileges.
It improves on other packaging systems by offering the option of installing each tool in its own sandboxed environment, so that it doesn’t interfere with any other installed software. In addition, every tool put into the repository automatically has a Docker container built for it.
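To make the idea of an isolated environment with fixed versions concrete, here is a minimal sketch that assembles a `conda create` command line. The environment name `annotation` and the version pin are illustrative only, and passing the `conda-forge` and `bioconda` channels on the command line is an assumption about how your conda installation is set up. The code only builds and prints the command; it does not run conda.

```python
import shlex
from typing import Dict, List

# Channels searched for packages; assumed here, adjust to your own set-up.
CHANNELS = ["conda-forge", "bioconda"]


def create_env_command(env_name: str, pinned: Dict[str, str]) -> List[str]:
    """Build a `conda create` command for a named environment with pinned tools."""
    cmd = ["conda", "create", "--yes", "--name", env_name]
    for channel in CHANNELS:
        cmd += ["--channel", channel]
    # "tool=version" asks conda for that specific version of the tool.
    cmd += [f"{tool}={version}" for tool, version in pinned.items()]
    return cmd


# Illustrative pin; choose whichever tool and version your analysis needs.
command = create_env_command("annotation", {"prokka": "1.12"})
print(shlex.join(command))
```

Recording the command (or the pins it contains) alongside an analysis is one simple way to let someone else recreate the same environment later.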
Simon Gladman says,
The Bioconda project is very well organised with contributions to the repository via pull request and code review before merging. I’ve added roughly 30 packages to the Conda ecosystem (out of ~2500) since I started working with it, including our Microbial Genomics group’s most popular ones like Velvet Optimiser, Prokka and Snippy. I’ve also added tools I use a lot like Roary and Gubbins (Sanger Pathogens group). To progress the project we have held hackathons all over the world, the last one at the 2017 Galaxy conference in Montpellier.
About 2 years ago, the Galaxy project decided to experiment with using Conda and Bioconda as their preferred method of tool installation and they’ve now formally adopted it as standard. The latest version of Australia’s Genomics Virtual Laboratory (GVL) uses Bioconda to handle tool installations for Galaxy in the GVL and we’ve started working on ways to supply command line versions of the tools also.
An alternative packaging system Torsten Seemann contributes to is Homebrew Science. It pre-dates Bioconda and inspired many of the package formulae now employed in Bioconda.
January 2018 workshop
We have invited two Bioconda and Galaxy experts, Saskia Hiltemann (Erasmus University, The Netherlands) and Eric Rasche (Freiburg University, Germany), to run a Bioconda/Galaxy tool wrapping tutorial and workshop to help us build Australia’s capability in, and contributions to, this great community project.
Register your interest in this workshop with Christina Hall.
Bioconda consists of:
- a repository of recipes hosted on GitHub
- a build system that turns these recipes into conda packages
- a repository of >2700 bioinformatics and other packages ready to use with ‘conda install’
- over 250 contributors that add, modify, update and maintain the recipes
Follow the project on Twitter: #bioconda
Watch a 30-minute webinar from ELIXIR on Bioconda and Biocontainers by Björn Grüning (ELIXIR Germany).
Install your software using the conda system: after installing a conda system such as Miniconda, try ‘conda install <bioinformatics tool>’.
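If you want to script the installation, a small wrapper along the following lines could work. It is a sketch rather than an official interface: the function names are hypothetical, the channels are passed explicitly on the assumption that they are not already configured, and by default it only prints the command it would run.

```python
import shutil
import subprocess
from typing import List


def install_command(tool: str) -> List[str]:
    """Build a `conda install` command for one tool (channels are assumed)."""
    return ["conda", "install", "--yes",
            "--channel", "conda-forge", "--channel", "bioconda", tool]


def install(tool: str, dry_run: bool = True) -> List[str]:
    """Print the command, or run it when dry_run is False and conda is on PATH."""
    cmd = install_command(tool)
    if dry_run:
        print("would run:", " ".join(cmd))
    elif shutil.which("conda") is None:
        raise RuntimeError("conda not found on PATH; install Miniconda first")
    else:
        subprocess.run(cmd, check=True)
    return cmd


install("snippy")  # dry run: prints the command without installing anything
```

Call `install("snippy", dry_run=False)` once you are happy with the printed command.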