RNA-Seq Differential Gene Expression Analysis using Galaxy and the GVL
In this tutorial we cover the concepts of RNA-seq differential gene expression (DGE) analysis using a simulated dataset from the common fruit fly, Drosophila melanogaster.
Genome Browsers and using UCSC Genome Browser tools
This workshop will provide an overview of different types of genome browsers and a detailed hands-on tutorial on the UCSC Genome Browser, illustrating a number of its tools. We will use example data to demonstrate how to make and upload custom tracks on the UCSC Genome Browser, and how to use the Table Browser to retrieve the genomic positions of targets of interest for a custom BED file.
You will also learn how to load and view publicly available RNA-seq data in the Integrative Genomics Viewer (IGV, Broad Institute) and visualise the individual reads of NGS data.
Introduction to Variant Calling with Galaxy & the GVL
The tutorial is designed to introduce the tools, datatypes and workflow of variant detection. We will align reads to the genome, look for differences between reads and reference genome sequence, and filter the detected genomic variation manually to understand the computational basis of variant calling.
In this tutorial we cover the concepts of detecting small variants (SNVs and indels) in human genomic DNA using a small set of reads from chromosome 22.
At the end of this course you will be able to:
- Work with the FASTQ format and base quality scores
- Align reads to generate a BAM file and subsequently generate a pileup file
- Run the FreeBayes variant caller to find SNVs and indels
- Visualise BAM files using the Integrative Genomics Viewer (IGV) and identify likely SNVs and indels by eye
This is a hands-on workshop and attendees should bring their own laptops.
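As a small illustration of the base quality scores mentioned in the objectives above, the following is a minimal sketch of decoding a FASTQ quality line in Python. It assumes the common Phred+33 encoding; the record itself and the helper names `phred_scores` and `error_probability` are made up for illustration and are not part of the workshop material.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# A made-up four-line FASTQ record: header, sequence, separator, qualities.
record = ["@read1", "GATTACA", "+", "IIII5+!"]
header, sequence, _, quality = record

scores = phred_scores(quality)

# Print each base with its score and the implied error probability.
for base, q in zip(sequence, scores):
    print(base, q, f"{error_probability(q):.4f}")
```

A score of 40 corresponds to a 1 in 10,000 chance that the base call is wrong, and a score of 20 to a 1 in 100 chance, which is why low-scoring bases at the ends of reads are often trimmed or down-weighted before variant calling.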
Introduction to long-read genome assembly
New genome sequencing technologies are producing much longer reads. This workshop combines an introductory presentation on the theory of de novo genome assembly with hands-on practice. We will use a cut-down data set of bacterial FASTQ reads from PacBio (long-read) sequencing. Using command-line tools, we will assemble the reads with the tool Canu and correct the assembly with short-read Illumina data.
Data tidying with Python and Pandas
This workshop covers practical approaches for handling data in Python. We will use the Python library Pandas. This workshop is a recommended prerequisite for the Data Visualisation workshop. In order to do effective data analysis or visualisation, we usually need to have our data cleaned and in a consistent format. We will cover the concepts of “tidy”, long-form, and wide-form data, along with hands-on approaches for manipulating data and fixing common problems. This workshop concentrates on tabular data, like that found in spreadsheets or databases.
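To make the difference between wide-form and long-form data concrete, here is a minimal sketch using Pandas. The table contents and the column names (`gene`, `control`, `treated`, `condition`, `count`) are invented for illustration and are not taken from the workshop.

```python
import pandas as pd

# Wide form: one row per gene, one column per condition (made-up values).
wide = pd.DataFrame({
    "gene": ["geneA", "geneB"],
    "control": [10, 5],
    "treated": [20, 7],
})

# Long ("tidy") form: one row per gene/condition observation.
tidy = wide.melt(id_vars="gene", var_name="condition", value_name="count")
print(tidy)

# Going back to wide form when a tool expects one column per condition.
wide_again = tidy.pivot(index="gene", columns="condition", values="count").reset_index()
print(wide_again)
```

Long-form tables tend to be easier to filter, group, and plot, while wide-form tables are often easier to read by eye, so being able to move between the two is a useful first step in most cleaning tasks.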
Data visualisation with Python
Python has a wide range of libraries for plotting and visualising data. Many of these are excellent, but it can be hard for a newcomer to know where to start. We will introduce the range of options available, then do hands-on visualisation exercises with some popular libraries: Matplotlib, Seaborn, and Altair. Seaborn builds on Matplotlib to easily create beautiful statistical visualisations. Altair is intended for interactive visualisation and makes it easy to create complex responsive visualisations.
Common Workflow Language for Bioinformatics
Common Workflow Language (CWL) is a growing language for defining workflows in a cross-platform and cross-domain manner. In biology in particular, we need workflows to automate complex analyses such as DNA variant calling, RNA sequencing, and genome assembly. CWL provides a simple and well-defined format for automating these analyses by specifying their stages and connections in readable CWL documents.
Containerised Bioinformatics: Docker and other tools for reproducible analysis
This beginners' workshop will explain how containerisation can be used in bioinformatics analysis.
Containerisation is a method of bundling an application or pipeline with all its dependencies, from language runtimes like Python and R to the operating system itself. This technology has already revolutionised web development by providing a simple way to run web applications in a precisely controlled environment, regardless of which computer system they are running on.
This workshop will explain how these advantages can be easily applied to bioinformatics analysis, to ensure 100% reproducibility of your work, along with easy distribution of your pipelines to other users without the need for complex installation.
Introduction to RNA-seq analysis using R – coming soon
Best Practices in Bioinformatics Software Development
Software development is a central part of bioinformatics, but for many reasons software quality is not always prioritised, leading to problems in maintenance, usability and reproducibility. Adopting software engineering best practices at the beginning of a project can address these problems, but this is often not done due to lack of time or experience. This workshop covers the essentials of good programming practices and provides you with tools and knowledge to build high-quality bioinformatics software from the outset. We will introduce a tool for quickly creating new software projects with important features and infrastructure already included. You will use this tool to initialise a new project, including a fresh repository on GitHub.
This workshop is aimed at bioinformaticians with beginner to intermediate programming experience who want to apply good software engineering practices in their daily work. Experience with the Unix command line is assumed. Basic familiarity with Python (or a similar language) is an advantage.