Structural variant calling using long read data
Learn how to identify structural variants in genomes using Galaxy
Presenter: Grace Hall
Structural variations are involved in many aspects of medicine and biotechnology. Their importance in various diseases, including cancer, and their relevance to microbial evolution have made the field a prominent area of research.
The advent of long read technology has recently enabled more accurate analysis of structural variants (SVs), invigorating the field and permitting numerous discoveries in the last half-decade.
In this training workshop you will be introduced to structural variant calling using long read data. We will discuss what structural variants are, their importance in medicine and biotechnology, and how they are identified and visualised using modern online bioinformatics tools. We will also touch on the interpretation of our SV calls by calculating performance metrics for our SV caller, sniffles.
This workshop will cover all major steps of a bioinformatics analysis. We will discuss the initial data, the SV calling pipeline, visualisation of results, and interpretation and communication of our findings.
We will also cover the theory behind SV types, their role in medicine and biotechnology, long read technologies, and read alignment.
There is currently no ‘best practice’ for SV calling. We will use a workflow consisting of read alignment with minimap2, SV calling with sniffles, text reformatting of VCF files with awk, and visualisation with circos to perform our analysis.
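To give a feel for the text reformatting step, here is a minimal sketch in Python rather than awk. It is not the code provided in the workshop. It assumes the variant calls carry the `SVTYPE` and `END` keys in the VCF INFO column, and the function names, example records, and output columns (chromosome, start, end, SV type) are hypothetical choices made for illustration only.

```python
def vcf_record_to_row(line):
    """Turn one VCF data line into (chrom, start, end, svtype)."""
    fields = line.rstrip("\n").split("\t")
    # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
    chrom, pos, info = fields[0], int(fields[1]), fields[7]
    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_map = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        info_map[key] = value
    # Fall back to the start position when no END key is present.
    end = int(info_map.get("END", pos))
    svtype = info_map.get("SVTYPE", "NA")
    return chrom, pos, end, svtype


def reformat_vcf(lines):
    """Skip header and blank lines; return one tab-separated row per variant."""
    rows = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        rows.append("\t".join(str(x) for x in vcf_record_to_row(line)))
    return rows


# Made-up records for illustration; not real sniffles output.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=1500;SVLEN=-500",
    "chr2\t2000\tsv2\tN\t<INS>\t.\tPASS\tPRECISE;SVTYPE=INS;END=2000;SVLEN=300",
]
for row in reformat_vcf(example):
    print(row)
```

The awk one-liners used in the workshop do the same kind of job: pick out a few columns per variant and write them in the layout the next tool expects.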
We will benchmark our SV caller, sniffles, to estimate its accuracy, precision and recall. This performance information will be vital as we switch to a human read set, as it will permit better interpretation of the results we obtain.
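As a rough illustration of two of these metrics, the sketch below computes precision and recall from counts of true positive, false positive and false negative calls. The counts are invented for the example and are not workshop results, and the function name is a hypothetical one chosen here.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a set of SV calls.

    precision = TP / (TP + FP): the fraction of reported SVs that are real.
    recall    = TP / (TP + FN): the fraction of real SVs that were reported.
    """
    called = true_positives + false_positives
    expected = true_positives + false_negatives
    # Guard against division by zero when nothing was called or expected.
    precision = true_positives / called if called else 0.0
    recall = true_positives / expected if expected else 0.0
    return precision, recall


# Invented counts: 90 correct calls, 10 spurious calls, 30 missed SVs.
precision, recall = precision_recall(90, 10, 30)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=0.90 recall=0.75"
```

A caller with high precision but low recall reports few false SVs while missing many real ones, which affects how much weight to give to an absent call in the human read set.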
Tools used: Galaxy: minimap2, sniffles, awk, circos
Prerequisites & requirements
This workshop does not require any programming experience, but familiarity with Galaxy and Galaxy workflows, or completion of the online “Introduction to Galaxy” workshop, is required. You will write one-line programs using awk to reformat your variant calls, but this code will be provided and explained.
This is a hands-on workshop and attendees must provide their own laptops with the following software pre-installed:
- Web browser (Firefox or Chrome recommended)
- Zoom (Version 5 or greater)
Most workshops are free, but only for researchers and students from the University of Melbourne and its affiliated institutes (refer to the ‘Members’ section).