RNA-Seq Differential Gene Expression Analysis using Galaxy and the GVL
In this tutorial we cover the concepts of RNA-seq differential gene expression (DGE) analysis using a simulated dataset from the common fruit fly, Drosophila melanogaster.
The tutorial is designed to introduce the tools, data types and workflows of an RNA-seq DGE analysis. In practice, real datasets are much larger and contain sequencing and alignment errors that make analysis more difficult.
In this tutorial we will:
- introduce the types of files typically used in RNA-seq analysis
- align RNA-seq reads with an aligner (HISAT2)
- visualise RNA-seq alignment data with IGV
- use a number of different methods to find differentially expressed genes
- understand the importance of replicates for differential expression analysis
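As a small preview of the file types mentioned above, the sketch below reads a single record in FASTQ format, the usual format for raw sequencing reads. It assumes the standard four-line layout (header, sequence, separator, qualities) and Phred+33 quality encoding. The names `FastqRecord` and `parse_fastq` and the example read are made up for illustration and are not part of the tutorial's Galaxy tools.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]  # Phred scores, one per base


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream (assumed layout)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError("malformed FASTQ record")
        # Assumption: Phred+33 encoding, so score = ASCII code - 33
        scores = [ord(char) - 33 for char in quality]
        yield FastqRecord(header[1:].strip(), sequence, scores)


# Hypothetical single-read example, not taken from the tutorial dataset
example = "@read1\nACGT\n+\nIIII\n"
records = list(parse_fastq(io.StringIO(example)))
```

In the tutorial itself Galaxy handles these files for you; the sketch is only meant to show what a read and its per-base quality scores look like.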
This tutorial does not cover the following steps that we might do in a real RNA-seq DGE analysis:
- QC (quality control) of the raw sequence data
- trimming the reads for quality and for adaptor sequences
- QC of the RNA-seq alignment data
These steps have been omitted because the tutorial data is synthetic and, unlike real data, has no quality issues.
At the end of this tutorial you will be able to:
- understand the basic workflow of alignment, quantification and testing for RNA-seq differential expression analysis
- process raw RNA sequence data into a list of differentially expressed genes
- understand the relationship between the number of biological replicates in an experiment and the statistical power available to detect differentially expressed genes
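The link between replicate number and statistical power in the last point can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the method used by the tools in this tutorial: it models one gene with normally distributed log2 expression, a true difference of 1 between conditions, and a standard deviation of 0.5, and it uses a pooled-variance two-sample t-test as a stand-in for a DGE test. The function name, effect size and noise level are hypothetical; the critical values are the standard two-sided 5% Student t values for 2n - 2 degrees of freedom.

```python
import random
import statistics

# Two-sided 5% critical t values for df = 2 * (n_reps - 1)
T_CRIT = {2: 4.303, 3: 2.776, 5: 2.306, 10: 2.101}


def estimate_power(n_reps, effect=1.0, sd=0.5, n_sim=2000, seed=0):
    """Fraction of simulated experiments in which a true difference is detected."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_sim):
        # Simulated log2 expression for one gene in each condition
        control = [rng.gauss(0.0, sd) for _ in range(n_reps)]
        treated = [rng.gauss(effect, sd) for _ in range(n_reps)]
        # Pooled variance for equal group sizes
        pooled_var = (statistics.variance(control) + statistics.variance(treated)) / 2
        std_err = (2 * pooled_var / n_reps) ** 0.5
        t_stat = (statistics.mean(treated) - statistics.mean(control)) / std_err
        if abs(t_stat) > T_CRIT[n_reps]:
            detected += 1
    return detected / n_sim


power_by_reps = {n: estimate_power(n) for n in T_CRIT}
for n, power in power_by_reps.items():
    print(f"{n} replicates per condition: estimated power {power:.2f}")
```

Under these assumptions the estimated power rises as replicates are added, because the estimates of both the mean and the variance become more precise.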
Participants with no previous Galaxy experience are strongly encouraged to attend the “Introduction to Galaxy” workshop first.
Attendees are required to bring their own laptop computers.