This lesson is still being designed and assembled (Pre-Alpha version)

# Introduction to RNA-seq: Welcome!

## Welcome!

In this training you will learn the basics of a typical RNA-Sequencing experiment. It is going to be fun and empowering! You will discover how total RNA is converted to short sequences called “reads” that can in turn be used to gain insights into gene expression. Through careful experimental design, this gene expression information can open new research avenues and answer crucial questions.

We will use mostly R and its companion RStudio to perform our RNA-Seq analyses and visualisations.

Depending on the participants' level, we may also cover the bioinformatic steps (QC of fastq files, genome alignment, read counting, etc.).

Before you begin, be sure you are all set up (see below). For complete information, see the Setup section.

This lesson will introduce you to the basics of gene expression analysis using RNA-Seq (short for RNA sequencing). Due to the considerable progress and constantly decreasing costs of RNA-Seq, this technique has become a standard.

## Main learning objectives

• Identify good practices when designing an RNA-Seq experiment.
• Memorize the steps of a complete RNA-Seq experiment: from sequencing to analysis.
• Perform a QC of your experiment through Principal Component Analysis (PCA) and sample clustering.
• Execute a differential gene expression analysis using R and the DESeq2 package.
• Be able to create key plots: volcano plot, heatmap and clustering of differentially expressed genes.
• Provide a biological interpretation of differentially expressed genes through over-representation analysis (ORA), gene set enrichment analysis (GSEA) and data integration.

## Before you start

Before the training, please make sure you have done the following:

1. Consult what you need to do in the lesson Setup.
2. Read the workshop Code of Conduct to make sure this workshop stays welcoming for everybody.
3. Get comfortable: if you’re not in a physical workshop, set yourself up with two screens if possible. You will be working in RStudio on your own computer while following this tutorial at the same time. More instructions are available on the workshop website in the Setup section.

## Citation

If you make use of this material in some way (teaching, vocational training, research), please cite us: “Bliek Tijs, Frans van der Kloet and Marc Galland” (eds): “RNA-seq lesson.” Version 2020.04. https://github.com/ScienceParkStudyGroup/rnaseq-lesson

## Credits

This lesson is heavily based on teaching materials from the Harvard Chan Bioinformatics Core (HBC) in-depth NGS data analysis course. Materials have been adapted and some exercises created to comply with the Carpentries Foundation teaching requirements.

## Schedule

| Time | Episode | Questions |
|-------|---------|-----------|
| | Setup | Download files required for the lesson |
| 00:00 | 1. Introduction | What can I learn by doing this RNA-Seq lesson? What are the tools that I will be using? What are the tidy data principles? Why is working in a more open way beneficial? |
| 00:30 | 2. Statistics & Experimental design | What are the key statistical concepts I need to know for experimental design? What are type I and type II errors? What are the sources of variability in an experiment? What are the 3 core principles of (good) experimental design? Why is having biological replicates important in an (RNA-seq) experiment? |
| 02:30 | 3. Library preparation and QC | What are the current techniques for RNA-seq using NGS? How is sequencing quality assessed? What file format is produced by sequencing machines? What tool can I use to assess the quality of my RNA-seq sequencing files? |
| 03:15 | 4. From fastq files to read counts | How do I perform a quality check of my RNA-seq fastq files with FastQC? How can I remove low-quality RNA-seq reads using Trimmomatic? How do I align my reads to a reference genome using STAR? What is the SAM/BAM format? How do I turn RNA-seq read genome alignments into a count table? |
| 04:00 | 5. Exploration of RNA-seq count results | How are gene expression levels distributed within an RNA-seq experiment? Why do I need to scale/normalize read counts? How do I know that my RNA-seq experiment has worked according to my experimental design? How informative are PCA and sample clustering for RNA-seq quality checks? |
| 05:30 | 6. Differential expression analysis | What are factor levels and why are they important for differential expression analysis? How can I call the genes differentially regulated in response to my experimental design? What is a volcano plot and how can I create one? What is a heatmap and how can it be informative for my comparison of interest? |
| 07:00 | 7. Going beyond a list of genes | What are factor levels and why are they important for differential expression analysis? |
| 08:30 | 8. Genome browser exploration | |
| 09:30 | Finish | |

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.