Introduction to RNA-seq


UvA Amsterdam


01-02 July 2021

9:30 am - 5:00 pm

Instructors: Marc Galland (1), Tijs Bliek (1)

Helpers: Fred White

(1) Member of the SILS institute

General Information


This workshop teaches key concepts in RNA-Seq analysis: how to align sequencing reads to a genome, how to perform a differential expression analysis, and how to go beyond a list of genes (functional enrichment, integration with metabolic pathways). We will first introduce RNA-Seq experimental design and the main post-sequencing steps to give an overview of what you can achieve with this technique. A hands-on session follows, in which we will execute the different steps one by one in the cloud and look at the output files.
This workshop is part of the 2021 Summer School organised by the Amsterdam Science Park Study Group. This event is made possible thanks to the support of the NWO Team Science Award (2020 edition).

Who are we

This workshop is organized by the core members of the Amsterdam Science Park Study Group. This small community of computational biologists aims to promote skill sharing and collaboration through the organisation of interactive workshops. It acts as the main local hub for setting up Software and Data Carpentry workshops (both official and Carpentry-style). All are welcome to this study group, regardless of scientific research area, affiliation or training level.

For more information on what we teach and why, see our website: "scienceparkstudygroup".




Who: The course is aimed at master's students and other researchers (PhD students and postdocs). If you have no previous knowledge of R, you are strongly advised to follow the R workshop that is also part of this summer school.

Where: Online, Amsterdam UTC+1 (see Zoom links).

Zoom links: the workshop will be fully online:

When: 01-02 July 2021.

Requirements: Participants must have a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below).

Code of Conduct: Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.

Helpers: Experts helping in the room.

Discord server: We will use Discord to manage questions and general announcements, and to match helpers with learners. Please install the Discord application on your laptop/computer. The invite link to the server can be found under this link. There are several channels that we will use:

Contact: Please email m.bliek@uva.nl for more information.





Schedule

Day 1

Morning (9:30-11:00): Introduction (Marc)
Coffee
Morning (11:15-12:45): FASTQ file format and quality control (Marc)
Lunch
Afternoon (13:45-15:15): Mapping to the reference genome (Tijs)
Tea
Afternoon (15:30-17:00): The BAM/SAM file format and counting (Tijs)

Day 2

Morning (9:30-11:00): Exploration of RNA-seq count results (Marc)
Coffee
Morning (11:15-12:45): Differential expression analysis (Marc)
Lunch
Afternoon (13:45-15:15): Cluster analysis (Tijs)
Tea
Afternoon (15:30-17:00): Functional enrichment analysis (Tijs)




Syllabus




Setup

To participate in this workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page, which may be a useful reference for instructors.

Virtual Machines

Shell Virtual Machines (day 1)

Virtual machines will be used throughout the course. A virtual machine is a sort of mini-computer running in the cloud (someone else's computer). This ensures that we all have the same compute power and software.
The day 1 machines will be used to perform the bioinformatic parts of RNA-seq (read trimming, alignment to a genome, etc.).
Your credentials to connect to the machine will be sent to you separately by email (a .csv file).
Steps to complete
  1. Open your Shell,
  2. Type ssh root@[your IP address] (see your machine IP address in the credential files),
  3. You will be asked to confirm the authenticity of the host. Answer "yes",
  4. You will be then asked to type in your password. This is the one given to you in the credential .csv file,
  5. You are then asked to choose a password. Choose something simple but long enough (e.g. your first name followed by your birth date). Security is not much of an issue here since all VMs are destroyed after the course,
  6. Re-type the same password.
  7. Type: docker run -it --name bioinfo -v $PWD:/home/ scienceparkstudygroup/master-gls:fastq-latest to create a container called bioinfo and enter it through an interactive bash. You can refer to the setup of the lesson for additional commands.
Virtual machine setup
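
The connection steps above can be sketched as a few shell commands. This is only an illustration: the IP address is a placeholder (a reserved documentation address), and the last command, docker start -ai, is our assumption about how to get back into the container after leaving it. It is not part of the workshop instructions. The script only prints the commands, so that you can copy them one at a time.

```shell
#!/usr/bin/env bash
# VM_IP is a placeholder; use the IP address from your credentials .csv file.
VM_IP="203.0.113.10"

# Step 2: open a secure shell (ssh) session as root on your virtual machine.
connect_cmd="ssh root@${VM_IP}"

# Step 7: create the container (run this once, on the virtual machine).
# Single quotes keep $PWD unexpanded here; it is resolved when you run the command.
create_cmd='docker run -it --name bioinfo -v $PWD:/home/ scienceparkstudygroup/master-gls:fastq-latest'

# Assumption, not in the workshop steps: re-attach to the existing
# container named bioinfo instead of creating a new one.
reenter_cmd='docker start -ai bioinfo'

# Print the three commands, one per line.
printf '%s\n' "$connect_cmd" "$create_cmd" "$reenter_cmd"
```

Running docker run a second time with the same --name fails because the name is already taken, which is why a separate command is suggested for re-entering the container.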

R/RStudio Virtual machines (day 2)

On day 2, we will use RStudio virtual machines accessible through a simple web browser. Simply browse to the web link provided to you in the credential file. Enter rstudio as the username and your provided password to access your RStudio machine.

The Bash Shell

Bash is a commonly-used shell that gives you the power to do simple tasks more quickly.

Video Tutorial
  1. Download the Git for Windows installer.
  2. Run the installer and follow the steps below:
    1. Click on "Next" four times (two times if you've previously installed Git). You don't need to change anything in the Information, location, components, and start menu screens.
    2. Select "Use the nano editor by default" and click on "Next".
    3. Keep "Git from the command line and also from 3rd-party software" selected and click on "Next". If you forget to do this, programs that you need for the workshop will not work properly. If this happens, rerun the installer and select the appropriate option.
    4. Click on "Next".
    5. Select "Use the native Windows Secure Channel library", and click "Next".
    6. Keep "Checkout Windows-style, commit Unix-style line endings" selected and click on "Next".
    7. Select "Use Windows' default console window" and click on "Next".
    8. Leave all three items selected, and click on "Next".
    9. Do not select the experimental option. Click "Install".
    10. Click on "Finish".
  3. If your "HOME" environment variable is not set (or you don't know what this is):
    1. Open command prompt (Open Start Menu then type cmd and press [Enter])
    2. Type the following line into the command prompt window exactly as shown:

      setx HOME "%USERPROFILE%"

    3. Press [Enter]. You should see: SUCCESS: Specified value was saved.
    4. Quit command prompt by typing exit then pressing [Enter]

This will provide you with both Git and Bash in the Git Bash program.
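
After installation, one way to confirm that the HOME variable is set is to open Git Bash and run a short check such as the one below. The function name check_home is our own choice for this sketch.

```shell
# Report whether the HOME environment variable is set in the current shell.
check_home() {
  if [ -n "${HOME:-}" ]; then
    echo "HOME is set to: $HOME"
  else
    echo "HOME is not set"
  fi
}

check_home
```

If the check reports that HOME is not set, repeat the setx step above and open a new Git Bash window.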

On macOS, the default shell is Zsh since macOS Catalina (10.15) and Bash in earlier versions. Bash is still installed on newer versions, so there is no need to install anything: open the Terminal (found in /Applications/Utilities) and, if needed, type bash to start it. See the Git installation video tutorial for an example on how to open the Terminal. You may want to keep Terminal in your dock for this workshop.

On Linux, the default shell is usually Bash, but if your machine is set up differently you can run it by opening a terminal and typing bash. There is no need to install anything.
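
If you are unsure which shell you have, a quick check from any terminal could look like the sketch below. The SHELL variable may be empty in some environments, so the sketch falls back to "unknown".

```shell
# Print the login shell recorded for your account.
echo "Login shell: ${SHELL:-unknown}"

# Ask Bash itself for its version; the first line of the output is enough.
bash_version_line="$(bash --version | head -n 1)"
echo "$bash_version_line"
```

If the second command prints a version line, Bash is installed and you can start it by typing bash.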


Text Editor

When you're writing code, it's nice to have a text editor that is optimized for writing code, with features like automatic color-coding of key words. The default text editor on macOS and Linux is usually set to Vim, which is not famous for being intuitive. If you accidentally find yourself stuck in it, press the Esc key, type :q! (colon, lower-case 'q', exclamation mark), then press Return to return to the shell.

On Windows, nano is a basic editor and the default that instructors use in the workshop. It is installed along with Git.

Other editors that you can use are Notepad++ or Sublime Text. Be aware that you must add the editor's installation directory to your system path. Please ask your instructor to help you do this.

On macOS, nano is a basic editor and the default that instructors use in the workshop. See the Git installation video tutorial for an example on how to open nano. It should be pre-installed.

Other editors that you can use are BBEdit or Sublime Text.

On Linux, nano is a basic editor and the default that instructors use in the workshop. It should be pre-installed.

Other editors that you can use are Gedit, Kate or Sublime Text.
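
To see whether nano is available in your shell, you can run a small check like the one below. The helper name have_command is our own, and vim is included only for comparison.

```shell
# Return success if the named command is available on the PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the editors mentioned above can be started from this shell.
for editor in nano vim; do
  if have_command "$editor"; then
    echo "$editor: found"
  else
    echo "$editor: not found"
  fi
done
```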


R

R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.

Video Tutorial

On Windows, install R by downloading and running the .exe installer from CRAN. Also, please install the RStudio IDE. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on the .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.

On Linux, you can download the binary files for your distribution from CRAN, or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo dnf install R). Also, please install the RStudio IDE.
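
Once R is installed, a simple way to confirm that it can be started from a terminal is sketched below. The function name report_r_version is our own. On Windows, R is often not on the PATH seen by Git Bash, so a "not found" message there does not necessarily mean the installation failed; opening RStudio is the simpler check in that case.

```shell
# Print the first line of the R version banner, or a hint if R is not on the PATH.
report_r_version() {
  if command -v R >/dev/null 2>&1; then
    R --version | head -n 1
  else
    echo "R not found on the PATH: install it first, then open a new terminal"
  fi
}

report_r_version
```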