NanoPlot

Introduction

NanoPlot is a plotting tool for long-read sequencing data and alignments (De Coster et al. 2018).

Available on Crunchomics: Not by default

Installation

Installed on Crunchomics: Yes

  • NanoPlot v1.42.0 is installed as part of the bioinformatics share. If you have access to Crunchomics but do not yet have access to the bioinformatics share, you can send an email with your UvA netID to Nina Dombrowski.
  • Afterwards, you can add the bioinformatics share as follows (if you have already done this in the past, you don’t need to run this command again):
conda config --add envs_dirs /zfs/omics/projects/bioinformatics/software/miniconda3/envs/
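
To check that the share was picked up, one option is to look for the shared environment in the output of conda env list. The helper below is a minimal sketch: the function name env_available is made up for illustration, and nanoplot_1.42.0 is the environment name used in the example code on this page.

```shell
# Hypothetical helper (not part of conda): succeeds if an environment with the
# given name is listed by `conda env list`.
env_available() {
    # The last column of each line is the environment path; compare its final
    # path component with the requested name.
    conda env list | awk -v env="$1" '
        { n = split($NF, parts, "/"); if (parts[n] == env) found = 1 }
        END { exit !found }'
}

# Usage: report whether the shared NanoPlot environment is visible.
if command -v conda >/dev/null 2>&1; then
    if env_available nanoplot_1.42.0; then
        echo "nanoplot_1.42.0 found"
    else
        echo "nanoplot_1.42.0 not found; check envs_dirs with: conda config --show envs_dirs"
    fi
fi
```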

If you prefer your own installation: NanoPlot is part of the NanoPack package, and I recommend installing the whole package so that other useful long-read tools are available as well. The commands below therefore create a new conda environment called nanopack. If you already have an environment with tools for long-read analyses, I suggest adding NanoPack there instead.

#setup new conda environment, which we name nanopack
mamba create --name nanopack -c conda-forge -c bioconda python=3.6 pip

#activate environment
conda activate nanopack 

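#install nanopack software tools with the pip of the new environment
#(adjust the path if your mambaforge installation lives elsewhere)
$HOME/personal/mambaforge/envs/nanopack/bin/pip3 install nanopack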

#close environment
conda deactivate

Usage

Possible input formats:

  • fastq files (can be bgzip, bzip2 or gzip compressed)
  • fastq files generated by albacore, guppy or MinKNOW containing additional information (can be bgzip, bzip2 or gzip compressed)
  • sorted bam files
  • sequencing_summary.txt output table generated by albacore, guppy or MinKNOW basecalling (can be gzip, bz2, zip and xz compressed)
  • fasta files (can be bgzip, bzip2 or gzip compressed)

Multiple files of the same type can be provided at once.
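
As an illustration of passing several files of the same type in one call, the sketch below hands every .fastq.gz file in a folder to a single NanoPlot run, which then reports on them together. The function name nanoplot_combined and the folder layout are assumptions for this example, not part of NanoPlot.

```shell
# Hypothetical helper: one NanoPlot run over all .fastq.gz files in a folder.
# $1 = input folder, $2 = output folder
nanoplot_combined() (
    indir="$1"; outdir="$2"
    # Expand the glob into the positional parameters of this subshell.
    set -- "$indir"/*.fastq.gz
    if [ ! -e "$1" ]; then
        echo "no .fastq.gz files found in $indir" >&2
        exit 1
    fi
    # All files follow the single --fastq flag.
    NanoPlot --fastq "$@" -o "$outdir" --threads 1
)
```

With the environment activated, it could be called as nanoplot_combined reads combined_report, where reads and combined_report are placeholder folder names.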

Output:

  • a statistical summary
  • a number of plots
  • an HTML summary file

Example code:

#start environment
conda activate nanoplot_1.42.0

#run on a single file (if you are using other inputs, check the readme for the appropriate flag)
NanoPlot --fastq myfile.fastq.gz -o outputfolder --threads 1

conda deactivate
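
If you have one file per sample and want a separate report for each, the single-file command can be wrapped in a loop. The following is a sketch under assumptions: the function name nanoplot_per_sample, the reads input folder and the nanoplot_results output folder are placeholders, and the optional third argument is only a convenience for previewing the commands.

```shell
# Hypothetical helper: run NanoPlot once per .fastq.gz file in a folder.
# $1 = input folder, $2 = output folder, $3 = optional command prefix
# (pass "echo" for a dry run that only prints the commands).
nanoplot_per_sample() (
    indir="$1"; outdir="$2"; prefix="${3:-}"
    for fq in "$indir"/*.fastq.gz; do
        [ -e "$fq" ] || continue            # the glob matched nothing
        sample="$(basename "$fq" .fastq.gz)"
        # One output folder per sample, named after the input file.
        $prefix NanoPlot --fastq "$fq" -o "$outdir/$sample" --threads 1
    done
)

# Dry run: print the commands that would be executed for files in ./reads
nanoplot_per_sample reads nanoplot_results echo
```

Dropping the final echo argument runs NanoPlot for real; activate the environment first.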

Useful arguments (for the full list, check the manual):

  • --tsv_stats Output the stats file as a properly formatted TSV.
  • --info_in_report Add NanoPlot run info in the report.
  • --barcoded Use if you want to split the summary file by barcode.
  • -f, --format [{png,jpg,jpeg,webp,svg,pdf,eps,json} …] Specify the output format(s) of the plots, which are generated in addition to the HTML files.
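
To show how these arguments combine with the basic call, here is a small sketch that wraps them in a shell function. The name nanoplot_report is hypothetical, and png plus pdf are just two of the formats listed above, chosen for illustration.

```shell
# Hypothetical wrapper: basic NanoPlot call plus the optional arguments above.
# $1 = input FASTQ file, $2 = output folder
nanoplot_report() {
    # --tsv_stats       write the stats file as TSV
    # --info_in_report  include NanoPlot run info in the report
    # --format png pdf  also save the plots as PNG and PDF
    NanoPlot --fastq "$1" -o "$2" --threads 1 \
        --tsv_stats --info_in_report --format png pdf
}
```

After activating the environment, a call such as nanoplot_report myfile.fastq.gz outputfolder would run NanoPlot with these settings.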

References

De Coster, Wouter, Svenn D’Hert, Darrin T Schultz, Marc Cruts, and Christine Van Broeckhoven. 2018. “NanoPack: Visualizing and Processing Long-Read Sequencing Data.” Bioinformatics 34 (15): 2666–69. https://doi.org/10.1093/bioinformatics/bty149.