conda config --add envs_dirs /zfs/omics/projects/bioinformatics/software/miniconda3/envs/
Chopper
Introduction
Chopper is a tool for quality filtering of long-read data. It is a Rust implementation of two earlier tools for long-read quality filtering, NanoFilt and NanoLyse, both originally written in Python. Intended for long-read sequencing data such as PacBio or ONT, it filters and trims the reads in a FASTQ file (De Coster and Rademakers 2023).
Installation
Installed on crunchomics: Yes
- Chopper v0.8 is installed as part of the bioinformatics share. If you have access to crunchomics but do not yet have access to the bioinformatics share, you can send an email with your UvA netID to Nina Dombrowski.
- Afterwards, you can add the bioinformatics share as follows (if you have already done this in the past, you don’t need to run this command again):
conda config --add envs_dirs /zfs/omics/projects/bioinformatics/software/miniconda3/envs/
If you want to install it yourself, you can run:
mamba create --name chopper -c bioconda chopper
Usage
Required input:
- FASTQ files
Output:
- FASTQ files
Example usage:
conda activate chopper0.8.0
gunzip -c results/porechop/my_reads.fastq.gz |\
chopper -q 10 \
--headcrop 0 --tailcrop 0 \
-l 1000 \
--threads 20 |\
gzip > results/chopper/my_reads_filtered1000.fastq.gz
conda deactivate
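A quick sanity check after filtering is to compare the number of reads before and after. The sketch below is not part of Chopper: `count_reads` is a hypothetical helper name, and it assumes gzip-compressed FASTQ files with exactly four lines per record (no line-wrapped sequences). The file paths are the ones used in the example above.

```shell
# Hypothetical helper: count the records in a gzipped FASTQ file.
# Assumes four lines per record (no line-wrapped sequences).
count_reads() {
    echo $(( $(gzip -cd "$1" | wc -l) / 4 ))
}

# Print the read count for the input and the filtered output, if present.
for f in results/porechop/my_reads.fastq.gz \
         results/chopper/my_reads_filtered1000.fastq.gz; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_reads "$f")"
    fi
done
```

If the filtered count is zero or far lower than expected, the quality or length thresholds are probably too strict for the data set.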
Useful arguments:
- --headcrop: Trim N nucleotides from the start of a read [default: 0]
- --maxlength: Sets a maximum read length [default: 2147483647]
- -l, --minlength: Sets a minimum read length [default: 1]
- -q, --quality: Sets a minimum Phred average quality score [default: 0]
- --tailcrop: Trim N nucleotides from the end of a read [default: 0]
- --threads: Number of parallel threads to use [default: 4]
- --contam: FASTA file with reference to check potential contaminants against [default: None]
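When choosing values for the minimum and maximum read length, it can help to look at the read lengths in the input first. The following is a minimal sketch and not a Chopper feature: `read_length_summary` is a hypothetical helper name, and it assumes a gzip-compressed FASTQ file with four lines per record.

```shell
# Hypothetical helper: print the read count and the minimum, mean and
# maximum read length of a gzipped FASTQ file (four lines per record).
read_length_summary() {
    gzip -cd "$1" | awk '
        NR % 4 == 2 {                      # sequence line of each record
            len = length($0)
            total += len
            if (n == 0 || len < min) min = len
            if (len > max) max = len
            n++
        }
        END {
            if (n > 0)
                printf "reads=%d min=%d mean=%.1f max=%d\n", n, min, total / n, max
        }'
}

# Summarise the input used in the example above, if it exists.
if [ -f results/porechop/my_reads.fastq.gz ]; then
    read_length_summary results/porechop/my_reads.fastq.gz
fi
```

Running the same helper on the filtered output shows whether the chosen length thresholds had the intended effect.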
References
De Coster, Wouter, and Rosa Rademakers. 2023. “NanoPack2: Population-Scale Evaluation of Long-Read Sequencing Data.” Bioinformatics 39 (5): btad311. https://doi.org/10.1093/bioinformatics/btad311.