#setup new conda environment, which we name kraken2
mamba create --name kraken2 -c bioconda kraken2
Kraken2
Introduction
Kraken2 (Wood, Lu, and Langmead 2019) is a taxonomic sequence classifier that assigns taxonomic labels to DNA sequences. Kraken examines the k-mers within a query sequence and uses the information within those k-mers to query a database. That database maps k-mers to the lowest common ancestor (LCA) of all genomes known to contain a given k-mer.
Available on Crunchomics: Kraken version 2.0.8-beta installed
Installation
If you want to install kraken2 on your own, its best to install it via mamba:
Usage
For detailed usage information, check out the kraken2 manual.
Build a kraken database
The command below will download NCBI taxonomic information, as well as the complete genomes in RefSeq for the bacterial, archaeal, and viral domains, along with the human genome and a collection of known vectors (UniVec_Core). After downloading all this data, the build process begins; this can be the most time-consuming step. If you have multiple processing cores, you can run this process with multiple threads.
Addtionally, kraken2 comes with several custom databases, such as the SILVA database for 16S rRNA gene analyses. Check the kraken2 manual for detailed information on how to download custom things..
#create a kraken2 database
kraken2-build --standard --threads 24 --db $DBNAME
Classification
kraken2 --db $DBNAME seqs.fa --output output.out --report output.report