mOTUS

Introduction

The mOTU profiler is a computational tool that estimates the relative taxonomic abundance of known and currently unknown microbial community members from metagenomic shotgun sequencing data (Ruscheweyh et al. 2022). See the mOTUs wiki for more information.

Installation

Available on Crunchomics: No

You can install mOTUs with conda/mamba as follows:

#install the conda environment
conda create -n motus_3.1.0
conda install -n motus_3.1.0 -c bioconda motus=3.1.0

#download the required databases
#the database gets downloaded to: conda_env_folder/motus_3.1.0/lib/python3.9/site-packages/motus
conda activate motus_3.1.0
motus downloadDB

Usage

Required inputs: Illumina reads (forward, reverse, unpaired) or Nanopore reads in fastq(.gz) format

Generated output(s):

a file with the relative abundance (or read counts) of the prokaryotic species found in the sample. More about the output format can be found in the mOTUs wiki
the results of the read mapping in BAM format (optional)

Example for short-read Illumina data

#prepare folders
mkdir data
mkdir -p results/motus_sr

#run motus
motus profile -f data/SRR17913199_1.fastq -r data/SRR17913199_2.fastq \
    -n SRR17913199 \
    -t 30 \
    -o results/motus_sr/taxonomy_profile.txt

#parse the results: keep only entries with a value >0, sorted in decreasing order
awk -F'\t' -v OFS="\t" '!/^#/ && $2 > 0' results/motus_sr/taxonomy_profile.txt | sort -t$'\t' -k2,2nr
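
To see what the filter above does without running mOTUs first, here is a minimal sketch that applies the same awk and sort steps to a tiny mock profile. The taxon names, the sample name and the values are made up for illustration, and a real mOTUs profile has more header lines. The sketch passes the tab to sort via printf instead of $'\t' so that it also works in shells other than bash.

```shell
# Mock profile for illustration only: names and values are hypothetical
profile="$(mktemp)"
{
    printf '# mock comment line\n'
    printf '#consensus_taxonomy\tmock_sample\n'
    printf 'taxon_A\t0.25\n'
    printf 'taxon_B\t0\n'
    printf 'taxon_C\t0.70\n'
    printf 'unassigned\t0.05\n'
} > "$profile"

# Drop header lines (starting with #) and rows whose second column is 0,
# then sort numerically on column 2 in decreasing order
tab="$(printf '\t')"
filtered="$(awk -F'\t' -v OFS='\t' '!/^#/ && $2 > 0' "$profile" | LC_ALL=C sort -t"$tab" -k2,2nr)"

printf '%s\n' "$filtered"
rm -f "$profile"
```

With this mock input, three rows remain (taxon_C, taxon_A, unassigned, in that order); taxon_B is removed because its value is 0.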

Example for long-read data

#prepare folders 
mkdir -p results/motus_lr

#prepare long reads for profiling with mOTUs; this splits the long reads into short reads of length 300
srun --cpus-per-task 1 --mem=10GB motus prep_long -i data/SRR17913199.fastq \
    -o data/SRR17913199_prepped.fastq -n SRR17913199

#run motus
srun --cpus-per-task 30 --mem=10GB motus profile -s data/SRR17913199_prepped.fastq \
    -n SRR17913199 \
    -t 30 \
    -o results/motus_lr/taxonomy_profile.txt

#parse the results: keep only entries with a value >0, sorted in decreasing order
awk -F'\t' -v OFS="\t" '!/^#/ && $2 > 0' results/motus_lr/taxonomy_profile.txt | sort -t$'\t' -k2,2nr

Useful arguments for motus profile:

Input options:

-f FILE[,FILE] input file(s) for reads in forward orientation, fastq(.gz)-formatted
-r FILE[,FILE] input file(s) for reads in reverse orientation, fastq(.gz)-formatted
-s FILE[,FILE] input file(s) for unpaired reads, fastq(.gz)-formatted
-n STR sample name [‘unnamed sample’]
-i FILE[,FILE] provide SAM or BAM input file(s) (generated by motus map_tax)
-m FILE provide a mgc reads count file (generated by motus calc_mgc)
-db DIR provide a different database directory

Output options:

-o FILE output file name [stdout]
-I FILE save the result of BWA in BAM format (output of motus map_tax)
-M FILE save the mgc reads count (output of motus calc_mgc)
-e only species with reference genomes (ref-mOTUs)
-u print the full name of the species
-c print result as counts instead of relative abundances
-p print NCBI taxonomy identifiers
-B print result in BIOM format
-C STR print result in CAMI format (BioBoxes format 0.9.1), Values: [precision, recall, parenthesis]
-q print the full rank taxonomy
-A print all taxonomic levels together (kingdom to mOTUs, override -k)
-k STR taxonomic level [mOTU], Values: [kingdom, phylum, class, order, family, genus, mOTU]

Algorithm options:

-g INT number of marker genes cutoff: 1=higher recall, 6=higher precision [3]
-l INT min length of the alignment (bp) [75]
-t INT number of threads [1]
-v INT verbosity level: 1=error, 2=warning, 3=message, 4+=debugging [3]
-y STR type of read counts [insert.scaled_counts], Values: [base.coverage, insert.raw_counts, insert.scaled_counts]
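
As an illustration of how the options listed above can be combined, the following sketch assembles a motus profile call that reports read counts (-c) summarised at genus level (-k genus) with the strictest marker gene cutoff (-g 6). It only builds and prints the command rather than running it. The input paths are reused from the short-read example; the output file name is a hypothetical choice.

```shell
sample="SRR17913199"
out="results/motus_sr/${sample}_genus_counts.txt"   # hypothetical output name

# -c        counts instead of relative abundances
# -k genus  report at genus level instead of mOTU level
# -g 6      strictest marker gene cutoff (higher precision)
cmd="motus profile -f data/${sample}_1.fastq -r data/${sample}_2.fastq -n ${sample} -c -k genus -g 6 -t 30 -o ${out}"

# Print the command for inspection; paste it into the terminal to run it
printf '%s\n' "$cmd"
```

Printing the command first makes it easy to check paths and options before starting a run that occupies 30 threads.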

Merging profiles

If you generated profiles for several samples, you can merge them by providing a comma-separated list of profile files like this:

motus merge -i sampleX.motus,sampleY.motus,sampleZ.motus \
    -o results/motus/merged.txt

Note: For this to work, it's best to use the -n option when running motus profile, so that each sample has a clear identifier in the abundance table.

You can also merge different profiles if they are all in the same directory like this:

motus merge -d results/motus \
    -o results/motus/merged.txt
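
Profiling several samples before merging can be scripted. The sketch below assumes paired files named data/<sample>_1.fastq and data/<sample>_2.fastq (the naming used in the short-read example) and only prints the commands it would run. The sample IDs in the list and the results/motus/<sample>.motus output naming are hypothetical.

```shell
samples="sampleX sampleY sampleZ"   # hypothetical sample IDs
outdir="results/motus"

merge_list=""
for s in $samples; do
    # One profile per sample; -n gives each sample a clear identifier
    printf 'motus profile -f data/%s_1.fastq -r data/%s_2.fastq -n %s -t 30 -o %s/%s.motus\n' \
        "$s" "$s" "$s" "$outdir" "$s"
    # Build the comma-separated file list expected by motus merge -i
    merge_list="${merge_list:+${merge_list},}${outdir}/${s}.motus"
done

printf 'motus merge -i %s -o %s/merged.txt\n' "$merge_list" "$outdir"
```

Once the printed commands look right, they can be run as they are or wrapped in srun as in the long-read example.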

References

Ruscheweyh, Hans-Joachim, Alessio Milanese, Lucas Paoli, Nicolai Karcher, Quentin Clayssen, Marisa Isabell Keller, Jakob Wirbel, et al. 2022. “Cultivation-Independent Genomes Greatly Expand Taxonomic-Profiling Capabilities of mOTUs Across Various Environments.” Microbiome 10 (1). https://doi.org/10.1186/s40168-022-01410-z.