mOTUs
Introduction
The mOTU profiler is a computational tool that estimates the relative taxonomic abundance of known and currently unknown microbial community members from metagenomic shotgun sequencing data (Ruscheweyh et al. 2022). Check the wiki for more information.
Installation
Available on Crunchomics: No
You can install mOTUs with conda/mamba as follows:
#install the conda environment
conda create -n motus_3.1.0
conda install -n motus_3.1.0 -c bioconda motus=3.1.0
#download the required databases
#the database gets downloaded to: conda_env_folder/motus_3.1.0/lib/python3.9/site-packages/motus
conda activate motus_3.1.0
motus downloadDB
Usage
Required inputs: Illumina reads (forward, reverse, unpaired) or Nanopore reads in fastq(.gz) format
Generated output(s):
- a file with the relative abundance (or read counts) of the prokaryotic species found in the sample. More about the output can be found here
- the results of the read mapping in BAM format (optional)
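To get a feel for the abundance table before running the profiler, the filtering step used in the examples can be tried on a small mock file. This is a minimal sketch: the file content, the taxon names and the values are made up for illustration, and it only assumes that the profile is tab-separated with header lines starting with `#`.

```shell
# Build a tiny mock profile (made-up names and values, for illustration only)
tmpdir=$(mktemp -d)
printf '#mock header line\n#name\tsample1\nspecies_A\t0.25\nspecies_B\t0\nspecies_C\t0.75\n' > "$tmpdir/profile.txt"

# Keep non-header rows with a value > 0 and sort by abundance, highest first
tab=$(printf '\t')
filtered=$(awk -F'\t' -v OFS='\t' '!/^#/ && $2 > 0' "$tmpdir/profile.txt" | sort -t"$tab" -k2,2nr)
printf '%s\n' "$filtered"

# Count how many taxa have a non-zero abundance
n_detected=$(printf '%s\n' "$filtered" | wc -l | tr -d ' ')
echo "detected taxa: $n_detected"
```

The tab character is stored in a variable instead of using `$'\t'` so that the snippet also works in shells other than bash.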
Example for short-read Illumina data
#prepare folders
mkdir -p data
mkdir -p results/motus_sr
#run motus
motus profile -f data/SRR17913199_1.fastq -r data/SRR17913199_2.fastq \
-n SRR17913199 \
-t 30 \
-o results/motus_sr/taxonomy_profile.txt
#parse the results and extract only hits >0 in decreasing order
awk -F'\t' -v OFS="\t" '!/^#/ && $2 > 0' results/motus_sr/taxonomy_profile.txt | sort -t$'\t' -k2,2nr
Example for long-read data
#prepare folders
mkdir -p results/motus_lr
#prepare long reads for profiling with mOTUs; this splits them into short reads of length 300
srun --cpus-per-task 1 --mem=10GB motus prep_long -i data/SRR17913199.fastq \
-o data/SRR17913199_prepped.fastq -n SRR17913199
#run motus
srun --cpus-per-task 30 --mem=10GB motus profile -s data/SRR17913199_prepped.fastq \
-n SRR17913199 \
-t 30 \
-o results/motus_lr/taxonomy_profile.txt
#parse the results and extract only hits >0 in decreasing order
awk -F'\t' -v OFS="\t" '!/^#/ && $2 > 0' results/motus_lr/taxonomy_profile.txt | sort -t$'\t' -k2,2nr
Useful arguments for motus profile:
Input options:
-f FILE[,FILE]   input file(s) for reads in forward orientation, fastq(.gz)-formatted
-r FILE[,FILE]   input file(s) for reads in reverse orientation, fastq(.gz)-formatted
-s FILE[,FILE]   input file(s) for unpaired reads, fastq(.gz)-formatted
-n STR           sample name ['unnamed sample']
-i FILE[,FILE]   provide SAM or BAM input file(s) (generated by motus map_tax)
-m FILE          provide a mgc reads count file (generated by motus calc_mgc)
-db DIR          provide a different database directory
Output options:
-o FILE          output file name [stdout]
-I FILE          save the result of BWA in BAM format (output of motus map_tax)
-M FILE          save the mgc reads count (output of motus calc_mgc)
-e               only species with reference genomes (ref-mOTUs)
-u               print the full name of the species
-c               print result as counts instead of relative abundances
-p               print NCBI taxonomy identifiers
-B               print result in BIOM format
-C STR           print result in CAMI format (BioBoxes format 0.9.1), Values: [precision, recall, parenthesis]
-q               print the full rank taxonomy
-A               print all taxonomic levels together (kingdom to mOTUs, override -k)
-k STR           taxonomic level [mOTU], Values: [kingdom, phylum, class, order, family, genus, mOTU]
Algorithm options:
-g INT           number of marker genes cutoff: 1=higher recall, 6=higher precision [3]
-l INT           min length of the alignment (bp) [75]
-t INT           number of threads [1]
-v INT           verbosity level: 1=error, 2=warning, 3=message, 4+=debugging [3]
-y STR           type of read counts [insert.scaled_counts], Values: [base.coverage, insert.raw_counts, insert.scaled_counts]
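Several of these options can be combined in a single call, for example to get a stricter, genus-level profile reported as counts. The sketch below only assembles and prints such a command instead of running it, so it can be checked before submitting a job. `build_profile_cmd` is a hypothetical helper written for this page (not part of mOTUs), and the file paths and thread count are illustrative assumptions.

```shell
# Hypothetical helper: prints a motus profile command instead of running it
build_profile_cmd() {
    sample=$1   # sample name, passed to -n
    level=$2    # taxonomic level, passed to -k
    cutoff=$3   # marker gene cutoff, passed to -g (1=higher recall, 6=higher precision)
    echo "motus profile -f data/${sample}_1.fastq -r data/${sample}_2.fastq" \
         "-n ${sample} -k ${level} -g ${cutoff} -c -t 4" \
         "-o results/motus_sr/${sample}_${level}.txt"
}

# Genus-level counts with the strictest marker gene cutoff
cmd=$(build_profile_cmd SRR17913199 genus 6)
echo "$cmd"
```

Once the printed command looks right, it can be pasted into the terminal or a job script.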
Merging profiles
If you generated several profiles for different samples, you can merge them by providing a comma-separated list of profile files like this:
motus merge -i sampleX.motus,sampleY.motus,sampleZ.motus \
-o results/motus/merged.txt
Notice: For this to work, it is best to use the -n option when running motus profile, so that each sample has a clear identifier in the abundance table.
You can also merge different profiles if they are all in the same directory like this:
motus merge -d results/motus \
-o results/motus/merged.txt
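When there are many samples, the comma-separated list for `motus merge -i` can be built in a loop. The following is a minimal sketch under stated assumptions: the sample names, the `data/<sample>_1.fastq` and `data/<sample>_2.fastq` naming scheme and the `.motus` file extension are placeholders, and the commands are printed with `echo` rather than executed.

```shell
# Hypothetical sample names; replace with your own
samples="sampleX sampleY sampleZ"
profiles=""

for s in $samples; do
    # Printed rather than executed (remove 'echo' to actually run the profiler)
    echo motus profile -f "data/${s}_1.fastq" -r "data/${s}_2.fastq" \
        -n "$s" -o "results/motus/${s}.motus"
    # Append the profile path, adding a comma only between entries
    profiles="${profiles:+$profiles,}results/motus/${s}.motus"
done

# Merge command built from the collected profile paths
echo motus merge -i "$profiles" -o results/motus/merged.txt
```

Passing each sample name to `-n` in the loop gives every column of the merged table a clear identifier.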