2022
DOI: 10.1128/msystems.01408-21

Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

Abstract: The NCBI-nr database is not explicitly designed for microbiome analysis, and its increasing size makes it unwieldy and computationally expensive for this purpose. The AnnoTree protein database is only one-quarter the size of the full NCBI-nr database and is explicitly designed for metagenomic analysis, so it should be supported by alignment-based pipelines.

Cited by 23 publications (20 citation statements)
References 46 publications
“…The persistent genomes of species group A and B comprise 4,129 and 4,014 CDSs respectively or 79.93% and 76.63% of their mean gene number (Supplementary data). The detected coding sequences were further compared to the PlaBA-db and Annotree databases (Gautam et al, 2022; Mendler et al, 2019; Patz et al, 2021, 2024). In both cases, gene function does segregate the different strains into the previously determined species groups, suggesting different metabolic capacities and different ecologies (Figure 1F & 1G).…”
Section: Strains Affiliated To P Polymyxa Segregate Into Distinct Spe... (mentioning)
confidence: 86%
“…In addition, the sequences of proteins such as RdRp (RNA-directed RNA polymerase), Rep (replication-associated protein) and NS1 (non-structural protein) were also downloaded from the RefSeq database to align contigs with sequence length > 1500 bp. The rma2info program built into MEGAN6 [30] was used to perform taxonomic identification. Putative open reading frames (ORFs) were predicted by Geneious Prime with built-in parameters (Minimum size: 100) [27], and were further checked through comparing to related viruses.…”
Section: Methods (mentioning)
confidence: 99%
“…As an additional control, the HQ reads were analyzed using the DIAMOND+MEGAN pipeline [25, 26]. DIAMOND [27] blastx mode (--top 10 -f 100 -b24 -c1) was used to align the reads against the NCBI-nr database, and then the daa-meganizer tool (MEGAN Ultimate Edition version 6.22.1; -mdb megan-map-Feb2021-ue.db) [28, 29] was used to perform taxonomic binning. Then, the daa2info tool was used to check the alignment (-c2c Taxonomy -u true -n true), establishing that a high percentage of reads were aligned to S. cerevisiae.…”
Section: Methods (mentioning)
confidence: 99%
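
The three steps quoted in the last statement (DIAMOND blastx alignment, meganization, summary) can be laid out as command lines. The following is a minimal Python sketch that only assembles the argument lists and does not run anything. The option values are the ones quoted in the statement; the function name `build_pipeline_commands`, the file names, and the input/output options (`-q`, `-d`, `-o`, `-i`) are assumptions added for illustration and do not come from the citing paper.

```python
import shlex
from typing import List


def build_pipeline_commands(
    reads: str, database: str, daa: str, mapping_db: str
) -> List[List[str]]:
    """Assemble (but do not execute) the three DIAMOND+MEGAN commands.

    All file paths are hypothetical placeholders supplied by the caller.
    """
    # Step 1: align reads against a protein reference in blastx mode.
    # "-f 100" requests DAA output; "-b" and "-c" are written here with
    # separated values, equivalent to the "-b24 -c1" quoted in the statement.
    diamond = [
        "diamond", "blastx",
        "-q", reads, "-d", database, "-o", daa,
        "--top", "10", "-f", "100", "-b", "24", "-c", "1",
    ]
    # Step 2: taxonomic binning of the DAA file with a MEGAN mapping database.
    meganize = ["daa-meganizer", "-i", daa, "-mdb", mapping_db]
    # Step 3: report class-to-count for the Taxonomy classification.
    summarize = [
        "daa2info", "-i", daa,
        "-c2c", "Taxonomy", "-u", "true", "-n", "true",
    ]
    return [diamond, meganize, summarize]


if __name__ == "__main__":
    # Example file names are placeholders, not files from the cited study.
    for cmd in build_pipeline_commands(
        "reads.fastq", "nr.dmnd", "reads.daa", "megan-map-Feb2021-ue.db"
    ):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without DIAMOND or MEGAN installed; passing each list to `subprocess.run` would be one way to execute them.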