2022
DOI: 10.3389/fbinf.2022.905489

16S-ITGDB: An Integrated Database for Improving Species Classification of Prokaryotic 16S Ribosomal RNA Sequences

Abstract: Analyzing 16S ribosomal RNA (rRNA) sequences allows researchers to elucidate the prokaryotic composition of an environment. In recent years, third-generation sequencing technology has provided opportunities for researchers to perform full-length sequence analysis of bacterial 16S rRNA. RDP, SILVA, and Greengenes are the most widely used 16S rRNA databases. Many 16S rRNA classifiers have used these databases as a reference for taxonomic assignment tasks. However, some of the prokaryotic taxonomies only exist in…

Cited by 7 publications (11 citation statements)
References 54 publications
“…Reads under 250 bp or exceeding a total expected error threshold of 0.5 were excluded using USEARCH. Next, Mothur’s (v1.48.0) commands ‘classify.seqs’ and ‘remove.lineage’ were used in combination with the 16S ITGDB database, a database for taxonomic classification of 16S rRNA sequences integrating sequences from RDP (version NO.18 trainset), SILVA (version 138) and Greengenes (version 13_8) [ 41 ], to eliminate potential mitochondrial, chloroplast and other non-target sequences. The remaining bacterial sequences were clustered into zero-radius operational taxonomic units (zOTUs [ 42 ], also known as amplicon sequence variants (ASVs) [ 43 ]) using the UNOISE3 algorithm as implemented in USEARCH [ 40 ].…”
Section: Methods (mentioning)
confidence: 99%
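The length and expected-error filter described in the statement above can be sketched in a few lines. This is a minimal Python illustration, not the USEARCH implementation: it assumes Phred+33 encoded quality strings, and the function names `expected_errors` and `keep_read` are hypothetical. The expected number of errors in a read is taken as the sum of the per-base error probabilities, 10^(-Q/10).

```python
def expected_errors(quality: str, phred_offset: int = 33) -> float:
    """Sum of per-base error probabilities for a FASTQ quality string.

    Assumes Phred+33 encoding by default (an assumption of this sketch).
    """
    return sum(10 ** (-(ord(char) - phred_offset) / 10) for char in quality)


def keep_read(sequence: str, quality: str,
              min_length: int = 250, max_ee: float = 0.5) -> bool:
    """Return True if the read passes both filters from the cited methods.

    Reads shorter than min_length, or with total expected errors above
    max_ee, are excluded.
    """
    if len(sequence) < min_length:
        return False
    return expected_errors(quality) <= max_ee


if __name__ == "__main__":
    # Hypothetical read: 250 bases, all at Q40 ('I' in Phred+33).
    seq = "A" * 250
    qual = "I" * 250
    print(keep_read(seq, qual))
```

A read of 250 bases at Q40 throughout has an expected error count of about 0.025 and passes, while the same read at Q20 throughout has about 2.5 expected errors and is excluded.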
“…The algorithm used to merge the Greengenes, SILVA, RDP, and vaginal processed databases was based on the integration algorithms proposed by Hsieh et al ( 10 ). The algorithm took two databases as inputs and integrated them as follows ( Fig.…”
Section: Methods (mentioning)
confidence: 99%
“…The taxonomic nomenclature of the GSR database has been unified to guarantee the coherence of annotations. Its performance has been compared with Greengenes, Greengenes2 ( 9 ), GTDB, SILVA, and RDP databases and other existing integrated databases, including ITGDB ( 10 ) and MetaSquare ( 11 ). The GSR database is available for full-length 16S sequences and the most commonly used hypervariable regions: V4, V1–V3, V3–V4, and V3–V5.…”
Section: Introduction (mentioning)
confidence: 99%
“…Merging algorithm. The algorithm used to merge the Greengenes, SILVA, RDP, and vaginal processed databases was based on the integration algorithms proposed by Hsieh et al (9).…”
Section: Creation of GSR Database (mentioning)
confidence: 99%
“…The taxonomic nomenclature of the GSR database has been unified to guarantee the coherence of annotations. Its performance has been compared with Greengenes, GTDB, SILVA, and RDP databases and other existing integrated databases, including ITGDB (9) and MetaSquare (10). The GSR database is available for full-length 16S sequences and the most commonly used hypervariable regions: V4, V1-V3, V3-V4, and V3-V5.…”
Section: Introduction (mentioning)
confidence: 99%