SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up-to-date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The database release described here, release 111 (July 2012), contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.
Sequencing ribosomal RNA (rRNA) genes is currently the method of choice for phylogenetic reconstruction, nucleic acid based detection and quantification of microbial diversity. The ARB software suite with its corresponding rRNA datasets has been accepted by researchers worldwide as a standard tool for large scale rRNA analysis. However, the rapid increase of publicly available rRNA sequence data has recently hampered the maintenance of comprehensive and curated rRNA knowledge databases. A new system, SILVA (from Latin silva, forest), was implemented to provide a central comprehensive web resource for up to date, quality controlled databases of aligned rRNA sequences from the Bacteria, Archaea and Eukarya domains. All sequences are checked for anomalies, carry a rich set of sequence associated contextual information, have multiple taxonomic classifications, and the latest validly described nomenclature. Furthermore, two precompiled sequence datasets compatible with ARB are offered for download on the SILVA website: (i) the reference (Ref) datasets, comprising only high quality, nearly full length sequences suitable for in-depth phylogenetic analysis and probe design and (ii) the comprehensive Parc datasets with all publicly available rRNA sequences longer than 300 nucleotides suitable for biodiversity analyses. The latest publicly available database release 91 (August 2007) hosts 547 521 sequences split into 461 823 small subunit and 85 689 large subunit rRNAs.
16S ribosomal RNA gene (rDNA) amplicon analysis remains the standard approach for the cultivation-independent investigation of microbial diversity. The accuracy of these analyses depends strongly on the choice of primers. The overall coverage and phylum spectrum of 175 primers and 512 primer pairs were evaluated in silico with respect to the SILVA 16S/18S rDNA non-redundant reference dataset (SSURef 108 NR). Based on this evaluation a selection of ‘best available’ primer pairs for Bacteria and Archaea for three amplicon size classes (100–400, 400–1000, ≥1000 bp) is provided. The most promising bacterial primer pair (S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21), with an amplicon size of 464 bp, was experimentally evaluated by comparing the taxonomic distribution of the 16S rDNA amplicons with 16S rDNA fragments from directly sequenced metagenomes. The results of this study may be used as a guideline for selecting primer pairs with the best overall coverage and phylum spectrum for specific applications, therefore reducing the bias in PCR-based microbial diversity studies.
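The in silico coverage evaluation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes exact matching of degenerate (IUPAC) primers with no mismatches allowed, and the primers and reference sequences below are short hypothetical toy strings, not real primers or SILVA data. Coverage is taken here as the fraction of reference sequences in which the forward primer and the reverse complement of the reverse primer both occur in that order.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression (exact match only)."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def pair_coverage(forward, reverse, sequences):
    """Fraction of sequences matched by both primers in the expected order."""
    fwd = primer_regex(forward)
    rev = primer_regex(reverse_complement(reverse.upper()))
    hits = 0
    for seq in sequences:
        match = fwd.search(seq)
        # The reverse primer site must lie downstream of the forward site.
        if match and rev.search(seq, match.end()):
            hits += 1
    return hits / len(sequences) if sequences else 0.0


# Hypothetical toy data, for illustration only.
toy_reference = [
    "TTACGTACAAATGATCCGG",  # both primers match
    "TTACGTGCAAATGACCCGG",  # both primers match via degenerate positions
    "TTACGTTCAAATGATCCGG",  # forward primer fails
    "TTACGTACAAATGATCAGG",  # reverse primer fails
]
coverage = pair_coverage("ACGTRC", "GGNTCA", toy_reference)
print(f"coverage: {coverage:.2f}")
```

A real evaluation would also need to handle mismatches, sequences that do not span the primer region, and per-phylum breakdowns, none of which are covered by this sketch.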
Motivation: In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. Results: In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Availability: Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license. Contact: epruesse@mpi-bremen.de Supplementary information: Supplementary data are available at Bioinformatics online.
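The k-mer searching step mentioned in this abstract can be pictured with a small Python sketch. This is not SINA's implementation: it only shows the general idea of ranking candidate reference sequences by the number of distinct k-mers they share with a query, and it leaves out the partial order alignment step entirely. The function names, the value of `k` and the toy sequences are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers contained in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=4, top=2):
    """Rank reference sequences by the number of k-mers shared with the query.

    `references` maps a name to a sequence. Ties are broken by name so the
    ordering is deterministic.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(seq, k))
        for name, seq in references.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top]


# Hypothetical toy data, for illustration only.
toy_references = {
    "ref_a": "ACGTACGTGG",
    "ref_b": "TTTTTTTTTT",
    "ref_c": "ACGTACGAAA",
}
best = rank_references("ACGTACGTGA", toy_references)
print(best)
```

In an aligner of this kind, the top-ranked references would then serve as the template against which the query is aligned; that second stage is deliberately omitted here.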
SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive resource for up-to-date quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. SILVA provides a manually curated taxonomy for all three domains of life, based on representative phylogenetic trees for the small- and large-subunit rRNA genes. This article describes the improvements the SILVA taxonomy has undergone in the last 3 years. Specifically, we focus on the curation process, the various resources used for curation and the comparison of the SILVA taxonomy with the Greengenes and RDP-II taxonomies. Our comparisons not only revealed a reasonable overlap between the taxa names, but also pointed to significant differences in both names and numbers of taxa between the three resources.
Publicly available sequence databases of the small subunit ribosomal RNA gene, also known as 16S rRNA in bacteria and archaea, are growing rapidly, and the number of entries currently exceeds 4 million. However, a unified classification and nomenclature framework for all bacteria and archaea does not yet exist. In this Analysis article, we propose rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggest a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea. Our analyses show that only nearly complete 16S rRNA sequences give accurate measures of taxonomic diversity. In addition, our analyses suggest that most of the 16S rRNA sequences of the high taxa will be discovered in environmental surveys by the end of the current decade.
Phytoplankton blooms characterize temperate ocean margin zones in spring. We investigated the bacterioplankton response to a diatom bloom in the North Sea and observed a dynamic succession of populations at genus-level resolution. Taxonomically distinct expressions of carbohydrate-active enzymes (transporters; in particular, TonB-dependent transporters) and phosphate acquisition strategies were found, indicating that distinct populations of Bacteroidetes, Gammaproteobacteria, and Alphaproteobacteria are specialized for successive decomposition of algal-derived organic matter. Our results suggest that algal substrate availability provided a series of ecological niches in which specialized populations could bloom. This reveals how planktonic species, despite their seemingly homogeneous habitat, can evade extinction by direct competition.
Summary: JSpecies Web Server (JSpeciesWS) is a user-friendly online service for calculating in silico the extent of identity between two genomes, a parameter routinely used in the process of polyphasic microbial species circumscription. The service measures the average nucleotide identity (ANI) based on BLAST+ (ANIb) and MUMmer (ANIm), as well as correlation indexes of tetra-nucleotide signatures (Tetra). In addition, it provides a Tetra Correlation Search function, which allows selected genomes to be compared rapidly against a continuously updated reference database with currently about 32 000 published whole and draft genome sequences. For comparison, users can upload their own genomes and select references from the JSpeciesWS reference database. The service indicates whether two genomes share genomic identities above or below the species-embracing thresholds, and serves as a fast way to allocate unknown genomes in the frame of the hitherto sequenced species. Availability and implementation: JSpeciesWS is available at http://jspecies.ribohost.com/jspeciesws. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: mrichter@ribocon.com
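The idea behind a correlation index of tetra-nucleotide signatures can be sketched in a few lines of Python. This is a simplified illustration, not the JSpeciesWS computation: it assumes plain relative frequencies of the 256 tetranucleotides on one strand and a Pearson correlation between the two frequency vectors, and any normalization the service applies is not reproduced. The function names and toy genomes are hypothetical.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_frequencies(genome):
    """Relative frequency of each tetranucleotide (single strand, overlapping)."""
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(genome) - 3):
        word = genome[i:i + 4]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def tetra_correlation(genome_a, genome_b):
    """Correlation of the tetranucleotide frequency vectors of two genomes."""
    return pearson(tetra_frequencies(genome_a), tetra_frequencies(genome_b))


# Hypothetical toy sequences, far shorter than any real genome.
genome_a = "ACGTACGTTAGC" * 5
genome_b = "GGGCCCAATT" * 5
r_self = tetra_correlation(genome_a, genome_a)
r_cross = tetra_correlation(genome_a, genome_b)
print(f"self: {r_self:.3f}, cross: {r_cross:.3f}")
```

Comparing a genome with itself gives a correlation of 1, while compositionally different sequences give lower values; a threshold for species-level decisions is not defined in this sketch.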