The rRNA approach is the principal tool to study microbial diversity, but it has important biases. These include polymerase chain reaction (PCR) primer bias and the relative inefficiency of DNA extraction techniques. Such sources of potential undersampling of microbial diversity are well known, but the scale of the undersampling has not been quantified. Using a marine tidal flat bacterial community as a model, we show that even with unlimited sampling and sequencing effort, a single combination of PCR primers/DNA extraction technique enables theoretical recovery of only half of the richness recoverable with three such combinations. This shows that different combinations of PCR primers/DNA extraction techniques recover in principle different species, as well as higher taxa. The majority of earlier estimates of microbial richness seem to be underestimates. The combined use of multiple PCR primer sets, multiple DNA extraction techniques, and deep community sequencing will minimize the biases and recover substantially more species than prior studies, but we caution that even this approach, which has yet to be used, may still leave an unknown number of species and higher taxa undetected.
Microbial diversity and distribution are topics of intensive research. In two companion papers in this issue, we describe the results of the Cariaco Microbial Observatory (Caribbean Sea, Venezuela). The Basin contains the largest body of marine anoxic water, and presents an opportunity to study protistan communities across biogeochemical gradients. In the first paper, we survey 18S ribosomal RNA (rRNA) gene sequence diversity using both Sanger- and pyrosequencing-based approaches, employing multiple PCR primers, and state-of-the-art statistical analyses to estimate microbial richness missed by the survey. Sampling the Basin at three stations, in two seasons, and at four depths with distinct biogeochemical regimes, we obtained the largest, and arguably the least biased, collection of over 6000 nearly full-length protistan rRNA gene sequences from a given oceanographic regime to date, and over 80 000 pyrosequencing tags. These represent all major and many minor protistan taxa, at frequencies globally similar between the two sequence collections. This large data set provided, via the recently developed parametric modeling, the first statistically sound prediction of the total size of protistan richness in a large and varied environment, such as the Cariaco Basin: over 36 000 species, defined as almost full-length 18S rRNA gene sequence clusters sharing over 99% sequence homology. This richness is a small fraction of the grand total of known protists (over 100 000-500 000 species), suggesting a degree of protistan endemism.
Microorganisms are spectacularly diverse phylogenetically, but available estimates of their species richness are vague and problematic. For example, for comparable environments, the estimated numbers of species range from a few dozen or hundreds to tens of thousands and even half a million. Such estimates provide no baseline information on either local or global microbial species richness. We argue that this uncertainty is due in large part to the way statistical tools are used, if not indeed misused, in biodiversity research. Here we develop a powerful synthetic statistical approach to quantify biodiversity. It provides statistically sound estimates of microbial richness at any level of taxonomic hierarchy. We apply this approach to a large original 16S rRNA dataset on marine bacterial diversity and show that the number of bacterial species in a sample from marine sediments is (2.4 ± 0.5 SE) × 10³. We argue that our methodology provides estimates of microbial richness that are reliable and general, have biologically meaningful SEs, and meet other fundamental statistical standards. This approach can be an essential tool in biodiversity research, and the estimates of microbial richness presented here can serve as a baseline in microbial diversity studies.
global biodiversity | microorganisms | number of species
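The abstracts above do not spell out how richness is extrapolated from sequence counts, and the parametric approaches they mention are not reproduced here. As a much simpler illustration of the general idea (estimating unseen species from how many taxa were observed only once or twice), the following minimal sketch uses the nonparametric, bias-corrected Chao1 estimator. This is a swapped-in textbook technique, not the authors' method; the function name `chao1` and the sample counts are hypothetical.

```python
from collections import Counter


def chao1(abundances):
    """Bias-corrected Chao1 lower-bound estimate of species richness.

    `abundances` holds the number of sequences observed for each taxon.
    Illustrative only: this is NOT the estimator developed in the papers.
    """
    observed = [n for n in abundances if n > 0]
    s_obs = len(observed)            # taxa actually seen in the sample
    freq = Counter(observed)
    f1 = freq[1]                     # singletons: taxa seen exactly once
    f2 = freq[2]                     # doubletons: taxa seen exactly twice
    # Many singletons relative to doubletons imply many taxa still unseen.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical sample: 7 observed taxa, 3 singletons, 2 doubletons.
counts = [10, 5, 1, 1, 1, 2, 2]
print(chao1(counts))  # 8.0
```

Estimators of this kind give a lower bound from a single sample, so they cannot correct for taxa that a given primer set or extraction technique never recovers.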