Pyrosequencing-based 16S rRNA gene surveys are increasingly utilized to study highly diverse bacterial communities, with special emphasis on utilizing the large number of sequences obtained (tens to hundreds of thousands) for species richness estimation. However, it is not yet clear how the number of operational taxonomic units (OTUs) and, hence, species richness estimates determined using shorter fragments at different taxonomic cutoffs correlates with the number of OTUs assigned using longer, nearly complete 16S rRNA gene fragments. We constructed a 16S rRNA clone library from an undisturbed tallgrass prairie soil (1,132 clones) and used it to compare species richness estimates obtained using eight pyrosequencing candidate fragments (99 to 361 bp in length) and the nearly full-length fragment. Fragments encompassing the V1 and V2 (V1+V2) region and the V6 region (generated using primer pairs 8F-338R and 967F-1046R) overestimated species richness; fragments encompassing the V3, V7, and V7+V8 hypervariable regions (generated using primer pairs 338F-530R, 1046F-1220R, and 1046F-1392R) underestimated species richness; and fragments encompassing the V4, V5+V6, and V6+V7 regions (generated using primer pairs 530F-805R, 805F-1046R, and 967F-1220R) provided estimates comparable to those obtained with the nearly full-length fragment. These patterns were observed regardless of the alignment method utilized or the parameter used to gauge comparative levels of species richness (number of OTUs observed, slope of scatter plots of pairwise distance values for short and nearly complete fragments, and nonparametric and parametric species richness estimates). Similar results were obtained when analyzing three other datasets derived from soil, adult Zebrafish gut, and basaltic formations in the East Pacific Rise.
Regression analysis indicated that these observed discrepancies in species richness estimates within various regions could readily be explained by the proportions of hypervariable, variable, and conserved base pairs within an examined fragment.

Culture-independent 16S rRNA gene surveys are now routinely utilized to examine the microbial diversity in various environmental habitats. However, in surveys of highly diverse ecosystems, the size of clone libraries typically constructed (100 to 500 clones) allows for the identification only of members of the community that are present in high abundance (2,13,14,17,24,51). In addition to the failure to detect the rare members of the ecosystem, these relatively small datasets provide inaccurate estimates when used for computing species richness within an ecosystem. Regardless of the approach utilized to estimate species richness, the estimates obtained are highly dependent on sample size, and smaller datasets typically result in the underestimation of species richness (14,44,47,55).

The use of a pyrosequencing-based approach (40) in 16S gene-based diversity surveys promises to overcome both of the above-mentioned problems associated with inadequate sampling. The large number of 16S...