, and the sequence data generally yield a consensus sequence. Here we describe valuable data that are missing from consensus sequences, variable effects on sequence data generated from nonidentical 16S rRNA amplicons, and the appearance of data displayed by different software programs. These effects are illustrated by analysis of 16S rRNA genes from 50 strains of the Bacillus cereus group, i.e., Bacillus anthracis, Bacillus cereus, Bacillus mycoides, and Bacillus thuringiensis. These species have 11 to 14 rRNA operons, and sequence variability occurs among the multiple 16S rRNA genes. A single nucleotide polymorphism (SNP) previously reported to be specific to B. anthracis was detected in some B. cereus strains. However, a different SNP, at position 1139, was identified as being specific to B. anthracis, which is a biothreat agent with high mortality rates. Compared with visual analysis of the electropherograms, basecaller software frequently missed gene sequence variations or could not identify variant bases due to overlapping basecalls. Accurate detection of 16S rRNA gene sequences that include intragenomic variations can improve discrimination among closely related species, improve the utility of 16S rRNA databases, and facilitate rapid bacterial identification by targeted DNA sequence analysis or by whole-genome sequencing performed by clinical or reference laboratories.
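The loss of information in a consensus sequence can be illustrated with a minimal sketch. It assumes the multiple 16S rRNA gene copies of one genome are already aligned, are of equal length, and contain only A, C, G, and T. The function names and the toy sequences are hypothetical and are not taken from the study. The sketch compares a majority-rule consensus, which hides minority variants, with a per-position report of intragenomic variation and a consensus written with IUPAC ambiguity codes.

```python
from collections import Counter

# IUPAC ambiguity codes for columns that contain more than one base.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def _check_aligned(copies):
    """Assumption of this sketch: all copies are aligned and equally long."""
    if len({len(c) for c in copies}) != 1:
        raise ValueError("gene copies must be aligned to the same length")


def majority_consensus(copies):
    """Return the most common base per column; minority variants are lost."""
    _check_aligned(copies)
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


def variable_positions(copies):
    """Map each 1-based position where copies differ to its base counts."""
    _check_aligned(copies)
    variable = {}
    for pos, col in enumerate(zip(*copies), start=1):
        counts = Counter(col)
        if len(counts) > 1:
            variable[pos] = dict(counts)
    return variable


def iupac_consensus(copies):
    """Return a consensus that keeps variation as IUPAC ambiguity codes."""
    _check_aligned(copies)
    return "".join(
        col[0] if len(set(col)) == 1 else IUPAC[frozenset(col)]
        for col in zip(*copies)
    )


# Hypothetical toy data: four aligned copies of a short region.
COPIES = ["ACGTAC", "ACGTAC", "ACATAC", "ACGTAT"]

if __name__ == "__main__":
    print(majority_consensus(COPIES))
    print(variable_positions(COPIES))
    print(iupac_consensus(COPIES))
```

In this toy case the majority consensus is identical to the most common copy, so the two variant positions are visible only in the per-position report and in the ambiguity-coded consensus.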
In 1977, Woese and his colleagues introduced the 16S rRNA gene sequence for phylogenetic studies and, based on that sequence, proposed a Tree of Life composed of three domains of living organisms, i.e., Archaea, Bacteria, and Eukarya (1, 2). The domain of bacteria is by far the largest and continues to expand as diverse environments are analyzed (3). Bacterial 16S rRNA genes are located within the rRNA operons, which also contain genes for 23S rRNA, 5S rRNA, tRNA, and associated intergenic spacer regions. Since rRNAs are essential for survival, these operons are expected to be found on the chromosome. However, a recent report by Anda et al. (4) described a clade within the genus Aureimonas for which the sole rRNA operon is located on a small plasmid, which suggests that there is still more to be learned about rRNA in bacterial species. Although the DNA sequences of various rRNA genes and intergenic spacer regions have been used for identification to the genus or species level, the 16S rRNA gene is usually preferred. In addition to being universally distributed among bacteria, this gene contains both highly conserved and hypervariable regions, and there are large and constantly expanding databases of 16S rRNA gene sequences for comparison.
Widespread use of DNA sequencing technologies in clinical, public health, and research laboratories has resulted in rapid and accurate molecular diagnostic methods. A bacterial isolate can now be identified more rapidly by 16S rRNA sequence analysis than by conventional methods. In addition to novel or unculturable bacteria, gene sequence analysis has been employed for identification of bacteria with unusual...