Streptococcus mutans is one of several members of the oral indigenous biota linked with severe early childhood caries (S-ECC). Because most humans harbor S. mutans but not all manifest disease, it has been proposed that the strains of S. mutans associated with S-ECC are genetically distinct from those found in caries-free (CF) children. The objective of this study was to identify common DNA fragments from S. mutans present in S-ECC but not in CF children. Using suppressive subtractive hybridization, we found a number of DNA fragments (biomarkers) present in 88 to 95% of the S-ECC S. mutans strains but not in CF S. mutans strains. We then applied machine learning techniques, including support vector machines and neural networks, to identify the biomarkers with the most predictive power for disease status, achieving 92% accuracy in classifying the strains as either S-ECC or CF associated. The presence of these gene fragments in 90 to 100% of the 26 S-ECC isolates tested suggests a possible functional role in the pathogenesis of S. mutans associated with dental caries.

The mutans streptococci (MS) are strongly associated with dental caries by virtue of their metabolic, ecological, and epidemiological attributes (23, 46). Among the MS, Streptococcus mutans appears to be a predominant bacterial species in the microbiota of preschool children with severe early childhood caries (S-ECC) (4-6, 49). Although the association between S. mutans and S-ECC seems convincing, most children colonized by S. mutans do not manifest the disease (8), suggesting that, among other possibilities, S. mutans strains vary in their ability to initiate caries.

In our previous study, we demonstrated that strains of S. mutans associated with S-ECC differ in genomic composition from those of caries-free (CF) controls (42). Using suppressive subtractive DNA hybridization (SSH), we identified several unique gene segments from a strain of S.
mutans (AF199) that was isolated from a child with S-ECC. The presence of unique genetic loci among S. mutans strains is consistent with the recent work of Waterhouse and Russell (51), who described "dispensable genes" distributed among strains of S. mutans. These segments include mobile genetic elements that are widely distributed in S. mutans (2) and have been shown to modulate sucrose (31) and melibiose (40) metabolism. S. mutans strains also vary in the presence of plasmids (10, 32); mutacin I, II, III, and IV operons (3, 19, 35-37); serotypic antigens (43); competence (34); the comBCD genes (28); and gtfBC (14, 48, 52), among other genetic loci. Based on the wide diversity of genotypes and genetic loci within S. mutans, different strains of S. mutans apparently comprise both common and unique genetic loci, and it seems that these differences are unequally distributed among strains (42, 51). Identifying the unique DNA fragments that are common to most of the strains isolated from S-ECC but not CF children will be important even if their fun...