To further unravel the mechanisms responsible for attenuation of the tuberculosis vaccine Mycobacterium bovis BCG, comparative genomics was used to identify single nucleotide polymorphisms (SNPs) that differed between sequenced strains of Mycobacterium bovis and M. bovis BCG. SNPs were assayed in M. bovis isolates from France and the United Kingdom and from different BCG vaccines in order to identify those that arose during the attenuation process which gave rise to BCG. Informative data sets were obtained for 658 SNPs from 21 virulent M. bovis strains and 13 BCG strains; these SNPs showed phylogenetic clustering that was consistent with the geographical origin of the strains and previous schemes for BCG genealogies. The data revealed a closer relationship between BCG Tice and BCG Pasteur than was previously appreciated, while we were able to position BCG Beijing within a grouping of BCG Denmark-derived strains. Only 186 SNPs were identified between virulent M. bovis strains and all BCG strains, with 115 nonsynonymous SNPs affecting important functions such as global regulators, transcriptional factors, and central metabolism, which might impact on virulence. We therefore refine previous genealogies of BCG vaccines and define a minimal set of SNPs between virulent M. bovis strains and the attenuated BCG strain that will underpin future functional analyses.

Mycobacterium bovis bacillus Calmette-Guérin (BCG) is the only vaccine available against tuberculosis and is the most widely used vaccine in the world. It was derived by the repeated subculture of a strain of Mycobacterium bovis on potato slices soaked in glycerol and ox bile (10), leading to the in vitro accumulation of mutations and ultimately attenuation. Despite the widespread use of BCG, the precise genetic lesions that led to attenuation are not defined.
Furthermore, the success of BCG led to its distribution from the Institut Pasteur to laboratories around the world, each of which continued the subculturing process, thereby leading to the generation of a number of daughter strains named after their geographical origin (hence BCG Tokyo, BCG Russia, etc.). The protective efficacy of these strains has been shown to vary in both laboratory models and epidemiological studies (6, 18, 36).

As BCG is the only vaccine currently available against tuberculosis, there is a clear need to understand the molecular basis of attenuation and variable efficacy afforded by BCG. The first study that attempted to identify mutations linked to attenuation was performed by Mahairas and colleagues, who identified three deletions, RD1 to RD3, from the genome of BCG strain Connaught (39). The RD1 locus was shown to be deleted from all BCG strains but present in all virulent strains of M. bovis and Mycobacterium tuberculosis studied. Subsequent work has shown that this deletion played a major role in the attenuation of BCG (38, 46). However, complementation of BCG with RD1 does not restore virulence to wild-type levels, suggesting that other attenuating mutations exist. Indeed, all BCG...