Molecular typing of Mycobacterium tuberculosis by using IS6110 shows low discrimination when there are fewer than five copies of the insertion sequence. Using a collection of such isolates from a study of the epidemiology of tuberculosis in London, we have shown a substantial degree of congruence between IS6110 patterns and both spoligotype and PGRS type. This indicates that the IS6110 types mainly represent distinct families of strains rather than arising through the convergent insertion of IS6110 into favored positions. This is supported by identification of the genomic sites of the insertion of IS6110 in these strains. The combined data enable identification of the putative evolutionary relationships of these strains, comprising three lineages broadly associated with patients born in South Asia (India and Pakistan), Africa, and Europe, respectively. These lineages appear to be quite distinct from M. tuberculosis isolates with multiple copies of IS6110.

The international standard method for typing Mycobacterium tuberculosis depends on the polymorphism detected with the insertion sequence IS6110 (17,41,47). In most populations, multicopy strains (with five or more copies of IS6110) form a substantial majority of isolates of M. tuberculosis (24,33,40,42). Isolates with only a few copies of IS6110 show much less polymorphism, necessitating the use of additional typing methods such as spoligotyping (20) or PGRS typing (7). The implied assumption behind the use of secondary typing to define clusters of low-copy-number isolates is that the combined rate of variation for the low-copy-number isolates (the product of the two molecular clocks) is equivalent to the single rate of variation of the IS6110 pattern in the multiple-copy-number strains.

The reduced polymorphism of IS6110 in low-copy-number strains is assumed to reflect the occupation of a limited number of chromosomal sites by the insertion sequence in such strains.
One hypothesis is that this is due to frequent independent transposition into the same chromosomal sites (which would be true hot spots). An alternative hypothesis is that the low degree of polymorphism arises from a lack of mobility of IS6110 in such strains (i.e., the IS6110 molecular clock operates more slowly) (5, 43). The first hypothesis would lead to the prediction that additional independent typing methods would yield widely dispersed results. However, if the second hypothesis were true, independent typing methods would be predicted to show substantial congruence.

In addition, the assumption that a limited number of chromosomal sites are occupied in low-copy-number strains needs to be tested. Several studies have shown that the overall distribution of IS6110 inserts is nonrandom, either by determining band sizes (13,25) or by examining the occurrence of inserts in specific regions of the genome (IS6110 preferential loci) (8-10, 44). Microarrays have also been used (22) to locate the insertions of IS6110 to such regions or to specific genes. For studies of evolutionary relationships, it is necess...