Classification of living organisms, and of the relationships among species, is one of the most enduring pursuits in humanity's effort to understand its place in the world. Formalized in the early 18th century by Carl Linnaeus, the phylogeny, or relationship, of plants and animals has historically been based on morphological features. With the advent of rapid and inexpensive DNA sequencing and the development of bioinformatics tools for comparing genetic sequences from different organisms, we are now in the era of molecular phylogeny. Molecular characterization of the relationships both between species and among organisms within the same species has significant implications for public health.
Phylogenetic analysis employs the cumulative similarities and weighted polymorphisms found in collections of sequences to build a graphical representation of genetic relatedness. An illustrative example of how phylogenetic analysis can be used to infer relationships is presented in Figure 1. Suppose that viral sequences from seven different infected individuals are aligned as shown on the right. One can easily observe that, while all seven viruses are identical at the majority of base-pair positions, single-nucleotide differences at two positions allow the sequences to be grouped into three phylogenetically related clusters. These monophyletic clusters are in turn linked through inferred "most recent common ancestors" at the branch points between clusters. To resolve the relative relatedness of the sequences under investigation accurately, care must be taken to select a suitable set of background sequences for inclusion in the analysis. Just as using cats as a reference for comparing different dogs would highlight only the similarities shared by all canine breeds, using only distantly related reference sequences may cause the investigational sequences to appear more closely related than they are.
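The grouping step described above can be made concrete with a minimal Python sketch. The sequences, patient labels, and function names below are hypothetical and are not the data shown in Figure 1; they are chosen only so that seven aligned sequences differ at two positions and fall into three groups. The sketch simply groups sequences that share the same bases at the variable sites. It does not infer a tree or common ancestors, which real phylogenetic software does using explicit evolutionary models.

```python
from collections import defaultdict

# Hypothetical toy alignment (NOT the data from Figure 1): seven aligned
# viral sequences that differ only at two positions.
alignment = {
    "patient_1": "ACGTACGTAC",
    "patient_2": "ACGTACGTAC",
    "patient_3": "ACGTTCGTAC",
    "patient_4": "ACGTTCGTAC",
    "patient_5": "ACGTTCGTAC",
    "patient_6": "ACGTTCGTGC",
    "patient_7": "ACGTTCGTGC",
}


def variable_sites(seqs):
    """Return the 0-based alignment columns where the sequences disagree."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def group_by_variable_sites(aln):
    """Group sequence names by their bases at the variable columns."""
    sites = variable_sites(list(aln.values()))
    clusters = defaultdict(list)
    for name, seq in aln.items():
        pattern = "".join(seq[i] for i in sites)
        clusters[pattern].append(name)
    return sites, dict(clusters)


def hamming(a, b):
    """Count the positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


sites, clusters = group_by_variable_sites(alignment)
print("variable sites:", sites)
for pattern, names in clusters.items():
    print(pattern, names)
```

In this toy alignment the third group shares one substitution with the second group and carries one more, which is the kind of nested pattern a tree-building method would represent as successive branch points.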
While the basic methods for comparing genetic sequences and quantifying the certainty of the predicted relationships were developed in the 1960s, the advent of widely available DNA sequencing has driven the development of ever more sophisticated methods. In the past few years, powerful desktop computers and freely available bioinformatics programs have allowed phylogenetics to become an important tool for increasing our understanding of the epidemiology of infections.1
Application of phylogenetic tools in the context of epidemiology has led to startling revelations and helped to answer previously unanswerable questions. The paradigm for the management of TB changed after the publication of an article in The Lancet in 1999.2 It had traditionally been assumed that people with TB disease who had no organisms seen on a sputum smear were relatively non-infectious. In this report, Behr and colleagues used DNA fingerprinting on all of the TB cases in San Francisco between 1991 and 1996 to identify clusters of related infections. Through the use of molecular epidemiology, the authors determined that nearly 20% of all TB ...