Here, we describe a multigenomic DNA sequence-analysis tool, EVOPRINTER, that facilitates the rapid identification of evolutionarily conserved sequences within the context of a single species. The EVOPRINTER output identifies multispecies-conserved DNA sequences as they exist in a reference DNA. This identification is accomplished by superimposing multiple reference DNA vs. test-genome pairwise BLAT (BLAST-like alignment tool) readouts of the reference DNA to identify conserved nucleotides that are shared by all orthologous DNAs. EVOPRINTER analysis of well-characterized genes reveals that most, if not all, of the conserved sequences are essential for gene function. For example, analysis of orthologous genes that are shared by many vertebrates identifies conserved DNA in both protein-encoding sequences and noncoding cis-regulatory regions, including enhancers and mRNA microRNA binding sites. In Drosophila, the combined mutational histories of five or more species afford near-base-pair resolution of conserved transcription factor DNA-binding sites, and essential amino acids are revealed by the nucleotide flexibility of their codon-wobble position(s). Conserved small peptide-encoding genes, which had been undetected by conventional gene-prediction algorithms, are identified by the codon-wobble signatures of invariant amino acids. Also, EVOPRINTER allows one to assess the degree of evolutionary divergence between orthologous DNAs by highlighting differences between a selected species and the other test species.

comparative genomics | evolution | gene structure and function

Deciphering the regulatory mechanisms that control coordinate gene expression is a long-standing goal of biology. The comparison of orthologous DNA sequences from multiple vertebrate or invertebrate species promises to identify the cis-regulatory elements that are central to the dynamic interplay between a gene and its transcriptional regulators (1-3).
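The superimposition of pairwise readouts described in the abstract can be illustrated with a minimal sketch. It is not the EVOPRINTER implementation: the function name `superimpose` and the input encoding are assumptions made for illustration. Each pairwise readout is assumed to be a copy of the reference sequence in which a base is uppercase if it aligned identically to one test genome and lowercase otherwise; a base is reported as conserved only if it is uppercase in every readout.

```python
def superimpose(reference, readouts):
    """Mark reference bases conserved in every pairwise readout.

    Hypothetical encoding: each readout is the reference sequence with
    uppercase letters where the test genome matched and lowercase
    letters where it did not.
    """
    if not readouts:
        raise ValueError("at least one pairwise readout is required")
    ref = reference.upper()
    for readout in readouts:
        if len(readout) != len(ref):
            raise ValueError("each readout must cover the full reference")

    out = []
    for i, base in enumerate(ref):
        # A base survives only if every test genome conserved it.
        conserved = all(readout[i].isupper() for readout in readouts)
        out.append(base if conserved else base.lower())
    return "".join(out)


reference = "ATGCCGTAAT"
readouts = [
    "ATGCcgTAAT",  # reference vs. test genome 1
    "ATGccGTAat",  # reference vs. test genome 2
    "aTGCCGTAAT",  # reference vs. test genome 3
]
print(superimpose(reference, readouts))  # aTGccgTAat
```

In this toy input, only positions conserved in all three readouts remain uppercase, so adding test genomes can only shrink the conserved set, which is why combining more species sharpens the resolution.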
This cross-species comparison, termed phylogenetic footprinting, is based on the hypothesis that functionally important sequences evolve at a significantly slower rate than nonfunctional DNA (1). Phylogenetic footprinting has been used successfully to discover multispecies-conserved sequences (MCSs) that are critical for gene function (reviewed in refs. 2, 4, and 5). An essential first step in this process is the alignment of multiple orthologous DNAs. Multisequence-alignment programs include THREADED BLOCKSET ALIGNER (6), FOOTPRINTER (7), CONREAL (5), and PHYME (8). The multi-DNA alignments are accomplished by either simultaneous or sequential pairwise alignments of input DNAs, with alignment gaps introduced to optimize the overall homology comparisons.

Individual genome searches have also been commonly used to initiate MCS searches, and two popular whole-genome search algorithms are BLAST (9) and BLAT (BLAST-like alignment tool) (10). One significant difference between the BLAST and BLAT algorithms is that BLAT keeps an index of a species genome in memory and uses this index to scan linearly through ...