Prediction of protein-coding regions and other features of primary DNA sequence has greatly contributed to experimental biology. Significant challenges remain in genome annotation methods, including the identification of small or overlapping genes and the assessment of mRNA splicing or unconventional translation signals in expression. We have employed a combined analysis of compositional biases and conservation together with frame-specific G+C representation to reevaluate and annotate the genome sequences of mouse and rat cytomegaloviruses. Our analysis predicts that there are at least 34 protein-coding regions in these genomes that were not apparent in earlier annotation efforts. These include 17 single-exon genes, three new exons of previously identified genes, a newly identified four-exon gene for a lectin-like protein (in rat cytomegalovirus), and 10 probable frameshift extensions of previously annotated genes. This expanded set of candidate genes provides an additional basis for investigation in cytomegalovirus biology and pathogenesis.
Sequence analysis has been crucial to understanding the biology of cytomegalovirus (CMV) as well as other herpesviruses (13, 30). Human CMV is an important pathogen, causing neurological damage following congenital infection (37) as well as opportunistic infections in immunocompromised individuals. Models of human CMV pathogenesis and immune control have employed related betaherpesviruses that naturally infect guinea pigs (42), rats (8), and mice (21, 25). The initial annotation of a laboratory-propagated human CMV strain, AD169 (10), predicted 194 unique open reading frames (ORFs). Following this report, the genome organization has been reevaluated through correction of errors in the AD169 strain sequence (12, 31, 35, 43), recognition of mRNA splicing events (15, 39), and empirical identification of genes that had escaped annotation (3, 24, 26).
The human CMV sequence has been updated through analyses of additional strains (9, 16, 17, 33) as well as by comparison to rhesus CMV (20) and chimpanzee CMV (14) genome sequences. Several revisions of the full genome complement of natural CMV have resulted from these studies. Estimates of the number of genes in human CMV have ranged from under 150 to over 200, and the current estimate of 165 genes is considered reasonable (16). Different estimates depend on the information considered, including homology with other genes in available databases, codon bias, preservation of known protein motifs, and the presence of transcription signals (13, 32).
The annotated human CMV (HCMV) genome sequence has formed a basis for comparisons to other betaherpesviruses. Murine CMV (MCMV) (40) and rat CMV (RCMV) (45) retain obvious sequence homologs of about 80 HCMV ORFs, or roughly 50% of the annotated genes in these viruses. Non-CMV betaherpesviruses infecting humans, such as herpesvirus 6 (19) and herpesvirus 7 (36), as well as those infecting lower primates, such as herpesvirus tupaia (2), retain similar core sets of ORFs. Approximately 40 of these ...