We have determined the DNA sequence of the long unique region (UL) in the genome of herpes simplex virus type 1 (HSV-1) strain 17. The UL sequence contained 107943 residues and had a base composition of 66.9% G+C. Together with our previous work, this completes the sequence of HSV-1 DNA, giving a total genome length of 152260 residues of base composition 68.3% G+C. Genes in the UL region were located by the use of published mapping analyses, transcript structures and sequence data, and by examination of DNA sequence characteristics. Fifty-six genes were identified, accounting for most of the sequence. Some 28 of these are at present of unknown function. The gene layout for UL was found to be very similar to that for the corresponding part of the genome of varicella-zoster virus, the only other completely sequenced alphaherpesvirus, and the amino acid sequences of equivalent proteins showed a range of similarities. In the whole genome of HSV-1 we now recognize 72 genes which encode 70 distinct proteins.
INTRODUCTION
In the last decade, the study of animal viruses has been revolutionized by the application of nucleic acid sequencing techniques to viral genomes. Many smaller virus genomes have been completely sequenced, and the sequences interpreted to give high resolution views of the genetic organization and the nature of the encoded proteins, while comparisons of sequences have enhanced our understanding of relationships between viruses. For larger virus genomes, total determination of nucleotide sequence remains a formidable undertaking, and only two complete sequences of virus genomes larger than 10⁵ residues have been published. These are for the gammaherpesvirus Epstein-Barr virus (EBV) of 172282 residues (Baer et al., 1984) and the alphaherpesvirus varicella-zoster virus (VZV) of 124884 residues (Davison & Scott, 1986a).
In this paper we report a third complete herpesvirus genome sequence, that of herpes simplex virus type 1 (HSV-1), which comprises 152260 residues. The molecular biology and genetics of HSV types 1 and 2 have been widely investigated such that overall they are the most extensively characterized of the family Herpesviridae. A decade ago, studies on the structure of HSV DNA showed it to be a linear molecule which could be viewed as consisting of two covalently linked segments, designated long (L) and short (S). Each segment contains a unique sequence flanked by a pair of inverted repeat sequences, as shown in Fig. 1. The long repeat (RL) and short repeat (Rs) sequences are distinct. The molecule also
The genetic content of wild-type human cytomegalovirus was investigated by sequencing the 235 645 bp genome of a low passage strain (Merlin). Substantial regions of the genome (genes RL1-UL11, UL105-UL112 and UL120-UL150) were also sequenced in several other strains, including two that had not been passaged in cell culture. Comparative analyses, which employed the published genome sequence of a high passage strain (AD169), indicated that Merlin accurately reflects the wild-type complement of 165 genes, containing no obvious mutations other than a single nucleotide substitution that truncates gene UL128. A sizeable subset of genes exhibits unusually high variation between strains, and comprises many, but not all, of those that encode proteins known or predicted to be secreted or membrane-associated. In contrast to unpassaged strains, all of the passaged strains analysed have visibly disabling mutations in one or both of two groups of genes that may influence cell tropism. One comprises UL128, UL130 and UL131A, which putatively encode secreted proteins, and the other contains RL5A, RL13 and UL9, which are members of the RL11 glycoprotein gene family. The case in support of a lack of protein-coding potential in the region between UL105 and UL111A was also strengthened.
Herpes simplex virus 1 (HSV-1) causes a chronic, lifelong infection in >60% of adults. Multiple recent vaccine trials have failed, with viral diversity likely contributing to these failures. To understand HSV-1 diversity better, we comprehensively compared 20 newly sequenced viral genomes from China, Japan, Kenya, and South Korea with six previously sequenced genomes from the United States, Europe, and Japan. In this diverse collection of passaged strains, we found that one-fifth of the newly sequenced members share a gene deletion and one-third exhibit homopolymeric frameshift mutations (HFMs). Individual strains exhibit genotypic and potential phenotypic variation via HFMs, deletions, short sequence repeats, and single-nucleotide polymorphisms, although the protein sequence identity between strains exceeds 90% on average. In the first genome-scale analysis of positive selection in HSV-1, we found signs of selection in specific proteins and residues, including the fusion protein glycoprotein H. We also confirmed previous results suggesting that recombination has occurred with high frequency throughout the HSV-1 genome. Despite this, the HSV-1 strains analyzed clustered by geographic origin during whole-genome distance analysis. These data shed light on likely routes of HSV-1 adaptation to changing environments and will aid in the selection of vaccine antigens that are invariant worldwide.
Herpes simplex virus 1 (HSV-1; species Human herpesvirus 1, genus Simplexvirus, subfamily Alphaherpesvirinae, family Herpesviridae, order Herpesvirales) is among the most successful human pathogens in terms of its global distribution, longevity in the host, and mild symptoms among the great majority of those exposed (1-4). HSV-1 is a large, enveloped DNA virus that infects lytically at epithelial surfaces and establishes a lifelong, latent infection in sensory neurons.
HSV-1 infection produces a wide range of symptoms, ranging from few or none in many seropositive individuals to periodic lesions on epithelial surfaces in a significant proportion of people and to lethal encephalitis as an extreme manifestation in a few. There is no vaccine at present (5, 6). Studies in animal models have characterized the ways in which genetic variation between viral strains can influence the symptoms of pathology, including lesion severity and rates of reactivation from latency. The most recent phase III vaccine trial for HSV failed to provide protection from infection (7, 8), and one contributing factor to this failure may well be variation among HSV isolates found in the field.
Based on early restriction fragment length polymorphism (RFLP) analyses, HSV-1 has been described as more diverse than HSV-2 (9-11). In contrast to both HSV-1 and HSV-2, the related human alphaherpesvirus, varicella-zoster virus (VZV), has relatively low interstrain diversity (12-15). Decades of research comparing RFLP bands, polypeptide size, and PCR-based sequence analysis have revealed that HSV-1 strains vary between individuals, over sequential isolates from the s...
We report the complete DNA sequence of the short repeat region in the genome of herpes simplex virus type 1, as 6633 base pairs of composition 79.5% G+C. This contains immediate early gene 3, encoding the IE175 protein, an important transcriptional activator of later virus genes. The IE175 coding region was identified as a 3894 base sequence of 81.5% G+C DNA. The base composition of this gene is thus the most extreme yet determined, and the IE175 predicted amino acid composition is correspondingly biased, most notably with an alanine content of 20.9%. Functionally important regions of the IE175 polypeptide were tentatively identified by comparison with the sequence of the homologous protein from varicella-zoster virus and from locations of ts mutations, and were correlated with properties of the amino acid sequence. Aspects of the evolution of such an extreme composition DNA sequence were discussed.
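The two composition figures in this abstract (G+C percentage of the DNA and alanine percentage of the predicted protein) are simple ratios. The following is a minimal sketch of how such values can be computed; the function names `gc_percent` and `residue_percent` and the toy sequences are hypothetical illustrations, not the authors' software or the real IE175 sequence. It may help to note that all four GCN codons encode alanine, which is one plausible reason a G+C-rich coding region tends to be alanine-rich.

```python
from collections import Counter


def gc_percent(dna: str) -> float:
    """Percentage of G and C among the unambiguous A/C/G/T bases of a DNA string."""
    counts = Counter(dna.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (counts["G"] + counts["C"]) / acgt


def residue_percent(protein: str, residue: str) -> float:
    """Percentage of one amino acid (one-letter code) in a protein string."""
    if not protein:
        raise ValueError("empty protein sequence")
    return 100.0 * protein.upper().count(residue.upper()) / len(protein)


if __name__ == "__main__":
    # Toy inputs only (assumption: invented for illustration).
    print(gc_percent("GCGGCCGCAT"))       # 8 of 10 bases are G or C -> 80.0
    print(residue_percent("AAGAL", "A"))  # 3 of 5 residues are alanine -> 60.0
```

Applied to a real sequence file, the same ratios would be expected to reproduce figures of the kind quoted above, provided ambiguous bases are handled the same way as in the original analysis.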
The gene complement of wild-type human cytomegalovirus (HCMV) is incompletely understood, on account of the size and complexity of the viral genome and because laboratory strains have undergone deletions and rearrangements during adaptation to growth in culture. We have determined the sequence (241 087 bp) of chimpanzee cytomegalovirus (CCMV) and have compared it with published HCMV sequences from the laboratory strains AD169 and Toledo, with the aim of clarifying the gene content of wild-type HCMV. The HCMV and CCMV genomes are moderately diverged and essentially collinear. On the basis of conservation of potential protein-coding regions and other sequence features, we have discounted 51 previously proposed HCMV ORFs, modified the interpretations for 24 (including assignments of multiple exons) and proposed ten novel genes. Several errors were detected in the published HCMV sequences. We presently recognize 165 genes in CCMV and 145 in AD169; this compares with an estimate of 189 unique genes for AD169 made in 1990. Our best estimate for the complement of wild-type HCMV is 164 to 167 genes.
INTRODUCTION
Human cytomegalovirus (HCMV; human herpesvirus 5) is ubiquitous and largely inapparent, but poses a risk of serious disease to those lacking a competent immune system, such as neonates, transplant patients and sufferers from AIDS (reviewed in Pass, 2001). HCMV is the prototype of subfamily Betaherpesvirinae, and is the most complex of the eight human herpesvirus species.
HCMV is isolated routinely on human fibroblast cell lines, and several strains in common laboratory use, such as AD169 and Towne, were derived by multiple passages on such cells (reviewed in Mocarski & Tan Courcelle, 2001).
The linear, double-stranded DNA genome of AD169 comprises two covalently linked segments (L and S), each consisting of a unique region (UL and US) flanked by an inverted repeat (TRL and IRL, TRS and IRS), yielding the overall genome configuration TRL-UL-IRL-IRS-US-TRS (reviewed in Mocarski & Tan Courcelle, 2001). In addition, the genome is terminally redundant, possessing a short region (the a sequence) as a direct repeat at the termini and also in inverse orientation at the IRL-IRS junction. Some genomes contain tandemly reiterated copies of the a sequence at these locations. UL and US can invert relative to each other by recombination between inverted repeats in replicating DNA, resulting in four equimolar genome arrangements in virion DNA. The complete DNA sequence of AD169 was published in a seminal paper by Chee et al. (1990), and at that time was the largest viral genome sequence available. The total genome size was 229 354 bp, with UL being 166 972 bp, US 35 418 bp, RL (a collective term for TRL and IRL) 11 247 bp, RS (TRS and IRS) 2524 bp and the a sequence (part of RL and RS in the sizes given above) 578 bp.
As a primary criterion for identifying protein-coding regions, Chee et al. (1990) focused on open reading frames (ORFs) of 100 or more contiguous amino acid-encoding codons that ov...
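The AD169 segment sizes quoted above can be checked against the stated total with a short calculation. This sketch rests on one reading of the parenthetical note about the a sequence, namely that the copy at the IRL-IRS junction is included in both the RL and RS sizes and so must be subtracted once; that interpretation, and the function name `genome_length`, are assumptions made here for illustration rather than statements from the source.

```python
# Sizes in bp as quoted in the text for HCMV strain AD169.
U_L = 166_972
U_S = 35_418
R_L = 11_247   # one copy; present twice, as TRL and IRL
R_S = 2_524    # one copy; present twice, as TRS and IRS
A_SEQ = 578    # the a sequence, already included in R_L and R_S
QUOTED_TOTAL = 229_354


def genome_length(u_l: int, u_s: int, r_l: int, r_s: int, a_seq: int) -> int:
    """Sum the segments of a TRL-UL-IRL-IRS-US-TRS genome.

    Assumption: the a sequence at the IRL-IRS junction is counted in both
    IRL and IRS, so one copy is subtracted to avoid double counting.
    """
    return u_l + u_s + 2 * r_l + 2 * r_s - a_seq


if __name__ == "__main__":
    total = genome_length(U_L, U_S, R_L, R_S, A_SEQ)
    print(total, total == QUOTED_TOTAL)  # 229354 True
```

Under that assumption the segment sizes sum exactly to the quoted 229 354 bp, which suggests the figures in the text are internally consistent.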
With the aim of deriving a definitive phylogenetic tree for as many mammalian and avian herpesvirus species as possible, alignments were made of amino acid sequences from eight conserved and ubiquitously present genes of herpesviruses, with 48 virus species each represented by at least one gene. Phylogenetic trees for both single-gene and concatenated alignments were evaluated thoroughly by maximum-likelihood methods, with each of the three herpesvirus subfamilies (the Alpha-, Beta-, and Gammaherpesvirinae) examined independently. Composite trees were constructed starting with the top-scoring tree based on the broadest set of genes and supplemented by addition of virus species from trees based on narrower gene sets, to give finally a 46-species tree; branching order for three regions within the tree remained unresolved. Sublineages of the Alpha- and Betaherpesvirinae showed extensive cospeciation with host lineages by criteria of congruence in branching patterns and consistency in extent of divergence. The Gammaherpesvirinae presented a more complex picture, with both higher and lower substitution rates in different sublineages. The final tree obtained represents the most detailed view to date of phylogenetic relationships in any family of large-genome viruses.
The Herpesviridae are a numerous family of large DNA viruses which have as their natural hosts humans, other mammals and vertebrates, and in one described case, an invertebrate (11, 16). The genomes of herpesviruses of mammals and birds clearly evince descent from a common ancestor, but with a great range of variation in terms of nucleotide substitution, gene content, and genomic arrangement (15). The Herpesviridae have been divided into three subfamilies, the Alpha-, Beta-, and Gammaherpesvirinae, initially from their distinct biological properties and latterly more precisely on the basis of their genomic attributes (16).
Over the last two decades an extensive body of herpesvirus DNA sequence data has been built up, from single-gene analyses to studies of whole genomes (in the range 120 to 240 kbp). Phylogenetic studies using herpesvirus sequences have been undertaken, demonstrating clear division into the three subfamilies and, in some sublineages, patterns of divergence consistent with cospeciation of virus and host (7, 9, 13, 14). Herpesviruses of fish (2, 3), amphibians (4), and invertebrates (A. J. Davison, personal communication) are only remotely related to the mammalian and avian viruses, while certain turtle viruses (the only reptile herpesviruses for which some sequence is known) probably group with the mammalian and avian viruses (18).
We describe in this report a major update of herpesvirus phylogenetic analysis, using the greatly increased number of gene sequences now available from a wide range of mammalian and avian herpesviruses, and enabled by advances both in processing power of modern computers and in methods for analysis of relationships among gene sequences. We aimed to produce by good current practice a single phylogenetic tree that would...