The genetic content of wild-type human cytomegalovirus was investigated by sequencing the 235 645 bp genome of a low passage strain (Merlin). Substantial regions of the genome (genes RL1-UL11, UL105-UL112 and UL120-UL150) were also sequenced in several other strains, including two that had not been passaged in cell culture. Comparative analyses, which employed the published genome sequence of a high passage strain (AD169), indicated that Merlin accurately reflects the wild-type complement of 165 genes, containing no obvious mutations other than a single nucleotide substitution that truncates gene UL128. A sizeable subset of genes exhibits unusually high variation between strains, and comprises many, but not all, of those that encode proteins known or predicted to be secreted or membrane-associated. In contrast to unpassaged strains, all of the passaged strains analysed have visibly disabling mutations in one or both of two groups of genes that may influence cell tropism. One comprises UL128, UL130 and UL131A, which putatively encode secreted proteins, and the other contains RL5A, RL13 and UL9, which are members of the RL11 glycoprotein gene family. The case in support of a lack of protein-coding potential in the region between UL105 and UL111A was also strengthened.
Deep sequencing was used to bring high resolution to the human cytomegalovirus (HCMV) transcriptome at the stage when infectious virion production is under way, and major findings were confirmed by extensive experimentation using conventional techniques. The majority (65.1%) of polyadenylated viral RNA transcription is committed to producing four noncoding transcripts (RNA2.7, RNA1.2, RNA4.9, and RNA5.0) that do not substantially overlap designated protein-coding regions. Additional noncoding RNAs that are transcribed antisense to protein-coding regions map throughout the genome and account for 8.7% of transcription from these regions. RNA splicing is more common than recognized previously, which was evidenced by the identification of 229 potential donor and 132 acceptor sites, and it affects 58 protein-coding genes. The great majority (94) of 96 splice junctions most abundantly represented in the deep-sequencing data was confirmed by RT-PCR or RACE or supported by involvement in alternative splicing. Alternative splicing is frequent and particularly evident in four genes (RL8A, UL74A, UL124, and UL150A) that are transcribed by splicing from any one of many upstream exons. The analysis also resulted in the annotation of four previously unrecognized protein-coding regions (RL8A, RL9A, UL150A, and US33A), and expression of the UL150A protein was shown in the context of HCMV infection. The overall conclusion, that HCMV transcription is complex and multifaceted, has implications for the potential sophistication of virus functionality during infection. The study also illustrates the key contribution that deep sequencing can make to the genomics of nuclear DNA viruses.
The genetic repertoire of human cytomegalovirus (HCMV; species Human herpesvirus 5) is incompletely understood.
Most bioinformatic investigations have focused on identifying open reading frames (ORFs) that are conserved in other organisms or that exhibit pattern-based similarities (e.g., in nucleotide or codon bias) to recognized protein-coding regions (CRs) (1). Our current map of the wild-type HCMV genome, based on strain Merlin, contains 166 protein-coding genes (2-5). It is entirely possible that additional, small protein-coding genes will be found. Candidates involve ORFs that overlap recognized CRs and for which there is some evidence for expression (6), ORFs highlighted in pattern-based bioinformatic (7) or proteomic analyses (8), and ORFs whose expression is presently unsuspected.
Recognition of many protein-coding genes has been supplemented by information on protein expression and function. However, HCMV also specifies polyadenylated (polyA) transcripts that, because they lack sizeable, conserved ORFs, appear unlikely to function via translation. One class consists of noncoding, nonoverlapping transcripts (NNTs) that do not substantially overlap the designated CRs of other genes. These include an abundant 2.7-kb RNA (β2.7 or RNA2.7) (9), a 1.1-kb spliced RNA and associated 5-kb stable intron (RNA5.0) (10, 11), and a 1.2-kb RNA (RNA1.2) (12). RNA...
Background: Nrd1 and Nab3 are essential sequence-specific yeast RNA binding proteins that function as a heterodimer in the processing and degradation of diverse classes of RNAs. These proteins also regulate several mRNA coding genes; however, it remains unclear exactly what percentage of the mRNA component of the transcriptome these proteins control. To address this question, we used the pyCRAC software package developed in our laboratory to analyze CRAC and PAR-CLIP data for Nrd1-Nab3-RNA interactions.
Results: We generated high-resolution maps of Nrd1-Nab3-RNA interactions, from which we have uncovered hundreds of new Nrd1-Nab3 mRNA targets, representing between 20 and 30% of protein-coding transcripts. Although Nrd1 and Nab3 showed a preference for binding near 5′ ends of relatively short transcripts, they bound transcripts throughout coding sequences and 3′ UTRs. Moreover, our data for Nrd1-Nab3 binding to 3′ UTRs were consistent with a role for these proteins in the termination of transcription. Our data also support a tight integration of Nrd1-Nab3 with the nutrient response pathway. Finally, we provide experimental evidence for some of our predictions, using northern blot and RT-PCR assays.
Conclusions: Collectively, our data support the notion that Nrd1 and Nab3 function is tightly integrated with the nutrient response and indicate a role for these proteins in the regulation of many mRNA coding genes. Further, we provide evidence to support the hypothesis that Nrd1-Nab3 represents a failsafe termination mechanism in instances of readthrough transcription.
Acellular materials of xenogenic origin are used worldwide as xenografts and Phase I trials of viable pig pancreatic islets are currently being performed. However, limited information is available on transmission of porcine endogenous retrovirus (PERV) after xenotransplantation and on the long-term immune response of recipients to xenoantigens. We analyzed the blood of burn patients who had received living pig skin dressings for up to 8 weeks for the presence of PERV as well as for the level and nature of their long term (maximum 34 years) immune response against pig antigens. Whilst no evidence of PERV genomic material or anti PERV antibody response was found, we observed a moderate increase in anti αGal antibodies and a high and sustained anti non-αGal IgG response in those patients. Antibodies against the non-human sialic acid Neu5Gc constituted the anti non-αGal response with the recognition pattern on a sialoglycan array differing from that of burn patients treated without pig skin. These data suggest that anti-Neu5Gc antibodies may represent a barrier for long-term acceptance of porcine xenografts. As anti-Neu5Gc antibodies can promote chronic inflammation, the long-term safety of living and acellular pig tissue implants in recipients warrants further evaluation.
Heterozygous mutations in the X-linked MECP2 gene cause the profound neurological disorder Rett syndrome (RTT)1. MeCP2 protein is an epigenetic reader whose binding to chromatin primarily depends on 5-methylcytosine (mC)2,3. Functionally, MeCP2 has been implicated in several cellular processes based on its reported interaction with >40 binding partners4, including transcriptional co-repressors (e.g. the NCoR/SMRT complex5), transcriptional activators6, RNA7, chromatin remodellers8,9, microRNA-processing proteins10 and splicing factors11. Accordingly, MeCP2 has been cast as a multi-functional hub that integrates diverse processes that are essential in mature neurons12. At odds with the concept of broad functionality, missense mutations that cause RTT are concentrated in two discrete clusters coinciding with interaction sites for partner macromolecules: the Methyl-CpG Binding Domain (MBD)13 and the NCoR/SMRT Interaction Domain (NID)5. Here, we test the hypothesis that the single dominant function of MeCP2 is to physically connect DNA with the NCoR/SMRT complex, by removing almost all amino acid sequences except the MBD and NID. We find that mice expressing truncated MeCP2 lacking both the N- and C-terminal regions (approximately half of the native protein) are phenotypically near-normal; and those expressing a minimal MeCP2 additionally lacking a central domain survive for over one year with only mild symptoms. This minimal protein is able to prevent or reverse neurological symptoms when introduced into MeCP2-deficient mice by genetic activation or virus-mediated delivery to the brain. Thus, despite evolutionary conservation of the entire MeCP2 protein sequence, the DNA and co-repressor binding domains alone are sufficient to avoid RTT-like defects and may therefore have therapeutic utility.
Ribosome assembly in eukaryotes involves the activity of hundreds of assembly factors that direct the hierarchical assembly of ribosomal proteins and numerous ribosomal RNA folding steps. However, detailed insights into the function of assembly factors and ribosomal RNA folding events are lacking. To address this, we have developed ChemModSeq, a method that combines structure probing, high-throughput sequencing and statistical modeling, to quantitatively measure RNA structural rearrangements during the assembly of macromolecular complexes. By applying ChemModSeq to purified 40S assembly intermediates we obtained nucleotide-resolution maps of ribosomal RNA flexibility revealing structurally distinct assembly intermediates and mechanistic insights into assembly dynamics not readily observed in cryo-electron microscopy reconstructions. We show that RNA restructuring events coincide with the release of assembly factors and predict that completion of the head domain is required before the Rio1 kinase enters the assembly pathway. Collectively, our results suggest that 40S assembly factors regulate the timely incorporation of ribosomal proteins by delaying specific folding steps in the 3′ major domain of the 20S pre-ribosomal RNA.
Background: RNA levels detected at steady state are the consequence of multiple dynamic processes within the cell. In addition to synthesis and decay, transcripts undergo processing. Metabolic tagging with a nucleotide analog is one way of determining the relative contributions of synthesis, decay and conversion processes globally.
Results: By improving 4-thiouracil labeling of RNA in Saccharomyces cerevisiae we were able to isolate RNA produced during as little as 1 minute, allowing the detection of nascent pervasive transcription. Nascent RNA labeled for 1.5, 2.5 or 5 minutes was isolated and analyzed by reverse transcriptase-quantitative polymerase chain reaction and RNA sequencing. High kinetic resolution enabled detection and analysis of short-lived non-coding RNAs as well as intron-containing pre-mRNAs in wild-type yeast. From these data we measured the relative stability of pre-mRNA species with different high turnover rates and investigated potential correlations with sequence features.
Conclusions: Our analysis of non-coding RNAs reveals a highly significant association between non-coding RNA stability, transcript length and predicted secondary structure. Our quantitative analysis of the kinetics of pre-mRNA splicing in yeast reveals that ribosomal protein transcripts are more efficiently spliced if they contain intron secondary structures that are predicted to be less stable. These data, in combination with previous results, indicate that there is an optimal range of stability of intron secondary structures that allows for rapid splicing.
Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0848-1) contains supplementary material, which is available to authorized users.
Mutations in the X-linked Cyclin-Dependent Kinase-Like 5 gene (CDKL5) cause early onset infantile spasms and subsequent severe developmental delay in affected children. Deleterious mutations have been reported to occur throughout the CDKL5 coding region. Several studies point to a complex CDKL5 gene structure in terms of exon usage and transcript expression. Improvements in molecular diagnosis and more extensive research into the neurobiology of CDKL5 and pathophysiology of CDKL5 disorders necessitate an updated analysis of the gene. In this study, we have analysed human and mouse CDKL5 transcript patterns both bioinformatically and experimentally. We have characterised the predominant brain isoform of CDKL5, a 9.7 kb transcript comprised of 18 exons with a large 6.6 kb 3’-untranslated region (UTR), which we name hCDKL5_1. In addition we describe new exonic regions and a range of novel splice and UTR isoforms. This has enabled the description of an updated gene model in both species and a standardised nomenclature system for CDKL5 transcripts. Profiling revealed tissue- and brain development stage-specific differences in expression between transcript isoforms. These findings provide an essential backdrop for the diagnosis of CDKL5-related disorders, for investigations into the basic biology of this gene and its protein products, and for the rational design of gene-based and molecular therapies for these disorders.