Pervasive and hidden transcription is widespread in eukaryotes, but its global level, the mechanisms from which it originates and its functional significance are unclear. Cryptic unstable transcripts (CUTs) were recently described as a principal class of RNA polymerase II transcripts in Saccharomyces cerevisiae. These transcripts are targeted for degradation immediately after synthesis by the action of the Nrd1-exosome-TRAMP complexes. Although CUT degradation mechanisms have been analysed in detail, the genome-wide distribution of CUTs at nucleotide resolution and their prevalence are unknown. Here we report the first high-resolution genomic map of CUTs in yeast, revealing a class of potentially functional CUTs and the intrinsic bidirectional nature of eukaryotic promoters. An RNA fraction highly enriched in CUTs was analysed by a 3' Long-SAGE (serial analysis of gene expression) approach adapted to deep sequencing. The resulting detailed genomic map of CUTs revealed that they derive from extremely widespread and very well defined transcription units and do not result from unspecific transcriptional noise. Moreover, the transcription of CUTs predominantly arises within nucleosome-free regions, most of which correspond to promoter regions of bona fide genes. Some of the CUTs start upstream from messenger RNAs and overlap their 5' end. Our study of glycolysis genes, together with recent results from the literature, indicates that such concurrent transcription is potentially associated with regulatory mechanisms. Our data reveal numerous new CUTs with such a potential regulatory role. However, most of the identified CUTs corresponded to transcripts divergent from the promoter regions of genes, indicating that they represent by-products of divergent transcription occurring at many and possibly most promoters. Eukaryotic promoter regions are thus intrinsically bidirectional, a fundamental property that escaped previous analyses because in most cases divergent transcription generates short-lived unstable transcripts present at very low steady-state levels.
Despite intense investigation, human replication origins and termini remain elusive. Existing data have shown strong discrepancies. Here we sequenced highly purified Okazaki fragments from two cell types and, for the first time, quantitated replication fork directionality and delineated initiation and termination zones genome-wide. Replication initiates stochastically, primarily within non-transcribed, broad (up to 150 kb) zones that often abut transcribed genes, and terminates dispersively between them. Replication fork progression is significantly co-oriented with transcription. Initiation and termination zones are frequently contiguous, sometimes separated by regions of unidirectional replication. Initiation zones are enriched in open chromatin and enhancer marks, even when not flanked by genes, and often border 'topologically associating domains' (TADs). Initiation zones are enriched in origin recognition complex (ORC)-binding sites and align better to origins previously mapped using bubble-trap than to those mapped using λ-exonuclease. This novel panorama of replication reveals how chromatin and transcription modulate the initiation process to create cell-type-specific replication programs.
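As an illustration of how fork directionality might be quantitated from strand-specific Okazaki fragment counts, here is a minimal sketch. The abstract does not give a formula; the definition RFD = (C - W) / (C + W), with C and W the read counts on the Crick and Watson strands in a genomic bin, is an assumption, as are the function names `fork_directionality` and `ascending_steps` and the toy counts. Under this assumed convention, stretches where RFD rises along the chromosome would be candidate initiation zones and stretches where it falls would be candidate termination zones.

```python
from typing import List, Optional


def fork_directionality(crick: List[int], watson: List[int]) -> List[Optional[float]]:
    """Per-bin RFD = (C - W) / (C + W); None for bins without reads.

    Assumed convention (not stated in the abstract): values range from
    -1 to +1 and summarize the net direction of forks in each bin.
    """
    if len(crick) != len(watson):
        raise ValueError("crick and watson must have the same number of bins")
    rfd: List[Optional[float]] = []
    for c, w in zip(crick, watson):
        total = c + w
        rfd.append((c - w) / total if total else None)
    return rfd


def ascending_steps(rfd: List[Optional[float]]) -> List[int]:
    """Indices i where RFD increases from bin i to bin i + 1.

    Pairs that include an empty bin (None) are skipped.
    """
    steps = []
    for i in range(len(rfd) - 1):
        a, b = rfd[i], rfd[i + 1]
        if a is not None and b is not None and b > a:
            steps.append(i)
    return steps


if __name__ == "__main__":
    # Toy counts for three adjacent bins (hypothetical data).
    profile = fork_directionality([10, 30, 90], [90, 70, 10])
    print(profile)
    print(ascending_steps(profile))
```

A real analysis would also need smoothing and a segmentation step to call zone boundaries; the sketch only shows the per-bin ratio and the direction of change between neighbouring bins.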
Non-coding (nc)RNAs are key players in numerous biological processes such as gene regulation, chromatin domain formation and genome stability. Large ncRNAs interact with histone modifiers and are involved in cancer development, X-chromosome inactivation and autosomal gene imprinting. However, despite recent evidence showing that pervasive transcription is more widespread than previously thought, only a few examples mediating gene regulation in eukaryotes have been described. In Saccharomyces cerevisiae, the bona fide regulatory ncRNAs are destabilized by the Xrn1 5'-3' RNA exonuclease (also known as Kem1), but the genome-wide characterization of the entire regulatory ncRNA family remains elusive. Here, using strand-specific RNA sequencing (RNA-seq), we identify a novel class of 1,658 Xrn1-sensitive unstable transcripts (XUTs), of which 66% are antisense to open reading frames. These transcripts are polyadenylated and RNA polymerase II (RNAPII)-dependent. The majority of XUTs strongly accumulate in lithium-containing media, indicating that they might have a role in adaptive responses to changes in growth conditions. Notably, RNAPII chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) analysis of Xrn1-deficient strains revealed a significant decrease of RNAPII occupancy over 273 genes with antisense XUTs. These genes show an unusual bias for H3K4me3 marks and require the Set1 histone H3 lysine 4 methyl-transferase for silencing. Furthermore, abolishing H3K4me3 triggers the silencing of other genes with antisense XUTs, supporting a model in which H3K4me3 antagonizes antisense ncRNA repressive activity. Our results demonstrate that antisense ncRNA-mediated regulation is a general regulatory pathway for gene expression in S. cerevisiae.
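To make the notion of a transcript being antisense to an open reading frame concrete, here is a minimal sketch of one possible classification rule. It is not the authors' pipeline: the `Feature` record, the half-open coordinate convention, the rule that any overlap on the opposite strand counts, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """A stranded genomic interval (hypothetical record layout)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def is_antisense(transcript: Feature, orfs: List[Feature]) -> bool:
    """True if the transcript overlaps any ORF on the opposite strand."""
    return any(
        orf.chrom == transcript.chrom
        and orf.strand != transcript.strand
        and transcript.start < orf.end
        and orf.start < transcript.end
        for orf in orfs
    )


def antisense_fraction(transcripts: List[Feature], orfs: List[Feature]) -> float:
    """Fraction of transcripts that are antisense to at least one ORF."""
    if not transcripts:
        return 0.0
    return sum(is_antisense(t, orfs) for t in transcripts) / len(transcripts)


if __name__ == "__main__":
    # Toy annotation (hypothetical coordinates).
    orfs = [Feature("chrI", 100, 500, "+"), Feature("chrI", 900, 1200, "-")]
    transcripts = [
        Feature("chrI", 300, 700, "-"),    # overlaps first ORF, opposite strand
        Feature("chrI", 950, 1000, "-"),   # overlaps second ORF, same strand
        Feature("chrI", 600, 800, "+"),    # intergenic
    ]
    print(antisense_fraction(transcripts, orfs))
```

A stricter rule, for example requiring a minimum overlap length, would change the reported fraction, so the threshold is a choice that has to be stated alongside any such percentage.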
Long non-protein coding RNAs (npcRNA) represent an emerging class of riboregulators, which either act directly in this long form or are processed to shorter miRNA and siRNA. Genome-wide bioinformatic analysis of full-length cDNA databases identified 76 Arabidopsis npcRNAs. Fourteen npcRNAs were antisense to protein-coding mRNAs, suggesting cis-regulatory roles. Numerous 24-nt siRNAs matched five different npcRNAs, suggesting that these npcRNAs are precursors of this type of siRNA. Expression analyses of the 76 npcRNAs identified a novel npcRNA that accumulates in a dcl1 mutant but does not appear to produce trans-acting siRNA or miRNA. Additionally, another npcRNA was the precursor of miR869 and was shown to be up-regulated in dcl4 but not in dcl1 mutants, indicative of a young miRNA gene. Abiotic stress altered the accumulation of 22 npcRNAs among the 76, a fraction significantly higher than that observed for the RNA binding protein-coding fraction of the transcriptome. Overexpression analyses in Arabidopsis identified two npcRNAs as regulators of root growth during salt stress and leaf morphology, respectively. Hence, together with small RNAs, long npcRNAs encompass a sensitive component of the transcriptome that has diverse roles during growth and differentiation.
[Supplemental material is available online at www.genome.org.]
Non-protein coding RNAs (npcRNAs) are a class of RNAs that do not encode proteins; instead, their function lies in the RNA molecule itself. They are a heterogeneous group and have been divided into different classes according to their length and function. With respect to length, npcRNAs can range from 20 to 27 nucleotides (nt) for the families of microRNAs (miRNAs) and small interfering RNAs (siRNAs), 20-300 nt for small RNAs commonly found as transcriptional and translational regulators, or up to and beyond 10,000 nt for medium and large RNAs involved in other processes, including splicing, gene inactivation, and translation (Costa 2007). We use the term non-protein-coding RNAs instead of noncoding RNAs as every sequence has the potential to be coding, and certain large npcRNAs might encode small oligopeptides, which could be translated under specific conditions as shown for a pentapeptide located inside rRNA, a canonical RNA in Escherichia coli (Tenson et al. 1996). In recent years, numerous novel npcRNA candidates have been identified in a variety of organisms from E. coli to Homo sapiens (Argaman et al. 2001; Storz et al. 2004; Washietl et al. 2005).
Several strategies have been employed to detect and discover novel npcRNAs, including both experimental and computational screenings (Huttenhofer et al. 2002). Genomic approaches, such as tiling arrays and systematic sequencing of full-length cDNA libraries, in model organisms have recently revealed that much larger portions of eukaryote transcriptomes represent nonprotein-coding transcripts than previously believed (Okazaki et al. 2002; Numata et al. 2003; Rinn et al. 2003; Ota et al. 2004; Chekanova et al. 2007). Diverse npcRNAs, including a surpris...
The mRNA of bacteriophage T4 contains a strikingly abundant intercistronic hairpin. Within the 55 kilobases of known T4 sequence, the hexanucleotide sequence CTTCGG is found 13 times in the DNA strand equivalent to mRNA sequences. In 12 of those occurrences, the sequence is flanked by inverted repeats predictive of RNA hairpins with UUCG in the loop. Avian myeloblastosis virus reverse transcriptase, which can traverse hairpins of larger calculated stability, terminates efficiently at these CUUCGG hairpins. Thermal denaturation studies of model hairpins show that the loop sequence UUCG dramatically stabilizes RNA hairpins when compared to a control sequence. These data, when combined with previously described parameters of helix stability, suggest that T4 has utilized this loop sequence to optimize the stability of intercistronic hairpins.
Oligonucleotide primers were synthesized, purified, labeled at the 5' end with 32P, and used in reactions with avian myeloblastosis virus (AMV) reverse transcriptase as described (6) or in RNA blot analysis of Ml RNA as described by Maniatis et al. (7).
Computer Searches. Most of the T4 sequences were in GenBank release 42.0. The sequences for gene 39, pseT, and the sequence including gene 52 and the end of rIIB (SaA9) have been published (8, 9, 36). All sequence manipulations and searches in Boulder used programs from the Delila system (10, 11). The search program has been modified to allow searches for relationships between bases, such as complementarity (12). Sequence manipulations and searches in Paris used data bases from the CITI-2 facility.
Thermal Denaturation of RNA Hairpins. Experiments were conducted in 0.1 M NaCl/10 mM Na2HPO4/0.1 mM EDTA, pH 7.0. The absorbance was measured at 240 nm. Absorbance-temperature profiles were independent of RNA concentration between 1 and 20 µM, indicating that hairpin loops rather than dimers had formed. This was confirmed by showing that the RNA oligomers migrated as monomers on an Altex (Berkeley, CA) Spherogel-TSK exclusion column run at 25°C in the same buffer. The thermal denaturation experiments were run in 0.1 M NaCl buffer because, in 1 M NaCl, the hyperchromic transition of the CUUCGG hairpin occurred at a temperature too high to be determined. The absorbance values were used to extrapolate upper and lower baselines, from which the fraction of helix versus temperature was calculated as described by Uhlenbeck et al. (13).
RESULTS
CTTCGG Predicts
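The search described under Computer Searches, which looks for CTTCGG flanked by inverted repeats, could be sketched roughly as follows. This is not the Delila program: the function names, the `min_stem` threshold, the restriction to strict Watson-Crick pairs (no G-U wobble), and the toy sequences are all assumptions. The sketch treats TTCG as the loop, takes the outer C and G of the hexanucleotide as the closing base pair, and extends the stem outward for as long as the flanking bases remain complementary.

```python
from typing import List, Tuple

# Watson-Crick complements in DNA letters (the mRNA-equivalent strand).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def stem_length(seq: str, left: int, right: int) -> int:
    """Count consecutive complementary pairs, moving outward from (left, right)."""
    pairs = 0
    while left >= 0 and right < len(seq) and COMPLEMENT.get(seq[left]) == seq[right]:
        pairs += 1
        left -= 1
        right += 1
    return pairs


def find_cttcgg_hairpins(seq: str, min_stem: int = 4) -> List[Tuple[int, int, bool]]:
    """Return (position, stem length, passes threshold) for each CTTCGG site.

    The stem length includes the closing C-G pair of the hexanucleotide.
    min_stem is a hypothetical cutoff for calling a site a predicted hairpin.
    """
    seq = seq.upper()
    hits = []
    start = seq.find("CTTCGG")
    while start != -1:
        stem = stem_length(seq, start, start + 5)
        hits.append((start, stem, stem >= min_stem))
        start = seq.find("CTTCGG", start + 1)
    return hits


if __name__ == "__main__":
    # Toy sequences (hypothetical): one with an inverted repeat, one without.
    print(find_cttcgg_hairpins("TTGGACCTTCGGGTCCTT"))
    print(find_cttcgg_hairpins("AAAACTTCGGAAAA"))
```

Counting base pairs is only a proxy for stability; a fuller treatment would score each candidate stem with nearest-neighbour free energy parameters rather than a fixed length cutoff.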
Neutral nucleotide substitutions occur at varying rates along genomes, and it remains a major issue to unravel the mechanisms that cause these variations and to analyze their evolutionary consequences. Here, we study the role of replication in the neutral substitution pattern. We obtained a high-resolution replication timing profile of the whole human genome by massively parallel sequencing of nascent BrdU-labeled replicating DNA. These data were compared to the neutral substitution rates along the human genome, obtained by aligning human and chimpanzee genomes using macaque and orangutan as outgroups. All substitution rates increase monotonically with replication timing even after controlling for local or regional nucleotide composition, crossover rate, distance to telomeres, and chromatin compaction. The increase in non-CpG substitution rates might result from several mechanisms including the increase in mutation-prone activities or the decrease in efficiency of DNA repair during the S phase. In contrast, the rate of C→T transitions in CpG dinucleotides increases in later-replicating regions due to increasing DNA methylation level that reflects a negative correlation between timing and gene expression. Similar results are observed in the mouse, which indicates that replication timing is a main factor affecting nucleotide substitution dynamics at non-CpG sites and constitutes a major neutral process driving mammalian genome evolution.
Rhizobium meliloti can interact symbiotically with Medicago plants, thereby inducing root nodules. However, certain Medicago plants can form nodules spontaneously, in the absence of rhizobia. A differential screening was performed using spontaneous nodule versus root cDNAs from Medicago sativa ssp. varia. Transcripts of a differentially expressed clone, Msenod40, were detected in all differentiating cells of nodule primordia and spontaneous nodules, but were absent in fully differentiated cells. Msenod40 showed homology to a soybean early nodulin gene, Gmenod40, although no significant open reading frame (ORF) or coding capacity was found in the Medicago sequence. Furthermore, in the sequences of cDNAs and a genomic clone (Mtenod40) isolated from Medicago truncatula, a species containing a unique copy of this gene, no ORFs were found either. In vitro translation of purified Mtenod40 transcripts did not reveal any protein product. Evaluation of the RNA secondary structure indicated that both Msenod40 and Gmenod40 transcripts showed a high degree of stability, a property shared with known non-coding RNAs. The Mtenod40 RNA was localized in the cytoplasm of cells in the nodule primordium. Infection with Agrobacterium tumefaciens strains bearing antisense constructs of Mtenod40 arrested callus growth of Medicago explants, while embryos overexpressing Mtenod40 developed into teratomas. These data suggest that the enod40 genes might have a role in plant development, acting as 'riboregulators', a novel class of untranslated RNAs associated with growth control and differentiation.
In higher eukaryotes, replication program specification in different cell types remains to be fully understood. We show for seven human cell lines that about half of the genome is divided into domains that display a characteristic U-shaped replication timing profile with early initiation zones at borders and late replication at centers. Significant overlap is observed between U-domains of different cell lines and also with germline replication domains exhibiting an N-shaped nucleotide compositional skew. From the demonstration that the average fork polarity is directly reflected by both the compositional skew and the derivative of the replication timing profile, we argue that the N-shape displayed by this derivative in U-domains supports the existence of large-scale gradients of replication fork polarity in somatic and germline cells. Analysis of chromatin interaction (Hi-C) and chromatin marker data reveals that U-domains correspond to high-order chromatin structural units. We discuss possible models for replication origin activation within U/N-domains. The compartmentalization of the genome into replication U/N-domains provides new insights on the organization of the replication program in the human genome.