Topoisomerase I (Top1) is a key enzyme functioning at the interface between DNA replication, transcription and mRNA maturation. Here, we show that Top1 suppresses genomic instability in mammalian cells by preventing a conflict between transcription and DNA replication. Using DNA combing and ChIP (chromatin immunoprecipitation)-on-chip, we found that Top1-deficient cells accumulate stalled replication forks and chromosome breaks in S phase, and that breaks occur preferentially at gene-rich regions of the genome. Notably, these phenotypes were suppressed by preventing the formation of RNA-DNA hybrids (R-loops) during transcription. Moreover, these defects could be mimicked by depletion of the splicing factor ASF/SF2 (alternative splicing factor/splicing factor 2), which interacts functionally with Top1. Taken together, these data indicate that Top1 prevents replication fork collapse by suppressing the formation of R-loops in an ASF/SF2-dependent manner. We propose that interference between replication and transcription represents a major source of spontaneous replication stress, which could drive genomic instability during the early stages of tumorigenesis.
Intron retention (IR) occurs when an intron is transcribed into pre-mRNA and remains in the final mRNA. We have developed a program and database called IRFinder to accurately detect IR from mRNA sequencing data. Analysis of 2573 samples showed that IR occurs in all tissues analyzed, affects over 80% of all coding genes and is associated with cell differentiation and the cell cycle. Frequently retained introns are enriched for specific RNA binding protein sites and are often retained in clusters in the same gene. IR is associated with lower protein levels, and intron-retaining transcripts that escape nonsense-mediated decay are not actively translated. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-017-1184-4) contains supplementary material, which is available to authorized users.
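As a rough illustration of how intron retention can be quantified from sequencing counts, the sketch below computes a simple retention ratio: intronic read depth divided by the sum of intronic depth and reads spliced across the intron. The formula, the function name `ir_ratio`, and its parameters are assumptions made for this example; the abstract does not state how IRFinder computes its measure.

```python
def ir_ratio(intron_depth, spliced_reads):
    """Return the fraction of transcripts that appear to retain an intron.

    intron_depth  -- read depth observed inside the intron (hypothetical input)
    spliced_reads -- reads spanning the exon-exon junction that skips the intron
    Returns None when there is no coverage at all, since the ratio is undefined.
    """
    total = intron_depth + spliced_reads
    if total == 0:
        return None
    return intron_depth / total


# Toy example: 10 units of intronic depth against 90 spliced reads.
print(ir_ratio(10, 90))
```

A ratio near 0 would suggest the intron is almost always spliced out, while a ratio near 1 would suggest it is almost always retained.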
Maintenance of genome integrity relies on surveillance mechanisms that detect and signal arrested replication forks. Although evidence from budding yeast indicates that the DNA replication checkpoint (DRC) is primarily activated by single-stranded DNA (ssDNA), studies in higher eukaryotes have implicated primer ends in this process. To identify factors that signal primed ssDNA in Saccharomyces cerevisiae, we have screened a collection of checkpoint mutants for their ability to activate the DRC, using the repression of late origins as readout for checkpoint activity. This quantitative analysis reveals that neither RFC(Rad24) and the 9-1-1 clamp nor the alternative clamp loader RFC(Elg1) is required to signal paused forks. In contrast, we found that RFC(Ctf18) is essential for the Mrc1-dependent activation of Rad53 and for the maintenance of paused forks. These data identify RFC(Ctf18) as a key DRC mediator, potentially bridging Mrc1 and primed ssDNA to signal paused forks.
Polycomb group proteins form two main complexes, PRC2 and PRC1, which generally coregulate their target genes. Here, we show that PRC1 components act as neoplastic tumor suppressors independently of PRC2 function. By mapping the distribution of PRC1 components and the histone H3K27me3 mark, we identify a large set of genes that acquire PRC1 in the absence of H3K27me3 in Drosophila larval tissues. These genes massively outnumber canonical targets and they are preeminently involved in the regulation of cell proliferation, signaling and polarity. Mutation in PRC1 components specifically deregulates this set of genes, whereas canonical targets are derepressed in both PRC1 and PRC2 mutants. In human ES cells, PRC1 components colocalize with H3K27me3 as in Drosophila embryos, whereas in differentiated cell types they are selectively recruited to a large set of proliferation and signaling-associated genes that are H3K27me3 negative, showing that the redeployment of PRC1 components during development is evolutionarily conserved.
Polycomb group (PcG) proteins form two main classes of evolutionarily conserved complexes called PRC2 and PRC1. In Drosophila, PRC2 contains the enzymatic E(Z) subunit that deposits the H3K27me3 mark, along with SU(Z)12, ESC and p55 [1-4]. A second class of PcG complex, PRC1, was shown to contain PH, PC, PSC and the SCE subunits [5]. These two complexes are recruited to their target sites by a set of DNA binding proteins, notably PHO [6]. They colocalize almost perfectly in embryogenesis [7], and their
Correspondence to: Anne-Marie.Martinez@igh.cnrs.fr and Giacomo.Cavalli@igh.cnrs.fr.
Accession codes: Drosophila ChIP-Seq and RNA-Seq data are deposited to the Gene Expression Omnibus (GEO) repository under the accession number GSE74080.
Metazoan genomes are partitioned into modular chromosomal domains containing active or repressive chromatin. In flies, Polycomb group (PcG) response elements (PREs) recruit PHO and other DNA-binding factors and act as nucleation sites for the formation of Polycomb repressive domains. The sequence specificity of PREs is not well understood. Here, we use comparative epigenomics and transgenic assays to show that Drosophila domain organization and PRE specification are evolutionarily conserved despite significant cis-element divergence within Polycomb domains, whereas cis-element evolution is strongly correlated with transcription factor binding divergence outside of Polycomb domains. Cooperative interactions of PcG complexes and their recruiting factor PHO stabilize PHO recruitment to low-specificity sequences. Consistently, PHO recruitment to sites within Polycomb domains is stabilized by PRC1. These data suggest that cooperative rather than hierarchical interactions among low-affinity sequences, DNA-binding factors, and the Polycomb machinery are giving rise to specific and strongly conserved 3D structures in Drosophila.
Polycomb group (PcG) proteins dynamically define cellular identities through epigenetic repression of key developmental genes. PcG target gene repression can be stabilized through the interaction in the nucleus at PcG foci. Here, we report the results of a high-resolution microscopy genome-wide RNAi screen that identifies 129 genes that regulate the nuclear organization of Pc foci. Candidate genes include PcG components and chromatin factors, as well as many protein-modifying enzymes, including components of the SUMOylation pathway. In the absence of SUMO, Pc foci coagulate into larger aggregates. Conversely, loss of function of the SUMO peptidase Velo disperses Pc foci. Moreover, SUMO and Velo colocalize with PcG proteins at PREs, and Pc SUMOylation affects its chromatin targeting, suggesting that the dynamic regulation of Pc SUMOylation regulates PcG-mediated silencing by modulating the kinetics of Pc binding to chromatin as well as its ability to form Polycomb foci.
Comparative analysis of high-throughput sequencing data between multiple conditions often involves mapping of sequencing reads to a reference and downstream bioinformatics analyses. Both of these steps may introduce heavy bias and potential data loss. This is especially true in studies where patient transcriptomes or genomes may vary from their references, such as in cancer. Here we describe a novel approach and associated software that make use of advances in genetic algorithms and feature selection to comprehensively explore massive volumes of sequencing data to classify and discover new sequences of interest without a mapping step and without intensive use of specialized bioinformatics pipelines. We demonstrate that our approach, called GECKO (GEnetic Classification using k-mer Optimization), is effective at classifying and extracting meaningful sequences from multiple types of sequencing approaches including mRNA, microRNA, and DNA methylome data.
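To make the mapping-free idea concrete, the following minimal sketch counts k-mers directly from raw reads and arranges them into a per-sample count table, which is the kind of feature matrix a feature-selection step could then search. It is not GECKO's implementation: the function names, the toy reads, and the choice of k are all assumptions for illustration, and the genetic algorithm itself is not shown.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_matrix(samples, k):
    """Build a {kmer: [count per sample]} table.

    samples maps a sample name to its list of reads. Columns follow the
    sorted sample names, which are returned alongside the table.
    """
    names = sorted(samples)
    per_sample = {name: count_kmers(samples[name], k) for name in names}
    all_kmers = sorted(set().union(*per_sample.values()))
    table = {kmer: [per_sample[n][kmer] for n in names] for kmer in all_kmers}
    return names, table


# Toy input (hypothetical reads, no reference genome involved).
samples = {"tumor": ["ACGTAC", "CGTACG"], "normal": ["ACGTTT"]}
names, matrix = kmer_matrix(samples, k=4)
for kmer, row in matrix.items():
    print(kmer, dict(zip(names, row)))
```

K-mers whose counts differ between conditions, such as those present only in the "tumor" column here, are the candidates a classifier would weigh; no alignment step is needed to find them.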
Motivation: Long-read sequencing technologies are invaluable for determining complex RNA transcript architectures but are error-prone. Numerous “hybrid correction” algorithms have been developed for genomic data that correct long reads by exploiting the accuracy and depth of short reads sequenced from the same sample. These algorithms are not suited for correcting more complex transcriptome sequencing data. Results: We have created a novel reference-free algorithm called TALC (Transcript-level Aware Long Read Correction) which models changes in RNA expression and isoform representation in a weighted de Bruijn graph to correct long reads from transcriptome studies. We show that transcript-level aware correction by TALC improves the accuracy of the whole spectrum of downstream RNA-seq applications and is thus necessary for transcriptome analyses that use long read technology. Availability: TALC is implemented in C++ and available at https://github.com/lbroseus/TALC. Supplementary information: Supplementary data are available at Bioinformatics online.
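For readers unfamiliar with weighted de Bruijn graphs, the sketch below builds one from short reads: nodes are (k-1)-mers, and each edge carries the number of times the corresponding k-mer was observed. It then picks the best-supported successor of a node, the basic decision a graph-based corrector makes when choosing among alternative paths. This is a generic illustration under assumed names and toy data, not TALC's algorithm, which additionally accounts for expression and isoform changes.

```python
from collections import defaultdict


def build_weighted_dbg(reads, k):
    """Build a weighted de Bruijn graph from reads.

    Returns graph[prefix][suffix] = number of times the k-mer joining the
    two (k-1)-mers was seen in the reads.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def heaviest_successor(graph, node):
    """Return the successor of node with the highest edge weight.

    Ties are broken alphabetically; None is returned for unknown nodes.
    """
    successors = graph.get(node)
    if not successors:
        return None
    return max(sorted(successors), key=successors.get)


# Toy short reads (hypothetical): two support ...CGTA, one supports ...CGTC.
short_reads = ["ACGTA", "ACGTA", "ACGTC"]
graph = build_weighted_dbg(short_reads, k=4)
print(heaviest_successor(graph, "CGT"))
```

Always following the heaviest edge would favor the most abundant isoform, which is why a plain genomic corrector can erase low-abundance transcript variants; the abstract's point is that transcriptome data needs a weighting scheme aware of this.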