To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
Animal transcriptomes are dynamic, with each cell type, tissue and organ system expressing an ensemble of transcript isoforms that give rise to substantial diversity. We identified new genes, transcripts, and proteins using poly(A)+ RNA sequence from Drosophila melanogaster cultured cell lines, dissected organ systems, and environmental perturbations. We found that a small set of mostly neural-specific genes has the potential to encode thousands of transcripts each through extensive alternative promoter usage and RNA splicing. The magnitudes of splicing changes are larger between tissues than between developmental stages, and most sex-specific splicing is gonad-specific. Gonads express hundreds of previously unknown coding and long noncoding RNAs (lncRNAs), some of which are antisense to protein-coding genes and produce short regulatory RNAs. Furthermore, previously identified pervasive intergenic transcription occurs primarily within newly identified introns. The fly transcriptome is substantially more complex than previously recognized, arising from combinatorial usage of promoters, splice sites, and polyadenylation sites.
Here we characterize the expression of the full system of genes which control the segmentation morphogenetic field of Drosophila at the protein level in one dimension. The data used for this characterization are quantitative with cellular resolution in space and about 6 min in time. We present the full quantitative profiles of all 14 segmentation genes which act before the onset of gastrulation. The expression patterns of these genes are first characterized in terms of their average or typical behavior. At this level, the expression of all of the genes has been integrated into a single atlas of gene expression in which the expression levels of all genes in each cell are specified. We show that expression domains do not arise synchronously, but rather each domain has its own specific dynamics of formation. Moreover, we show that the expression domains shift position in the direction of the cephalic furrow, such that domains in the anlage of the segmented germ band shift anteriorly while those in the presumptive head shift posteriorly. The expression atlas of integrated data is very close to the expression profiles of individual embryos during the latter part of the blastoderm stage. At earlier times gap gene domains show considerable variation in amplitude, and significant positional variability. Nevertheless, an average early gap domain is close to that of a median individual. In contrast, we show that there is a diversity of developmental trajectories among pair-rule genes at a variety of levels, including the order of domain formation and positional accuracy. We further show that this variation is dynamically reduced, or canalized, over time. As the first quantitatively characterized morphogenetic field, this system and its behavior constitute an extraordinarily rich set of materials for the study of canalization and embryonic regulation at the molecular level.
Drosophila melanogaster cell lines are important resources for cell biologists. Here, we catalog the expression of exons, genes, and unannotated transcriptional signals for 25 lines. Unannotated transcription is substantial (typically 19% of euchromatic signal). Conservatively, we identify 1405 novel transcribed regions; 684 of these appear to be new exons of neighboring, often distant, genes. Sixty-four percent of genes are expressed detectably in at least one line, but only 21% are detected in all lines. Each cell line expresses, on average, 5885 genes, including a common set of 3109. Expression levels vary over several orders of magnitude. Major signaling pathways are well represented: most differentiation pathways are “off” and survival/growth pathways “on.” Roughly 50% of the genes expressed by each line are not part of the common set, and these show considerable individuality. Thirty-one percent are expressed at a higher level in at least one cell line than in any single developmental stage, suggesting that each line is enriched for genes characteristic of small sets of cells. Most remarkable is that imaginal disc-derived lines can generally be assigned, on the basis of expression, to small territories within developing discs. These mappings reveal unexpected stability of even fine-grained spatial determination. No two cell lines show identical transcription factor expression. We conclude that each line has retained features of an individual founder cell superimposed on a common “cell line” gene expression pattern.
Since the initial annotation of miRNAs from cloned short RNAs by the Ambros, Tuschl, and Bartel groups in 2001, more than a hundred studies have sought to identify additional miRNAs in various species. We report here a meta-analysis of short RNA data from Drosophila melanogaster, aggregating published libraries with 76 data sets that we generated for the modENCODE project. In total, we began with more than 1 billion raw reads from 187 libraries comprising diverse developmental stages, specific tissue- and cell-types, mutant conditions, and/or Argonaute immunoprecipitations. We elucidated several features of known miRNA loci, including multiple phased byproducts of cropping and dicing, abundant alternative 5′ termini of certain miRNAs, frequent 3′ untemplated additions, and potential editing events. We also identified 49 novel genomic locations of miRNA production, and 61 additional candidate loci with limited evidence for miRNA biogenesis. Although these loci broaden the Drosophila miRNA catalog, this work supports the notion that a restricted set of cellular transcripts is competent to be specifically processed by the Drosha/Dicer-1 pathway. Unexpectedly, we detected miRNA production from coding and untranslated regions of mRNAs and found the phenomenon of miRNA production from the antisense strand of known loci to be common. Altogether, this study lays a comprehensive foundation for the study of miRNA diversity and evolution in a complex animal model.
Elucidation of interactions involving DNA and histone post-translational modifications (PTMs) is essential for providing insights into complex biological functions. Reader assemblies connected by flexible linkages facilitate avidity and increase affinity; however, little is known about the contribution to the recognition process of multiple PTMs because of rigidity in the absence of conformational flexibility. Here, we resolve the crystal structure of the triple reader module (PHD-BRD-PWWP) of ZMYND8, which forms a stable unit capable of simultaneously recognizing multiple histone PTMs while presenting a charged platform for association with DNA. Single domain disruptions destroy the functional network of interactions initiated by ZMYND8, impairing recruitment to sites of DNA damage. Our data establish a proof of principle that rigidity can be compensated by concomitant DNA and histone PTM interactions, maintaining multivalent engagement of transient chromatin states. Thus, our findings demonstrate an important role for rigid multivalent reader modules in nucleosome binding and chromatin function.
A major objective of systems biology is to organize molecular interactions as networks and to characterize information-flow within networks. We describe a computational framework to integrate protein-protein interaction (PPI) networks and genetic screens to predict the “signs” of interactions (i.e. activation/inhibition relationships). We constructed a Drosophila melanogaster signed PPI network, consisting of 6,125 signed PPIs connecting 3,352 proteins that can be used to identify positive and negative regulators of signaling pathways and protein complexes. We identified an unexpected role for the metabolic enzymes Enolase and Aldo-keto reductase as positive and negative regulators of proteolysis, respectively. Characterization of the activation/inhibition relationships between physically interacting proteins within signaling pathways will impact our understanding of many biological functions, including signal transduction and mechanisms of disease.