Animal transcriptomes are dynamic, with each cell type, tissue, and organ system expressing an ensemble of transcript isoforms that give rise to substantial diversity. We identified new genes, transcripts, and proteins using poly(A)+ RNA sequence from Drosophila melanogaster cultured cell lines, dissected organ systems, and environmental perturbations. We found that a small set of mostly neural-specific genes has the potential to encode thousands of transcripts each through extensive alternative promoter usage and RNA splicing. The magnitudes of splicing changes are larger between tissues than between developmental stages, and most sex-specific splicing is gonad-specific. Gonads express hundreds of previously unknown coding and long noncoding RNAs (lncRNAs), some of which are antisense to protein-coding genes and produce short regulatory RNAs. Furthermore, previously identified pervasive intergenic transcription occurs primarily within newly identified introns. The fly transcriptome is substantially more complex than previously recognized, arising from combinatorial usage of promoters, splice sites, and polyadenylation sites.
Black phosphorus consists of stacked layers of phosphorene, a two-dimensional semiconductor with promising device characteristics. We report the realization of a widely tunable band gap in few-layer black phosphorus doped with potassium using an in situ surface doping technique. Through band structure measurements and calculations, we demonstrate that a vertical electric field from dopants modulates the band gap, owing to the giant Stark effect, and tunes the material from a moderate-gap semiconductor to a band-inverted semimetal. At the critical field of this band inversion, the material becomes a Dirac semimetal with anisotropic dispersion, linear in armchair and quadratic in zigzag directions. The tunable band structure of black phosphorus may allow great flexibility in design and optimization of electronic and optoelectronic devices.
Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.
Genome sequences for most metazoans and plants are incomplete because of the presence of repeated DNA in the heterochromatin. The heterochromatic regions of Drosophila melanogaster contain 20 million bases (Mb) of sequence amenable to mapping, sequence assembly, and finishing. We describe the generation of 15 Mb of finished or improved heterochromatic sequence with the use of available clone resources and assembly methods. We also constructed a bacterial artificial chromosome-based physical map that spans 13 Mb of the pericentromeric heterochromatin and a cytogenetic map that positions 11 Mb in specific chromosomal locations. We have approached a complete assembly and mapping of the nonsatellite component of Drosophila heterochromatin. The strategy we describe is also applicable to generating substantially more information about heterochromatin in other species, including humans.
Heterochromatin is a major component of metazoan and plant genomes (e.g., ~20% of the human genome) that regulates chromosome segregation, nuclear organization, and gene expression (1-4). A thorough description of the sequence and organization of heterochromatin is necessary for understanding the essential functions encoded within this region of the genome. However, difficulties in cloning, mapping, and assembling regions rich in repetitive elements have hindered the genomic analysis of heterochromatin (5-7). The fruit fly Drosophila melanogaster is a model for heterochromatin studies. About one-third of the genome is considered heterochromatic and is concentrated in the pericentromeric and telomeric regions of the chromosomes (X, 2, 3, 4, and Y) (5,8). The heterochromatin contains tandemly repeated simple sequences (including satellite DNAs) (9), middle repetitive elements [such as transposable elements (TEs) and ribosomal DNA], and some single-copy DNA (10).
To develop a catalog of regulatory sites in two major model organisms, Drosophila melanogaster and Caenorhabditis elegans, the modERN (model organism Encyclopedia of Regulatory Networks) consortium has systematically assayed the binding sites of transcription factors (TFs). Combined with data produced by our predecessor, modENCODE (Model Organism ENCyclopedia Of DNA Elements), we now have data for 262 TFs identifying 1.23 M sites in the fly genome and 217 TFs identifying 0.67 M sites in the worm genome. Because sites from different TFs are often overlapping and tightly clustered, they fall into 91,011 and 59,150 regions in the fly and worm, respectively, and these binding sites span as little as 8.7 and 5.8 Mb in the two organisms. Clusters with large numbers of sites (so-called high occupancy target, or HOT regions) predominantly associate with broadly expressed genes, whereas clusters containing sites from just a few factors are associated with genes expressed in tissue-specific patterns. All of the strains expressing GFP-tagged TFs are available at the stock centers, and the chromatin immunoprecipitation sequencing data are available through the ENCODE Data Coordinating Center and also through a simple interface (http://epic.gs.washington.edu/modERN/) that facilitates rapid accessibility of processed data sets. These data will facilitate a vast number of scientific inquiries into the function of individual TFs in key developmental, metabolic, and defense and homeostatic regulatory pathways, as well as provide a broader perspective on how individual TFs work together in local networks and globally across the life spans of these two key model organisms.
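The collapse of many overlapping TF binding sites into a smaller number of regions can be illustrated with a simple interval merge. The sketch below is only an illustration and is not the consortium's pipeline: the function names `merge_sites` and `total_span`, the `max_gap` parameter, and the example coordinates are all hypothetical, and sites are assumed to be half-open `(chromosome, start, end)` intervals.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed representation.
Site = Tuple[str, int, int]


def merge_sites(sites: List[Site], max_gap: int = 0) -> List[Site]:
    """Merge sites on the same chromosome that overlap or lie within max_gap bases."""
    regions: List[Site] = []
    # Sorting by (chromosome, start, end) puts candidates for merging next to each other.
    for chrom, start, end in sorted(sites):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2] + max_gap:
            last_chrom, last_start, last_end = regions[-1]
            # Extend the current region to cover the new site.
            regions[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            # No overlap with the current region: start a new one.
            regions.append((chrom, start, end))
    return regions


def total_span(regions: List[Site]) -> int:
    """Total number of bases covered by a list of non-overlapping regions."""
    return sum(end - start for _, start, end in regions)


# Hypothetical sites from several TFs, for illustration only.
example_sites = [
    ("2L", 100, 200),
    ("2L", 150, 260),
    ("2L", 500, 550),
    ("3R", 120, 180),
]
example_regions = merge_sites(example_sites)
example_span = total_span(example_regions)
```

Counting the merged regions and summing their lengths gives the two kinds of figures quoted in the abstract (number of regions and bases spanned); how the consortium actually defined cluster boundaries is not stated here, so `max_gap` is purely an illustrative knob.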
Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.
CHARMM-GUI, http://www.charmm-gui.org, is a web-based graphical user interface that prepares complex biomolecular systems for molecular simulations. CHARMM-GUI creates input files for a number of programs including CHARMM, NAMD, GROMACS, AMBER, GENESIS, LAMMPS, Desmond, OpenMM, and CHARMM/OpenMM. Since its original development in 2006, CHARMM-GUI has been widely adopted for various purposes and now contains a number of different modules designed to set up a broad range of simulations: (1) PDB Reader & Manipulator, Glycan Reader, and Ligand Reader & Modeler for reading and modifying molecules; (2) Quick MD Simulator, Membrane Builder, Nanodisc Builder, HMMM Builder, Monolayer Builder, Micelle Builder, and Hex Phase Builder for building all-atom simulation systems in various environments; (3) PACE CG Builder and Martini Maker for building coarse-grained simulation systems; (4) DEER Facilitator and MDFF/xMDFF Utilizer for experimentally guided simulations; (5) Implicit Solvent Modeler, PBEQ-Solver, and GCMC/BD Ion Simulator for implicit solvent related calculations; (6) Ligand Binder for ligand solvation and binding free energy simulations; and (7) Drude Prepper for preparation of simulations with the CHARMM Drude polarizable force field. Recently, new modules have been integrated into CHARMM-GUI, such as Glycolipid Modeler for generation of various glycolipid structures, and LPS Modeler for generation of lipopolysaccharide structures from various Gram-negative bacteria. These new features together with existing modules are expected to facilitate advanced molecular modeling and simulation thereby leading to an improved understanding of the molecular details of the structure and dynamics of complex biomolecular systems. Here, we briefly review these capabilities and discuss potential future directions in the CHARMM-GUI development project.
The availability of sequenced genomes from 12 Drosophila species has enabled the use of comparative genomics for the systematic discovery of functional elements conserved within this genus. We have developed quantitative metrics for the evolutionary signatures specific to protein-coding regions and applied them genome-wide, resulting in 1193 candidate new protein-coding exons in the D. melanogaster genome. We have reviewed these predictions by manual curation and validated a subset by directed cDNA screening and sequencing, revealing both new genes and new alternative splice forms of known genes. We also used these evolutionary signatures to evaluate existing gene annotations, resulting in the validation of 87% of genes lacking descriptive names and identifying 414 poorly conserved genes that are likely to be spurious predictions, noncoding, or species-specific genes. Furthermore, our methods suggest a variety of refinements to hundreds of existing gene models, such as modifications to translation start codons and exon splice boundaries. Finally, we performed directed genome-wide searches for unusual protein-coding structures, discovering 149 possible examples of stop codon readthrough, 125 new candidate ORFs of polycistronic mRNAs, and several candidate translational frameshifts. These results affect >10% of annotated fly genes and demonstrate the power of comparative genomics to enhance our understanding of genome organization, even in a model organism as intensively studied as Drosophila melanogaster.