Blood cells derive from hematopoietic stem cells through stepwise fating events. To characterize the gene expression programs driving lineage choice, we sequenced RNA from eight primary human hematopoietic progenitor populations representing the major myeloid commitment stages and the main lymphoid stage. We identify extensive cell-type-specific expression changes affecting 6,711 genes and 10,724 transcripts, enriched in non-protein-coding elements at early stages of differentiation. In addition, we discovered 7,881 novel splice junctions and 2,301 differentially used alternative splicing events, enriched in genes involved in regulatory processes. We experimentally demonstrate cell-specific isoform usage, identifying NFIB as a regulator of the maturation of megakaryocytes, the platelet precursors. Our data highlight the complexity of fating events in closely related progenitor populations, the understanding of which is essential for the advancement of transplantation and regenerative medicine.
Chronic lymphocytic leukemia (CLL) is a frequent hematological neoplasm in which underlying epigenetic alterations are only partially understood. Here, we analyze the reference epigenome of seven primary CLLs and the regulatory chromatin landscape of 107 primary cases in the context of normal B cell differentiation. We find that the CLL chromatin landscape is largely influenced by distinct dynamics during normal B cell maturation. Beyond this, we define extensive catalogues of regulatory elements de novo reprogrammed in CLL as a whole and in its major clinico-biological subtypes classified by IGHV somatic hypermutation levels. We uncover that IGHV-unmutated CLLs harbor more active and open chromatin than IGHV-mutated cases. Furthermore, we show that de novo active regions in CLL are enriched for NFAT, FOX and TCF/LEF transcription factor family binding sites. Although most genetic alterations are not associated with consistent epigenetic profiles, CLLs with MYD88 mutations and trisomy 12 show distinct chromatin configurations. In addition, we observe that non-coding mutations in IGHV-mutated CLLs are enriched in H3K27ac-associated regulatory elements outside accessible chromatin. Overall, this study provides an integrative portrait of the CLL epigenome, identifies extensive networks of altered regulatory elements and sheds light on the relationship between the genetic and epigenetic architecture of the disease.
The subcellular localization of long noncoding RNAs (lncRNAs) holds valuable clues to their molecular function. However, measuring localization of newly discovered lncRNAs involves time-consuming and costly experimental methods. We have created "lncATLAS," a comprehensive resource of lncRNA localization in human cells based on RNA-sequencing data sets. Altogether, 6768 GENCODE-annotated lncRNAs are represented across various compartments of 15 cell lines. We introduce relative concentration index (RCI) as a useful measure of localization derived from ensemble RNA-seq measurements. LncATLAS is accessible through an intuitive and informative webserver, from which lncRNAs of interest are accessed using identifiers or names. Localization is presented across cell types and organelles, and may be compared to the distribution of all other genes. Publication-quality figures and raw data tables are automatically generated with each query, and the entire data set is also available to download. LncATLAS makes lncRNA subcellular localization data available to the widest possible number of researchers. It is available at lncatlas.crg.eu.
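The abstract above names the relative concentration index (RCI) but does not give its formula. As a minimal sketch, assuming RCI is the log2 ratio of a gene's expression (for example FPKM) between two compartments, it could be computed as follows. The function name, the argument names and the optional pseudocount are illustrative assumptions, not part of lncATLAS.

```python
import math

def relative_concentration_index(expr_a, expr_b, pseudocount=0.0):
    """Assumed RCI: log2 ratio of expression in compartment A over compartment B.

    Returns None when either value is non-positive after adding the
    pseudocount, since the log ratio is undefined in that case.
    """
    a = expr_a + pseudocount
    b = expr_b + pseudocount
    if a <= 0 or b <= 0:
        return None
    return math.log2(a / b)

# Hypothetical values: a transcript with 8 FPKM in the cytoplasm and 2 in the nucleus
cyto_vs_nuc = relative_concentration_index(8.0, 2.0)
```

Under this assumption, positive values indicate enrichment in the first compartment and negative values enrichment in the second.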
We present ggsashimi, a command-line tool for the visualization of splicing events across multiple samples. Given a specified genomic region, ggsashimi creates sashimi plots for individual RNA-seq experiments as well as aggregated plots for groups of experiments, a feature unique to this software. Compared to existing programs that generate sashimi plots, it uses popular bioinformatics file formats, is annotation-independent, and allows the visualization of splicing events even for large genomic regions by scaling down the genomic segments between splice sites. ggsashimi is freely available at https://github.com/guigolab/ggsashimi. It is implemented in Python and internally generates R code for plotting.
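The idea of scaling down the segments between splice sites can be illustrated with a small coordinate transformation. This is only a sketch of the general technique, not ggsashimi's own code: the function `shrink_gaps`, its parameters and the default values are hypothetical, and the input intervals are assumed not to overlap.

```python
def shrink_gaps(intervals, factor=10.0, min_gap=1.0):
    """Lay out (start, end) intervals on a compressed axis.

    Interval lengths are preserved; each gap between consecutive
    intervals is divided by `factor`, but never drawn narrower than
    `min_gap`. Assumes the intervals do not overlap.
    """
    layout = []
    cursor = 0.0
    prev_end = None
    for start, end in sorted(intervals):
        if prev_end is not None:
            gap = max(start - prev_end, 0)
            if gap > 0:
                cursor += max(gap / factor, min_gap)
        length = end - start
        layout.append((cursor, cursor + length))
        cursor += length
        prev_end = end
    return layout

# Two 100 bp exons separated by a 1,000 bp intron, drawn with a 10x shrink
compressed = shrink_gaps([(100, 200), (1200, 1300)], factor=10.0)
```

Keeping the features at full width while compressing only the gaps is what lets splice junctions stay legible when the plotted region spans many kilobases.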
Long noncoding RNAs (lncRNAs) represent a vast unexplored genetic space that may hold missing drivers of tumourigenesis, but few such “driver lncRNAs” are known. Until now, they have been discovered through changes in expression, leading to problems in distinguishing between causative roles and passenger effects. Here we present a different approach for driver lncRNA discovery using mutational patterns in tumour DNA. Our pipeline, ExInAtor, identifies genes with an excess load of somatic single nucleotide variants (SNVs) across panels of tumour genomes. Heterogeneity in mutational signatures between cancer types and individuals is accounted for using a simple local trinucleotide background model, which yields high precision and low computational demands. We use ExInAtor to predict drivers from the GENCODE annotation across 1112 entire genomes from 23 cancer types. Using a stratified approach, we identify 15 high-confidence candidates: 9 novel and 6 known cancer-related genes, including MALAT1, NEAT1 and SAMMSON. Both known and novel driver lncRNAs are distinguished by elevated gene length, evolutionary conservation and expression. We have presented a first catalogue of mutated lncRNA genes driving cancer, which will grow and improve with the application of ExInAtor to future tumour genome projects.
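The abstract does not specify the statistical test ExInAtor uses, so the following is only a sketch of the general idea of a trinucleotide-matched background. It assumes per-trinucleotide mutation rates estimated from a local background region, and it substitutes a plain Poisson upper-tail probability for whatever test the pipeline actually applies. All function and argument names are hypothetical.

```python
import math

def expected_mutations(region_trinuc_sites, bg_trinuc_sites, bg_trinuc_mutations):
    """Expected SNV count in a region under trinucleotide-matched background rates.

    Each argument maps a trinucleotide (e.g. "ACG") to a count: sites in
    the tested region, sites in the background, mutations in the background.
    """
    expected = 0.0
    for tri, n_sites in region_trinuc_sites.items():
        bg_sites = bg_trinuc_sites.get(tri, 0)
        if bg_sites == 0:
            continue  # no background information for this context
        rate = bg_trinuc_mutations.get(tri, 0) / bg_sites
        expected += n_sites * rate
    return expected

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected); a stand-in test, by assumption."""
    if observed <= 0:
        return 1.0
    cdf = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - cdf)
```

Matching on trinucleotide context means a gene rich in highly mutable contexts is compared against a correspondingly higher expectation, which is the property the abstract attributes to the background model.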
CRISPR-Cas9 technology can be used to engineer precise genomic deletions with pairs of single guide RNAs (sgRNAs). This approach has been widely adopted for diverse applications, from disease modelling of individual loci, to parallelized loss-of-function screens of thousands of regulatory elements. However, no solution has been presented for the unique bioinformatic design requirements of CRISPR deletion. Here we present CRISPETa, a pipeline for flexible and scalable paired sgRNA design based on an empirical scoring model. Multiple sgRNA pairs are returned for each target, and any number of targets can be analyzed in parallel, making CRISPETa equally useful for focussed or high-throughput studies. Fast run-times are achieved using a pre-computed off-target database. sgRNA pair designs are output in a convenient format for visualisation and oligonucleotide ordering. We present pre-designed, high-coverage library designs for entire classes of protein-coding and non-coding elements in human, mouse, zebrafish, Drosophila melanogaster and Caenorhabditis elegans. In human cells, we reproducibly observe deletion efficiencies of ≥50% for CRISPETa designs targeting an enhancer and an exonic fragment of the MALAT1 oncogene. In the latter case, deletion results in production of the desired truncated RNA. CRISPETa will be useful for researchers seeking to harness CRISPR for targeted genomic deletion, in a variety of model organisms, from single-target to high-throughput scales.
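Paired sgRNA design for a deletion can be sketched as combining candidates from the two flanks of the target and ranking the combinations. The example below assumes each candidate already carries a numeric score and ranks pairs by the sum of the two scores. Both the summed-score rule and the function name `rank_sgrna_pairs` are illustrative assumptions, not CRISPETa's empirical scoring model.

```python
from itertools import product

def rank_sgrna_pairs(upstream, downstream, top_n=3):
    """Return the top_n (upstream_seq, downstream_seq, pair_score) tuples.

    `upstream` and `downstream` are lists of (sequence, score) candidates
    flanking the region to delete. The pair score is assumed to be the
    sum of the two individual scores.
    """
    pairs = [
        (u_seq, d_seq, u_score + d_score)
        for (u_seq, u_score), (d_seq, d_score) in product(upstream, downstream)
    ]
    pairs.sort(key=lambda pair: pair[2], reverse=True)
    return pairs[:top_n]

# Hypothetical candidates; real protospacers would be 20 nt sequences
best = rank_sgrna_pairs(
    [("AAA", 0.9), ("CCC", 0.5)],
    [("GGG", 0.8), ("TTT", 0.2)],
    top_n=2,
)
```

Returning several ranked pairs per target, rather than a single best pair, mirrors the abstract's point that multiple sgRNA pairs are returned for each target.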