Constructing an atlas of cell types in complex organisms will require a collective effort to characterize billions of individual cells. Single cell RNA sequencing (scRNA-seq) has emerged as the main tool for characterizing cellular diversity, but current methods use custom microfluidics or microwells to compartmentalize single cells, limiting scalability and widespread adoption. Here we present Split Pool Ligation-based Transcriptome sequencing (SPLiT-seq), a scRNA-seq method that labels the cellular origin of RNA through combinatorial indexing. SPLiT-seq is compatible with fixed cells, scales exponentially, uses only basic laboratory equipment, and costs one cent per cell. We used this approach to analyze 109,069 single cell transcriptomes from an entire postnatal day 5 mouse brain, providing the first global snapshot at this stage of development. We identified 13 main populations comprising different types of neurons, glia, immune cells, and endothelia, as well as cell types in the blood-brain barrier. Moreover, we resolved substructure within these clusters corresponding to cells at different stages of development. As sequencing capacity increases, SPLiT-seq will enable profiling of billions of cells in a single experiment.
Over three hundred years have passed since the discovery of the cell, yet we still do not have a complete catalogue of cell types or their functions. While transcriptomic profiling of individual cells has emerged as a promising solution to characterizing cellular diversity (1, 2), increases in throughput are needed before a complete "atlas" of cell types can be generated. Recent single cell RNA-seq (scRNA-seq) methods have profiled tens of thousands of individual cells (3-6), revealing new insights about the immune system (7) and identifying new cell types in the brain (8-11). However, since these methods require cell sorters and custom microfluidics or microwells, throughput is still limited, experiments are costly, and access is limited to a small number of labs.
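The claim that combinatorial indexing "scales exponentially" can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the plate size (96 wells), the number of barcoding rounds, and the helper names `barcode_combinations` and `expected_collision_fraction` are all assumptions chosen for illustration, and the collision estimate assumes every cell receives a uniformly random barcode in each round.

```python
def barcode_combinations(wells_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after repeated split-pool rounds.

    Each round appends one of `wells_per_round` barcodes, so the label space
    grows as wells_per_round ** rounds (exponential in the number of rounds).
    """
    return wells_per_round ** rounds


def expected_collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Expected fraction of cells sharing a full barcode with another cell.

    Assumes (hypothetically) that each cell draws its barcode combination
    uniformly at random and independently of all other cells.
    """
    if n_cells < 1 or n_barcodes < 1:
        raise ValueError("n_cells and n_barcodes must be positive")
    # Probability that none of the other (n_cells - 1) cells picked the same
    # combination, subtracted from 1.
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


if __name__ == "__main__":
    wells = 96  # hypothetical plate size, not a value from the abstract
    for rounds in (1, 2, 3, 4):
        space = barcode_combinations(wells, rounds)
        frac = expected_collision_fraction(10_000, space)
        print(f"rounds={rounds} combinations={space} collision_fraction={frac:.4f}")
```

Under these assumptions, each extra round multiplies the label space by the number of wells, so adding rounds, rather than adding hardware, is what keeps barcode collisions rare as the number of cells grows.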
Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over three million APA reporters, built by inserting random sequence into twelve distinct 3'UTR contexts. Predictions are highly accurate across both synthetic and genomic contexts; when tasked with inferring APA in human 3'UTRs, APARENT outperforms models trained exclusively on endogenous data. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of cleavage site selection, and integrates these features into a comprehensive, interpretable cis-regulatory code. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our