Through alternative processing of pre-mRNAs, individual mammalian genes often produce multiple mRNA and protein isoforms that may have related, distinct or even opposing functions. Here we report an in-depth analysis of 15 diverse human tissue and cell line transcriptomes based on deep sequencing of cDNA fragments, yielding a digital inventory of gene and mRNA isoform expression. Analysis of mappings of sequence reads to exon-exon junctions indicated that 92-94% of human genes undergo alternative splicing (AS), ∼86% with a minor isoform frequency of 15% or more. Differences in isoform-specific read densities indicated that a majority of AS and of alternative cleavage and polyadenylation (APA) events vary between tissues, while variation between individuals was ∼2- to 3-fold less common. Extreme or ‘switch-like’ regulation of splicing between tissues was associated with increased sequence conservation in regulatory regions and with generation of full-length open reading frames. Patterns of AS and APA were strongly correlated across tissues, suggesting coordinated regulation of these processes, and sequence conservation of a subset of known regulatory motifs in both alternative introns and 3′ UTRs suggested common involvement of specific factors in tissue-level regulation of both splicing and polyadenylation.
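As a rough illustration of the "minor isoform frequency" threshold mentioned in the abstract above, the following Python sketch estimates isoform fractions from junction-supporting read counts. The function names, the input layout, and the example counts are hypothetical, and this is not the authors' pipeline. It only shows the arithmetic, assuming junction read counts are roughly proportional to isoform abundance.

```python
def minor_isoform_frequency(junction_reads):
    """Return the fraction of reads supporting the second most abundant isoform.

    junction_reads: dict mapping an isoform label to the number of reads
    that map to its diagnostic exon-exon junction(s) (hypothetical layout).
    """
    counts = list(junction_reads.values())
    total = sum(counts)
    if total == 0:
        raise ValueError("no junction-supporting reads for this event")
    if len(counts) < 2:
        return 0.0  # only one isoform observed, so there is no minor isoform
    fractions = sorted((c / total for c in counts), reverse=True)
    return fractions[1]


def passes_threshold(junction_reads, threshold=0.15):
    """True if the minor isoform reaches the given frequency (15% by default)."""
    return minor_isoform_frequency(junction_reads) >= threshold


# Illustrative counts only, not data from the study.
example = {"inclusion": 80, "skipping": 20}
print(minor_isoform_frequency(example))
print(passes_threshold(example))
```

In practice the counts would need normalization for the number of mappable junction positions and for sequencing depth, which this sketch ignores.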
DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally employed long (400–800 bp) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intra-species genetic variation. We report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterise four million SNPs and four hundred thousand structural variants, many of which are previously unknown. Our approach is effective for accurate, rapid and economical whole genome re-sequencing and many other biomedical applications.
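The ">30x average depth" figure in the abstract above follows from a simple relationship: mean depth is the total number of sequenced bases divided by the genome size. The sketch below works through that arithmetic. The function names are hypothetical and the genome size of about 3.1 billion bases is an assumed round value, not a number taken from the abstract.

```python
def mean_depth(n_reads, read_length, genome_size):
    """Average per-base depth, assuming reads are spread uniformly."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def reads_for_depth(target_depth, read_length, genome_size):
    """Smallest whole number of reads that reaches the target mean depth."""
    total_bases = target_depth * genome_size
    # Integer ceiling division avoids floating-point rounding on large values.
    return -(-total_bases // read_length)


ASSUMED_GENOME_SIZE = 3_100_000_000  # assumption: approximate human genome size
READ_LENGTH = 35                     # read length stated in the abstract

n = reads_for_depth(30, READ_LENGTH, ASSUMED_GENOME_SIZE)
print(n, mean_depth(n, READ_LENGTH, ASSUMED_GENOME_SIZE))
```

This ignores unmappable reads, duplicates, and uneven coverage, so the number of reads actually required would be higher than this estimate.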
The paucity of enzymes that efficiently deconstruct plant polysaccharides represents a major bottleneck for industrial-scale conversion of cellulosic biomass into biofuels. Cow rumen microbes specialize in degradation of cellulosic plant material, but most members of this complex community resist cultivation. To characterize biomass-degrading genes and genomes, we sequenced and analyzed 268 gigabases of metagenomic DNA from microbes adherent to plant fiber incubated in cow rumen. From these data, we identified 27,755 putative carbohydrate-active genes and expressed 90 candidate proteins, of which 57% were enzymatically active against cellulosic substrates. We also assembled 15 uncultured microbial genomes, which were validated by complementary methods including single-cell genome sequencing. These data sets provide a substantially expanded catalog of genes and genomes participating in the deconstruction of cellulosic biomass.
In the last decade, genome-wide transcriptome analyses have been routinely used to monitor tissue-, disease- and cell type-specific gene expression, but it has been technically challenging to generate expression profiles from single cells. Here we describe a novel and robust mRNA-Seq protocol (Smart-Seq) that is applicable down to single cell levels. Compared with existing methods, Smart-Seq has improved read coverage across transcripts, which significantly enhances detailed analyses of alternative transcript isoforms and identification of SNPs. We have determined the sensitivity and quantitative accuracy of Smart-Seq for single-cell transcriptomics by evaluating it on total RNA dilution series. Applying Smart-Seq to circulating tumor cells from melanomas, we identified distinct gene expression patterns, including new candidate biomarkers for melanoma circulating tumor cells. Importantly, our protocol can easily be utilized for addressing fundamental biological problems requiring genome-wide transcriptome profiling in rare cells.
MicroRNAs (miRNAs) are important regulatory molecules in most eukaryotes and identification of their target mRNAs is essential for their functional analysis. Whereas conventional methods rely on computational prediction and subsequent experimental validation of target RNAs, we directly sequenced >28,000,000 signatures from the 5' ends of polyadenylated products of miRNA-mediated mRNA decay, isolated from inflorescence tissue of Arabidopsis thaliana, to discover novel miRNA-target RNA pairs. Within the set of approximately 27,000 transcripts included in the 8,000,000 nonredundant signatures, several previously predicted but nonvalidated targets of miRNAs were found. Like validated targets, most showed a single abundant signature at the miRNA cleavage site, particularly in libraries from a mutant deficient in the 5'-to-3' exonuclease AtXRN4. Although miRNAs in Arabidopsis have been extensively investigated, working in reverse from the cleaved targets resulted in the identification and validation of novel miRNAs. This versatile approach will affect the study of other aspects of RNA processing beyond miRNA-target RNA pairs.
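Two ideas in the abstract above lend themselves to a small sketch: collapsing raw signatures into a nonredundant set with abundances, and checking whether one signature position dominates a transcript, as expected at a miRNA cleavage site. The function names, the input layouts, and the toy data below are hypothetical. This is an illustration of the bookkeeping, not the authors' analysis.

```python
from collections import Counter


def collapse_signatures(signatures):
    """Collapse raw signature sequences into unique sequences with counts."""
    return Counter(signatures)


def top_position_share(position_counts):
    """Return (position, share) for the most abundant 5' end on a transcript.

    position_counts: dict mapping a transcript coordinate to the number of
    signatures whose 5' end maps there (hypothetical layout).
    """
    total = sum(position_counts.values())
    if total == 0:
        raise ValueError("no signatures mapped to this transcript")
    position, count = max(position_counts.items(), key=lambda kv: kv[1])
    return position, count / total


# Toy data for illustration only.
raw = ["ACGTACGT", "ACGTACGT", "TTGACCAA"]
unique = collapse_signatures(raw)
print(len(raw), "raw signatures,", len(unique), "nonredundant")

position, share = top_position_share({101: 2, 102: 46, 103: 2})
print(position, share)
```

A high share at a single position that also aligns with a predicted miRNA complementary site would be consistent with the cleavage pattern the abstract describes; the cutoff for "abundant" is left open here.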
MicroRNAs (miRNAs) are small regulatory RNAs that derive from distinctive hairpin transcripts. To learn more about the miRNAs of mammals, we sequenced 60 million small RNAs from mouse brain, ovary, testes, embryonic stem cells, three embryonic stages, and whole newborns. Analysis of these sequences confirmed 398 annotated miRNA genes and identified 108 novel miRNA genes. More than 150 previously annotated miRNAs and hundreds of candidates failed to yield sequenced RNAs with miRNA-like features. Ectopically expressing these previously proposed miRNA hairpins also did not yield small RNAs, whereas ectopically expressing the confirmed and newly identified hairpins usually did yield small RNAs with the classical miRNA features, including dependence on the Drosha endonuclease for processing. These experiments, which suggest that previous estimates of conserved mammalian miRNAs were inflated, provide a substantially revised list of confidently identified murine miRNAs from which to infer the general features of mammalian miRNAs. Our analyses also revealed new aspects of miRNA biogenesis and modification, including tissue-specific strand preferences, sequential Dicer cleavage of a metazoan precursor miRNA (pre-miRNA), consequential 5′ heterogeneity, newly identified instances of miRNA editing, and evidence for widespread pre-miRNA uridylation reminiscent of miRNA regulation by Lin28. [Keywords: MicroRNA; miRNA biogenesis; noncoding RNA genes; high-throughput sequencing] Supplemental material is available at http://www.genesdev.org.
In metazoans, Piwi-related Argonaute proteins have been linked to germline maintenance, and to a class of germline-enriched small RNAs termed piRNAs. Here we show that an abundant class of 21-nucleotide small RNAs (21U-RNAs) is expressed in the C. elegans germline, interacts with the C. elegans Piwi family member PRG-1, and depends on PRG-1 activity for its accumulation. The PRG-1 protein is expressed throughout development and localizes to nuage-like structures called P granules. Although 21U-RNA loci share a conserved upstream sequence motif, the mature 21U-RNAs are not conserved and, with few exceptions, fail to exhibit complementarity or evidence for direct regulation of other expressed sequences. Our findings demonstrate that 21U-RNAs are the piRNAs of C. elegans and link this class of small RNAs and their associated Piwi Argonaute to the maintenance of temperature-dependent fertility.
We describe a novel sequencing approach that combines non-gel-based signature sequencing with in vitro cloning of millions of templates on separate 5 µm diameter microbeads. After constructing a microbead library of DNA templates by in vitro cloning, we assembled a planar array of a million template-containing microbeads in a flow cell at a density greater than 3 × 10⁶ microbeads/cm². Sequences of the free ends of the cloned templates on each microbead were then simultaneously analyzed using a fluorescence-based signature sequencing method that does not require DNA fragment separation. Signature sequences of 16-20 bases were obtained by repeated cycles of enzymatic cleavage with a type IIs restriction endonuclease, adaptor ligation, and sequence interrogation by encoded hybridization probes. The approach was validated by sequencing over 269,000 signatures from two cDNA libraries constructed from a fully sequenced strain of Saccharomyces cerevisiae, and by measuring gene expression levels in the human cell line THP-1. The approach provides an unprecedented depth of analysis permitting application of powerful statistical techniques for discovery of functional relationships among genes, whether known or unknown beforehand, or whether expressed at high or very low levels.