SUMMARY Methylation of the N6 position of adenosine (m6A) is a post-transcriptional modification of RNA whose prevalence and physiological relevance are poorly understood. The recent discovery that FTO, an obesity risk gene, encodes an m6A demethylase implicates m6A as an important regulator of physiological processes. Here we present a method for transcriptome-wide m6A localization, which combines m6A-specific methylated RNA immunoprecipitation with next-generation sequencing (MeRIP-Seq). We use this method to identify mRNAs of 7,676 mammalian genes that contain m6A, indicating that m6A is a common base modification of mRNA. The m6A modification exhibits tissue-specific regulation and is markedly increased throughout brain development. We find that m6A sites are enriched near stop codons and in 3' UTRs, and we uncover an association between m6A residues and microRNA binding sites within 3' UTRs. These findings provide a resource for identifying transcripts that are substrates for adenosine methylation and reveal insights into the epigenetic regulation of the mammalian transcriptome.
Protein coding genes constitute only approximately 1% of the human genome but harbor 85% of the mutations with large effects on disease-related traits. Therefore, efficient strategies for selectively sequencing complete coding regions (i.e., "whole exome") have the potential to contribute to the understanding of rare and common human diseases. Here we report a method for whole-exome sequencing coupling Roche/NimbleGen whole exome arrays to the Illumina DNA sequencing platform. We demonstrate the ability to capture approximately 95% of the targeted coding sequences with high sensitivity and specificity for detection of homozygous and heterozygous variants. We illustrate the utility of this approach by making an unanticipated genetic diagnosis of congenital chloride diarrhea in a patient referred with a suspected diagnosis of Bartter syndrome, a renal salt-wasting disease. The molecular diagnosis was based on the finding of a homozygous missense D652N mutation at a position in SLC26A3 (the known congenital chloride diarrhea locus) that is virtually completely conserved in orthologues and paralogues from invertebrates to humans, and clinical follow-up confirmed the diagnosis. To our knowledge, whole-exome (or genome) sequencing has not previously been used to make a genetic diagnosis. Five additional patients suspected to have Bartter syndrome but who did not have mutations in known genes for this disease had homozygous deleterious mutations in SLC26A3. These results demonstrate the clinical utility of whole-exome sequencing and have implications for disease gene discovery and clinical diagnosis.
Bartter syndrome | congenital chloride diarrhea | next-generation sequencing | whole-exome sequencing | personal genomes
Genetic variation plays a major role in both Mendelian and non-Mendelian diseases.
Among the approximately 2,600 Mendelian diseases that have been solved, the overwhelming majority are caused by rare mutations that affect the function of individual proteins; at individual Mendelian loci, approximately 85% of the disease-causing mutations can typically be found in the coding region or in canonical splice sites (1). For complex traits, genome-wide association studies have identified more than 250 common variants associated with risk alleles that contribute to a wide range of diseases (2, 3). To date, most of these impart small effects on disease risk (e.g., odds ratio of 1.2); moreover, even when extremely large studies have been performed, the vast majority of the genetic contribution to disease risk remains unexplained (4-6). These findings suggest that individually rare variants with relatively large effect may account for a large fraction of this missing trait variance. Indeed, studies addressing this question have documented the presence of individually rare variants with relatively large effect (7, 8). Consistent with the Mendelian model, coding variants have proven to be prevalent sources of such rare variants. These considerations motivate implementation of robust approaches to sequencing complete c...
A large number of computational methods have been developed for analyzing differential gene expression in RNA-seq data. We describe a comprehensive evaluation of common methods using the SEQC benchmark dataset and ENCODE data. We consider a number of key features, including normalization, accuracy of differential expression detection and differential expression analysis when one condition has no detectable expression. We find significant differences among the methods, but note that array-based methods adapted to RNA-seq data perform comparably to methods designed for RNA-seq. Our results demonstrate that increasing the number of replicate samples significantly improves detection power over increased sequencing depth.
We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the United States Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for these and for qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10 Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.
Comparative studies of gene regulation suggest an important role for natural selection in shaping gene expression patterns within and between species. Most of these studies, however, estimated gene expression levels using microarray probes designed to hybridize to only a small proportion of each gene. Here, we used recently developed RNA sequencing protocols, which sidestep this limitation, to assess intra- and interspecies variation in gene regulatory processes in considerably more detail than was previously possible. Specifically, we used RNA-seq to study transcript levels in humans, chimpanzees, and rhesus macaques, using liver RNA samples from three males and three females from each species. Our approach allowed us to identify a large number of genes whose expression levels likely evolve under natural selection in primates. These include a subset of genes with conserved sexually dimorphic expression patterns across the three species, which we found to be enriched for genes involved in lipid metabolism. Our data also suggest that while alternative splicing is tightly regulated within and between species, sex-specific and lineage-specific changes in the expression of different splice forms are also frequent. Intriguingly, among genes in which a change in exon usage occurred exclusively in the human lineage, we found an enrichment of genes involved in anatomical structure and morphogenesis, raising the possibility that differences in the regulation of alternative splicing have been an important force in human evolution.
[Supplemental material is available online at http://www.genome.org. The RNA-seq data have been submitted to the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) under series accession no. GSE17274.]
Changes in gene regulation are thought to play an important role in adaptive evolution and speciation (Britten and Davidson 1971; King and Wilson 1975; Jin et al. 2001; Carroll 2003, 2008; Abzhanov et al. 2004; Iftikhar et al. 2004; Shapiro et al. 2004; Taron et al. 2004; Wray 2007). In support of this notion, comparative genome-wide studies of gene regulation within and between populations and species have revealed evidence consistent with the action of both stabilizing and directional selection on gene expression levels (Oleksiak et al. 2002; Lemos et al. 2005; Rifkin et al. 2005; Gilad et al. 2006; Whitehead and Crawford 2006). Most of these studies, however, focused on estimates of overall gene expression levels, probably because prior to the development of next-generation sequencing, it was very challenging to characterize expression level variation of individual exons on a genome-wide scale.
Indeed, previous studies of alternative splicing patterns in mammalian species focused on relatively small numbers of exons and genes. For example, Su et al. (2008) studied variation in exon usage and alternative splicing in liver samples from a number of mouse strains from both sexes, by using a custom microarray designed to probe the expression levels of 25,760 exons and exon-exon junctions f...
Relapsed childhood acute lymphoblastic leukemia (ALL) carries a poor prognosis despite intensive retreatment, due to intrinsic drug resistance (1, 2). The biological pathways that mediate resistance are unknown. Here we report the transcriptome profiles of matched diagnosis and relapse bone marrow specimens from ten pediatric B lymphoblastic leukemia patients using RNA-sequencing. Transcriptome sequencing identified 20 newly acquired novel non-synonymous mutations not present at initial diagnosis, of which two patients harbored relapse-specific mutations in the same gene, NT5C2, a 5′-nucleotidase. Full exon sequencing of NT5C2 was completed in 61 additional relapse specimens, identifying five additional cases. Enzymatic analysis of mutant proteins revealed that base substitutions conferred increased enzymatic activity and resistance to treatment with nucleoside analogue therapies. Clinically, all patients who harbored NT5C2 mutations relapsed early, or within 36 months of initial diagnosis (p=0.03). These results suggest that mutations in NT5C2 are associated with the outgrowth of drug-resistant clones in ALL.
High-throughput RNA sequencing (RNA-seq) dramatically expands the potential for novel genomics discoveries, but the wide variety of platforms, protocols and performance has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We tested replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (polyA-selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies’ PGM and Proton, Pacific Biosciences RS and Roche’s 454). The results show high intra-platform and inter-platform concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. These data also demonstrate that ribosomal RNA depletion can both enable effective analysis of degraded RNA samples and be readily compared to polyA-enriched fractions. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq.