We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the United States Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10 Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.
RNA-seq facilitates unbiased genome-wide gene-expression profiling. However, its concordance with the well-established microarray platform must be rigorously assessed for confident uses in clinical and regulatory application. Here we use a comprehensive study design to generate Illumina RNA-seq and Affymetrix microarray data from the same set of liver samples of rats under varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOA). The cross-platform concordance in terms of differentially expressed genes (DEGs) or enriched pathways is highly correlated with treatment effect size, gene-expression abundance and the biological complexity of the MOA. RNA-seq outperforms microarray (90% versus 76%) in DEG verification by quantitative PCR and the main gain is its improved accuracy for low expressed genes. Nonetheless, predictive classifiers derived from both platforms performed similarly. Therefore, the endpoint studied and its biological complexity, transcript abundance, and intended application are important factors in transcriptomic research and for decision-making.
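Cross-platform concordance of DEG lists can be quantified in several ways. The following is a minimal sketch using the Jaccard overlap of two gene sets; it is one simple overlap measure and not necessarily the one used in the study. The function name and the example gene lists are hypothetical.

```python
def deg_concordance(degs_a, degs_b):
    """Jaccard overlap between two DEG lists: |A ∩ B| / |A ∪ B|.

    Assumption: concordance is treated as simple set overlap; the
    study above may define it differently.
    """
    a, b = set(degs_a), set(degs_b)
    union = a | b
    if not union:
        # Neither platform called any DEG; report zero overlap.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical DEG calls from the two platforms for one treatment.
rnaseq_degs = ["Cyp1a1", "Cyp1a2", "Nqo1"]
microarray_degs = ["Cyp1a1", "Nqo1", "Gsta2"]
concordance = deg_concordance(rnaseq_degs, microarray_degs)
```

Computing this value per treatment makes it possible to compare it against effect size or expression abundance, as the abstract describes.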
Background: Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. Results: We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. Conclusions: We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0694-1) contains supplementary material, which is available to authorized users.
Gnetophytes are an enigmatic gymnosperm lineage comprising three genera, Gnetum, Welwitschia and Ephedra, which are morphologically distinct from all other seed plants. Their distinctiveness has triggered much debate as to their origin, evolution and phylogenetic placement among seed plants. To increase our understanding of the evolution of gnetophytes, and their relation to other seed plants, we report here a high-quality draft genome sequence for Gnetum montanum, the first for any gnetophyte. By using a novel genome assembly strategy to deal with high levels of heterozygosity, we assembled >4 Gb of sequence encoding 27,491 protein-coding genes. Comparative analysis of the G. montanum genome with other gymnosperm genomes unveiled some remarkable and distinctive genomic features, such as a diverse assemblage of retrotransposons with evidence for elevated frequencies of elimination rather than accumulation, considerable differences in intron architecture, including both length distribution and proportions of (retro) transposon elements, and distinctive patterns of proliferation of functional protein domains. Furthermore, a few gene families showed Gnetum-specific copy number expansions (for example, cellulose synthase) or contractions (for example, Late Embryogenesis Abundant protein), which could be connected with Gnetum's distinctive morphological innovations associated with their adaptation to warm, mesic environments. Overall, the G. montanum genome enables a better resolution of ancestral genomic features within seed plants, and the identification of genomic characters that distinguish Gnetum from other gymnosperms. 
... phylogenetic position of gnetophytes, with topologies differing depending on the type of sequence data (for example, plastid versus nuclear genes, nucleotide versus amino acid data) and analytical approach used (for example, maximum parsimony, maximum likelihood, Bayesian, multispecies coalescent-based methods) [6][7][8]. Consequently, several possible hypotheses have been put forward that place gnetophytes as sister to (1) Pinaceae ('Gnepine' hypothesis); (2) cupressophytes ('Gnecup' hypothesis); (3) all conifers ('Gnetifer' hypothesis); (4) all other gymnosperms; or (5) all seed plants [9]. Currently, the emerging consensus, based on both older and more recent studies, and recently released data from the 1KP initiative (see https://sites.google.com/a/ualberta.ca/onekp/, and Wickett et al. [8]), indicates that gnetophytes are sister to, or within, the conifers. So far, the availability of whole genome sequences for gymnosperms has been limited to conifers (specifically to Pinaceae) [10][11][12][13] and G. biloba [14], with no whole genome assemblies available for the two remaining major seed plant lineages: cycads and gnetophytes. This deficiency, together with the conflicting phylogenetic evidence for relationships among these groups, is impeding our understanding of genome evolution across all seed plants. Here, we present a high-quality draft genome of Gnetum ...
There is a critical need for standard approaches to assess, report and compare the technical performance of genome-scale differential gene expression experiments. Here we assess technical performance with a proposed standard 'dashboard' of metrics derived from analysis of external spike-in RNA control ratio mixtures. These control ratio mixtures with defined abundance ratios enable assessment of diagnostic performance of differentially expressed transcript lists, limit of detection of ratio (LODR) estimates and expression ratio variability and measurement bias. The performance metrics suite is applicable to analysis of a typical experiment, and here we also apply these metrics to evaluate technical performance among laboratories. An interlaboratory study using identical samples shared among 12 laboratories with three different measurement processes demonstrates generally consistent diagnostic power across 11 laboratories. Ratio measurement variability and bias are also comparable among laboratories for the same measurement process. We observe different biases for measurement processes using different mRNA-enrichment protocols.
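To illustrate the ratio bias and variability metrics mentioned above, here is a minimal sketch that compares measured expression ratios of spike-in controls against their nominal mixture ratio on a log2 scale. The function name and the example values are hypothetical, and this is not the dashboard implementation from the study; it only shows the arithmetic behind the two quantities.

```python
import math
from statistics import mean, stdev


def ratio_bias_and_variability(measured_ratios, nominal_ratio):
    """Return (bias, sd) of log2 measured ratios for one control subpool.

    bias: mean log2 measured ratio minus log2 nominal ratio.
    sd:   sample standard deviation of the log2 measured ratios.
    Assumption: all ratios are positive and share one nominal ratio.
    """
    logs = [math.log2(r) for r in measured_ratios]
    bias = mean(logs) - math.log2(nominal_ratio)
    sd = stdev(logs) if len(logs) > 1 else 0.0
    return bias, sd


# Hypothetical measurements for controls mixed at a nominal 4:1 ratio.
bias, sd = ratio_bias_and_variability([3.6, 4.1, 3.9, 4.4], nominal_ratio=4.0)
```

A bias near zero indicates that measured ratios track the designed mixture, while a larger standard deviation indicates noisier ratio measurements.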
Detailed investigations on Lower Cretaceous Ephedra L. fossils (Gnetopsida) reveal morphological characters similar to those of extant Ephedra rhytidosperma Pachomova, including articulate branches with many fine longitudinal striations, a dichasial branching pattern, uni- or bi-ovulate cones with paired bracts, cones terminal on branchlets, and seeds with short, straight micropylar tubes, covered by numerous regular and prominent transverse laminar protuberances. The fossils are similar to extant E. rhytidosperma in reproductive organs but differ in some vegetative structures, and are described and discussed here as Ephedra archaeorhytidosperma Y. Yang et al. Because E. rhytidosperma is currently considered one of the most specialized members in Ephedra L. section Pseudobaccatae Stapf, the occurrence of E. archaeorhytidosperma in the Yixian Formation suggests that Ephedra L. was perhaps a more diverse genus in the Lower Cretaceous. Perhaps the evolution and diversity of Ephedra L. was already in place by the Lower Cretaceous and certainly before the end of the Mesozoic.
Summary: Viral infections cause plant chlorosis, stunting, necrosis or other symptoms. The down-regulation of chloroplast-related genes (ChRGs) is assumed to be responsible for chlorosis. We identified the differentially expressed genes (DEGs) in Rice stripe virus (RSV)-infected Nicotiana benthamiana, and examined the contribution of 75 down-regulated DEGs to RSV symptoms by silencing them one by one using Tobacco rattle virus (TRV)-induced gene silencing. Silencing of 11 of the 75 down-regulated DEGs caused plant chlorosis, and nine of the 11 were ChRGs. Silencing of a down-regulated DEG encoding the eukaryotic translation initiation factor 4A (eIF4A) caused leaf-twisting and stunting that were visible on RSV-infected N. benthamiana. A region of RSV RNA4 was complementary to part of eIF4A mRNA, and virus-derived small interfering RNAs (vsiRNAs) from that region were present in infected N. benthamiana. When expressed as artificial microRNAs, those vsiRNAs could target NbeIF4A mRNA for regulation. We provide experimental evidence supporting the association of ChRGs with chlorosis and show that eIF4A is involved in RSV symptom development. This is also the first report demonstrating that siRNA derived directly from a plant virus can target a host gene for regulation.