RNA-Seq experiments allow genome-wide estimation of relative gene expression. Estimation of gene expression at different time points generates time expression profiles of phenomena of interest, as for example fruit development. However, such profiles can be complex to analyze and interpret. We developed a methodology that transforms original RNA-Seq data from time course experiments into standardized expression profiles, which can be easily interpreted and analyzed. To exemplify this methodology we used RNA-Seq data obtained from 12 accessions of chili pepper (Capsicum annuum L.) during fruit development. All relevant data, as well as functions to perform analyses and interpretations from this experiment, were gathered into a publicly available R package: “Salsa”. Here we explain the rational of the methodology and exemplify the use of the package to obtain valuable insights into the multidimensional time expression changes that occur during chili pepper fruit development. We hope that this tool will be of interest for researchers studying fruit development in chili pepper as well as in other angiosperms.
Background: Open data sharing is instrumental for the advance of biological sciences. Gene expression is the primary molecular phenotype, usually estimated through RNA-Seq experiments. Large scope interpretation of RNA-Seq results is complicated by the extensive gene expression range, as well as by the diversity of biological sources and experimental treatments. Here we present “Salsa”, an auto-contained R package for extracting useful knowledge about gene expression during the development of chili pepper fruit. Methods and Results: Data from 168 RNA-Seq libraries, comprising more than 3.4 billion reads, were analyzed and curated to represent standardized expression profiles (SEPs) for all genes expressed during fruit development in 12 chili pepper accessions. Accessions have representatives of domesticated varieties, wild ancestors and crosses, covering a broad spectrum of genotypes. Data are organized in a relational way, and functions allow data mining from the level of single genes up to whole genomes, grouping profiles by different criteria. Those include any combination of expression model, accession, protein description and gene ontology (GO) term, among others. Also, GO enrichment analysis can be performed over any set of genes. Conclusions: “Salsa” opens endless possibilities for mining the transcriptome of chili pepper during fruit development.
Gene co-expression networks are powerful tools to understand functional interactions between genes. However, large co-expression networks are difficult to interpret and do not guarantee that the relations found will be true for different genotypes. Statistically verified time expression profiles give information about significant changes in expressions through time, and genes with highly correlated time expression profiles, which are annotated in the same biological process, are likely to be functionally connected. A method to obtain robust networks of functionally related genes will be useful to understand the complexity of the transcriptome, leading to biologically relevant insights. We present an algorithm to construct gene functional networks for genes annotated in a given biological process or other aspects of interest. We assume that there are genome-wide time expression profiles for a set of representative genotypes of the species of interest. The method is based on the correlation of time expression profiles, bound by a set of thresholds that assure both, a given false discovery rate, and the discard of correlation outliers. The novelty of the method consists in that a gene expression relation must be repeatedly found in a given set of independent genotypes to be considered valid. This automatically discards relations particular to specific genotypes, assuring a network robustness, which can be set a priori. Additionally, we present an algorithm to find transcription factors candidates for regulating hub genes within a network. The algorithms are demonstrated with data from a large experiment studying gene expression during the development of the fruit in a diverse set of chili pepper genotypes. The algorithm is implemented and demonstrated in a new version of the publicly available R package “Salsa” (version 1.0).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.