Background
A generally accepted approach to the analysis of RNA-Seq read count data does not yet exist. We sequenced the mRNA of 726 individuals from the Drosophila Genetic Reference Panel in order to quantify differences in gene expression among single flies. One of our experimental goals was to identify the optimal analysis approach for detecting differential gene expression among the factors we varied in the experiment: genotype, environment, sex, and their interactions. Here we evaluate three filtering strategies, eight normalization methods, and two statistical approaches using our data set. We assessed differential gene expression among factors and performed a statistical power analysis using the eight biological replicates per genotype, environment, and sex in our data set.

Results
We found that the most critical considerations for the analysis of RNA-Seq read count data were the normalization method, the underlying data distribution assumption, and the number of biological replicates, an observation consistent with previous RNA-Seq and microarray analysis comparisons. Some common normalization methods, such as Total Count, Quantile, and RPKM normalization, did not align the data across samples. Furthermore, analyses using the Median, Quantile, and Trimmed Mean of M-values normalization methods were sensitive to the removal of low-expressed genes from the data set. Although it is robust in many types of analysis, the normal data distribution assumption produced results vastly different from those obtained under the negative binomial distribution. In addition, at least three biological replicates per condition were required in order to have sufficient statistical power to detect expression differences among the three-way interaction of genotype, environment, and sex.

Conclusions
The best analysis approach for our data was to normalize the read counts using the DESeq method and apply a generalized linear model assuming a negative binomial distribution, using either edgeR or DESeq software. Genes having very low read counts were removed after normalizing the data and fitting it to the negative binomial distribution. We describe the results of this evaluation and include recommended analysis strategies for RNA-Seq read count data.

Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2353-z) contains supplementary material, which is available to authorized users.
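The abstract recommends DESeq normalization but does not spell out the calculation. As a rough illustration only, the sketch below shows the median-of-ratios idea commonly associated with DESeq size factors: each sample's counts are compared with a per-gene geometric mean, and the median ratio becomes that sample's scaling factor. The function names `size_factors` and `normalize` and the toy counts are hypothetical, not taken from the article, and the article's actual pipeline used the DESeq and edgeR software rather than code like this.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, not the DESeq code).

    counts: list of genes, each a list of raw read counts, one per sample.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    # Use only genes with a nonzero count in every sample, so that the
    # log geometric mean across samples is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has nonzero counts in every sample")
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the gene's geometric mean.
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide every raw count by its sample's size factor."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Toy data (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts)
```

In this toy case the second size factor is twice the first, so the first three genes end up with equal normalized values in both samples. Because the factor is a median over genes, a handful of very highly expressed genes has less influence on it than on a total-count scaling.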
Background
Variability of the VRN1 promoter region was investigated in a unique collection of spring polyploid and wild diploid wheat species, together with diploid goatgrasses (donors of the B and D genomes of polyploid wheats). Accessions of wild diploid (T. boeoticum, T. urartu) and tetraploid (T. araraticum, T. timopheevii) species were studied for the first time.

Results
Sequence analysis indicated great variability in the region from nucleotide positions -62 to -221 of the VRN1 promoter. Different indels were found within this region in spring wheats. The VRN1 promoter region of the B and G genomes was also shown to contain damage such as the insertion of a transposable element. Some transcription factor recognition sites, including a hybrid C/G-box for the TaFDL2 protein, known as an upregulator of the VRN1 gene, were predicted inside the variable region. Deletions leading to promoter damage were shown to have occurred independently in diploid and polyploid species. DNA transposon insertions first occurred in polyploid species. At the same time, duplication of the promoter region was observed in the A genomes of polyploid species.

Conclusions
We conclude that the supposed molecular mechanism of VRN1 gene activation in the cultivated diploid wheat species T. monococcum is also common to wild T. boeoticum and was inherited by T. monococcum. The spring polyploids are not related in their origin to spring diploids. The spring T. urartu and goatgrass accessions have another mechanism of flowering activation that is not connected with indels in the VRN1 promoter region. The data obtained may be useful for detailed insight into the origin of spring wheat forms during evolution and domestication.