Comparison of small n statistical tests of differential expression applied to microarrays

Murie, Carl; Woody, Owen Z.; Lee, Anna Y.; Nadon, Robert

doi:10.1186/1471-2105-10-45

Cited by 72 publications

(58 citation statements)

References 43 publications

(65 reference statements)


“…Some of the microarray studies we looked at have small sample sizes, which gives rise to the possibility of poor random error estimates and inaccurate statistical tests for differential expression. For this reason, we selected limma t-statistics, an empirical Bayes method [52], which is reportedly one of the most effective methods for differential expression analysis even for very small data sets [53]. To find the combined significance of the pathways across multiple diseases, we used Fisher’s combined probability test [39], because it gives a single test of significance for a number of not-so-correlated tests of significance performed on very heterogeneous data sets.…”

Section: Discussion (mentioning)

confidence: 99%

Integrative analysis of genetic data sets reveals a shared innate immune component in autism spectrum disorder and its co-morbidities

et al. 2016


Background: Autism spectrum disorder (ASD) is a common neurodevelopmental disorder that tends to co-occur with other diseases, including asthma, inflammatory bowel disease, infections, cerebral palsy, dilated cardiomyopathy, muscular dystrophy, and schizophrenia. However, the molecular basis of this co-occurrence, and whether it is due to a shared component that influences both pathophysiology and environmental triggering of illness, has not been elucidated. To address this, we deploy a three-tiered transcriptomic meta-analysis that functions at the gene, pathway, and disease levels across ASD and its co-morbidities. Results: Our analysis reveals a novel shared innate immune component between ASD and all but three of its co-morbidities that were examined. In particular, we find that the Toll-like receptor signaling and the chemokine signaling pathways, which are key pathways in the innate immune response, have the highest shared statistical significance. Moreover, the disease genes that overlap these two innate immunity pathways can be used to classify the cases of ASD and its co-morbidities vs. controls with at least 70% accuracy. Conclusions: This finding suggests that a neuropsychiatric condition and the majority of its non-brain-related co-morbidities share a dysregulated signal that serves as not only a common genetic basis for the diseases but also as a link to environmental triggers. It also raises the possibility that treatment and/or prophylaxis used for disorders of innate immunity may be successfully used for ASD patients with immune-related phenotypes. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-1084-z) contains supplementary material, which is available to authorized users.
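The citation statement for this entry names Fisher's combined probability test as the way per-disease significance values were merged. As a rough illustration only, and not the citing authors' code, the test can be sketched as follows. The function name `fisher_combined_pvalue` is hypothetical, and the sketch assumes the input p-values come from independent tests. It uses the closed form of the chi-square survival function for an even number of degrees of freedom, so only the standard library is needed.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method (illustrative sketch).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(p_values).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For even degrees of freedom 2k the chi-square survival function is
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Two moderately significant results combine into a stronger one.
    print(fisher_combined_pvalue([0.05, 0.05]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.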


“…Following normalization, average signal intensity of the probes showing at least 30% change in expression across the 3 donors were computed and ratios (treated/untreated) were log2 transformed. Statistical analysis of the data was performed using Cyber-T regularized t statistic [14] due to small sample size (n = 3) since it takes into account Bayesian estimate of variance by pooling across genes with similar intensities [15]. The functional annotation tool (DAVID Bioinformatics Resources 6.7) was used to determine the biological relevance of the data and molecular functions represented by differentially regulated genes [16], enabling us to explore and clarify the biological process, by considering a p-value ≤0.05 as significant.…”

Section: Microarray Expression Data Analysis (mentioning)

confidence: 99%

Gene Expression Profiling of Human c-Kit Mutant D816V

Sharma, Gangenahalli 2016

JCT


The tyrosine kinase receptor III, c-Kit/stem cell factor receptor and its ligand, human stem cell factor (huSCF) are the predominant regulators of mitogenesis in the hematopoietic stem and progenitor cells. However, gain-of-function mutations alter c-Kit auto-regulatory mechanisms to aberrant c-Kit signaling, leading to the onset or progression of cancerous transformations. The most common mutation of c-Kit is the substitution of aspartic acid residue in position 816 to valine (D816V), which is majorly responsible for its ligand-independent constitutive activation, and is implicated in hematopoietic malignancies. Currently, molecular targeted therapy is increasingly becoming a hot spot due to its specificity and low toxicity. As the molecular mechanisms responsible for D816V-c-Kit mediated tumorigenicity are largely unknown, in this study, we aimed to in-
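The citation statement for this entry describes log2-transforming treated/untreated ratios and then applying a regularized t statistic for n = 3. The following is a simplified sketch of that idea, not the Cyber-T software itself. It assumes a one-sample test on the log ratios, and it takes the background variance (`prior_variance`) and its weight (`prior_df`) as given inputs, whereas the method described in the statement estimates the background variance by pooling genes of similar intensity. The function names, parameter names, and the variance formula shown in the comment are assumptions for illustration.

```python
import math
from statistics import mean, variance


def log2_ratios(treated, untreated):
    """Per-donor log2(treated / untreated) ratios for one probe."""
    return [math.log2(t / u) for t, u in zip(treated, untreated)]


def regularized_t(log_ratios, prior_variance, prior_df):
    """One-sample t statistic with a variance shrunk toward a prior value.

    Assumed form of the regularized variance:
        (prior_df * prior_variance + (n - 1) * s^2) / (prior_df + n - 2)
    where s^2 is the ordinary sample variance of the log ratios.
    """
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two replicates")
    denom = prior_df + n - 2
    if denom <= 0:
        raise ValueError("prior_df + n - 2 must be positive")

    s2 = variance(log_ratios)  # unbiased sample variance
    reg_var = (prior_df * prior_variance + (n - 1) * s2) / denom
    return mean(log_ratios) / math.sqrt(reg_var / n)


if __name__ == "__main__":
    # Hypothetical intensities for one probe across three donors.
    ratios = log2_ratios([200.0, 400.0, 800.0], [100.0, 100.0, 100.0])
    print(ratios, regularized_t(ratios, prior_variance=1.0, prior_df=10))
```

Shrinking toward a background variance keeps a probe with an accidentally tiny sample variance from producing an inflated statistic, which is the main concern with only three replicates.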


“…More empirical alternatives include the use of re-sampling methods (to compare genes from small subsets of samples and those from the full dataset) [3], [19], and the use of spike-in data for which a set of genes are differentially expressed by design [12], [20]. Finally, Jeffery et al. [18] explore an indirect approach by assessing classification performance obtained with genes resulting from the application of the methods to compare.…”

Section: Introduction (mentioning)

confidence: 99%

Should We Abandon the t-Test in the Analysis of Gene Expression Microarray Data: A Comparison of Variance Modeling Strategies

et al. 2010


High-throughput post-genomic studies are now routinely and promisingly investigated in biological and biomedical research. The main statistical approach to select genes differentially expressed between two groups is to apply a t-test, which is the subject of criticism in the literature. Numerous alternatives have been developed based on different and innovative variance modeling strategies. However, a critical issue is that selecting a different test usually leads to a different gene list. In this context and given the current tendency to apply the t-test, identifying the most efficient approach in practice remains crucial. To help answer this question, we conduct a comparison of eight tests representative of variance modeling strategies in gene expression data: Welch's t-test, ANOVA [1], Wilcoxon's test, SAM [2], RVM [3], limma [4], VarMixt [5] and SMVar [6]. Our comparison process relies on four steps (gene list analysis, simulations, spike-in data and re-sampling) to formulate comprehensive and robust conclusions about test performance, in terms of statistical power, false-positive rate, execution time and ease of use. Our results raise concerns about the ability of some methods to control the expected number of false positives at a desirable level. In addition, two tests (limma and VarMixt) show significant improvement compared to the t-test, in particular when dealing with small sample sizes. Because limma also presents several practical advantages, we advocate its application to analyze gene expression data.
