2018
DOI: 10.1101/458786
Preprint

A practical guide to methods controlling false discoveries in computational biology

Abstract: Background: In high-throughput studies, hundreds to millions of hypotheses are typically tested. Statistical methods that control the false discovery rate (FDR) have emerged as popular and powerful tools for error rate control. While classic FDR methods use only p-values as input, more modern FDR methods have been shown to increase power by incorporating complementary information as "informative covariates" to prioritize, weight, and group hypotheses. However, there is currently no consensus on how the modern …
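
The contrast the abstract draws between classic and covariate-aware FDR methods can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure as the "classic" method and a weighted variant of it as the simplest way to use side information. Neither is taken from the preprint's own code; the function names and the fixed weights are hypothetical, and methods such as IHW learn the weights from a covariate rather than taking them as given.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Classic BH step-up: return True for each hypothesis rejected at FDR level alpha."""
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def weighted_benjamini_hochberg(pvalues, weights, alpha=0.05):
    """Weighted BH (illustrative): divide each p-value by its weight, then apply BH.

    Assumption: weights are positive and average to 1, so hypotheses judged
    more promising from a covariate get weights above 1.
    """
    if len(pvalues) != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    mean_w = sum(weights) / len(weights)
    if abs(mean_w - 1.0) > 1e-9:
        raise ValueError("weights must average to 1")
    adjusted = [min(1.0, p / w) for p, w in zip(pvalues, weights)]
    return benjamini_hochberg(adjusted, alpha)


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.5]
    print(benjamini_hochberg(p))                                 # p-values only
    print(weighted_benjamini_hochberg(p, [1.0, 2.0, 0.5, 0.5]))  # with hypothetical weights
```

With these toy inputs, up-weighting the second hypothesis lets it pass a threshold it misses under the unweighted procedure, which is the sense in which an informative covariate can increase power.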

Cited by 70 publications (110 citation statements)
References 68 publications (56 reference statements)
“…Further refinements to the results can be obtained by activating independent filtering (Bourgon et al, 2010), or selecting the more powerful Independent Hypothesis Weighting (IHW) framework (Ignatiadis et al, 2016), to ameliorate the multiple testing issue by incorporating an informative covariate, e.g. the mean gene expression (Korthauer et al, 2019). Shrinkage of the effect sizes is also optionally performed on the log fold change estimates, to reflect the higher levels of uncertainty for lowly expressed genes.…”
Section: Generating and Exploring The Results For Differential Expres (mentioning)
confidence: 99%
“…On the other hand, the permutation procedure incorporated in many gene set tests has been shown to be biased [45], and inaccurate if permutation p-values are reported as zero [46]. Recent studies also reported nonuniform p-value distribution that are either systematically biased towards zero (false positive inflation) or one (false negative inflation) [47,48]. These shortcomings can lead to inappropriately small or large fractions of significant gene sets, and can considerably impair prioritization of gene sets in practice.…”
Section: Discussion (mentioning)
confidence: 99%
“…We considered a subset of microbiome data from the Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA), where samples were acquired from monitoring wells in a site contaminated by former waste disposal ponds and all sampled wells have various geochemical and physical measurements [17,18]. Following the original study, we performed two experiments to test for correlations between the operational taxonomic units (OTUs) and the pH, Al respectively.…”
Section: Microbiome Data (mentioning)
confidence: 99%
“…For the auditory experiment, the Brodmann areas corresponding to auditory cortices, namely 41,42,22, are among areas where the alternative hypotheses are most likely to occur. For the tennis imagination experiment, multiple cortices seem to respond to this stimulus, including auditory cortex (42), visual cortices (18,19), and motor cortices (4,6,7).…”
Section: fMRI Data (mentioning)
confidence: 99%