2018
DOI: 10.7287/peerj.preprints.3411
Preprint

Manipulating the alpha level cannot cure significance testing

Abstract: We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = .05 to .005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools…
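The abstract's point about study design and sample size can be made concrete with a quick power calculation. The sketch below is not from the preprint; it is a normal-approximation calculation for a two-sided two-sample test, with an assumed effect size (Cohen's d = 0.4) and target power (0.80) chosen purely for illustration, showing how much larger the required sample becomes when the threshold moves from .05 to .005.

# Illustrative normal-approximation power calculation (assumed effect size and
# target power; not taken from the preprint).
from scipy.stats import norm

def n_per_group(alpha, power=0.80, d=0.4):
    """Approximate per-group n for a two-sided two-sample test."""
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the chosen threshold
    z_beta = norm.ppf(power)           # quantile matching the desired power
    return 2 * ((z_alpha + z_beta) / d) ** 2

for alpha in (0.05, 0.005):
    print(f"alpha = {alpha}: about {n_per_group(alpha):.0f} participants per group")

# Output: roughly 98 per group at alpha = .05 versus roughly 166 at alpha = .005,
# i.e. the stricter threshold demands a substantially larger study for the same power.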

Cited by 5 publications (8 citation statements)
References 32 publications
“…Although this may reflect a high degree of experimental variation (Livak and Schmittgen, 2001), it could also reflect inherent biological variation. For simplicity, and in line with increasing calls for inferential statistics to be abandoned (McShane et al., 2017; Trafimow et al., 2018), we presented the sqRT-PCR data without statistical significance testing.…”
Section: Discussion (mentioning)
confidence: 99%
“…In this case, we can refer to a number of current recommendations from the literature (Ioannidis, 2018). For example, Benjamin et al. (2018) recommended that the usual threshold for statistical significance be generally lowered from 5% to 0.5%, which has been dismissed by others (e.g., Trafimow et al., 2018). In other publications, it was suggested not to use p values anymore but rely on alternative methods instead.…”
Section: The Reproducibility Crisis: P Values and Significance Thresholds (mentioning)
confidence: 99%
“…Dichotomization in conjunction with misleading terminology propagate cognitive biases that seduce researchers to make logically inconsistent and overconfident inferences, both when p is below and when it is above the "significance" threshold. The following errors seem to be particularly widespread:
1) use of p-values when there is neither random sampling nor randomization
2) confusion of statistical and practical significance or complete neglect of effect size
3) unwarranted binary statements of there being an effect as opposed to no effect, coming along with
- misinterpretations of p-values below 0.05 as posterior probabilities of the null hypothesis
- mixing up of estimating and testing and misinterpretation of "significant" results as evidence confirming the coefficients/effect sizes estimated from a single sample
- treatment of "statistically non-significant" effects as being zero (confirmation of the null)
4) inflation of evidence caused by unconsidered multiple comparisons and p-hacking
5) inflation of effect sizes caused by considering "significant" results only
Footnote 1: See, for example, McCloskey and Ziliak (1996), Sellke et al. (2001), Ioannidis (2005), Ziliak and McCloskey (2008), Krämer (2011), Ioannidis and Doucouliagos (2013), Kline (2013), Colquhoun (2014), Gelman and Loken (2014), Motulsky (2014), Vogt et al. (2014), Gigerenzer and Marewski (2015), Greenland et al. (2016), Hirschauer et al. (2016; 2018), Wasserstein and Lazar (2016), Ziliak (2016), Amrhein et al. (2017), and Trafimow et al. (2018). This list contains but a small selection of the literature on p-value misconceptions from the last 20 years.…”
Section: Introduction (mentioning)
confidence: 99%
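Point 5 in the excerpt above, effect-size inflation from looking only at "significant" results, is easy to reproduce in a simulation. The following sketch is an illustration under assumed values (true Cohen's d = 0.2, n = 30 per group, 10,000 simulated studies); it is not code from any of the cited papers.

# Illustrative simulation of effect-size inflation under a significance filter
# (all parameter values are assumptions for this sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d, n, n_studies = 0.2, 30, 10_000

estimates, pvalues = [], []
for _ in range(n_studies):
    a = rng.normal(0.0, 1.0, n)      # control group
    b = rng.normal(true_d, 1.0, n)   # treatment group with a small true effect
    _, p = stats.ttest_ind(b, a)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    estimates.append((b.mean() - a.mean()) / pooled_sd)
    pvalues.append(p)

estimates, pvalues = np.array(estimates), np.array(pvalues)
print("mean estimated d, all studies:       ", round(estimates.mean(), 2))
print("mean estimated d, 'significant' only:", round(estimates[pvalues < 0.05].mean(), 2))

# Because n = 30 per group is badly underpowered for d = 0.2, the average of the
# "significant" estimates lands well above the true value, illustrating the bias.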
“…While Berry (2017: 896) might be pushing too hard when claiming that a p-value "as such has no inferential content," one must recognize that it is but a graded measure of the strength of evidence against the null, but only in the sense that small p-values will occur more often if there is an effect compared to no effect (Hirschauer et al. 2018). Joining Berry (2017), Gelman and Carlin (2017), Greenland (2017), McShane and Gal (2017), McShane et al. (2017), Trafimow et al. (2018), and many others, we believe that degrading the p-value's continuous message into binary "significance" declarations ("bright line rules") is at the heart of the problem. Since the p-value is deeply anchored in the minds of most scientists including economists, we believe that demanding drastic changes, such as renouncing p-values or replacing frequentist approaches by Bayesian methods, is not the most promising way to guard researchers from the inferential errors that we see today.…”
Section: Introduction (mentioning)
confidence: 99%
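The excerpt's description of the p-value as a graded measure, with small p-values occurring more often under a true effect than under the null, can be illustrated with a small simulation. The effect size, sample size, and cutoffs below are assumptions chosen for the sketch, not values from the cited papers.

# Illustrative comparison of p-value behaviour under the null and under a true
# effect (assumed d = 0.4, n = 50 per group).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 50, 20_000

def simulated_pvalues(true_d):
    out = np.empty(reps)
    for i in range(reps):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_d, 1.0, n)
        out[i] = stats.ttest_ind(b, a).pvalue
    return out

p_null, p_effect = simulated_pvalues(0.0), simulated_pvalues(0.4)
for cut in (0.05, 0.01, 0.001):
    print(f"share of p < {cut}: null {np.mean(p_null < cut):.3f}, "
          f"true effect {np.mean(p_effect < cut):.3f}")

# Under the null the p-value is roughly uniform, so the share below each cutoff
# is close to the cutoff itself; under a true effect small p-values are far more
# common, and the change is gradual rather than something that flips at .05.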