Manipulating the alpha level cannot cure significance testing – comments on "Redefine statistical significance"

Trafimow, David; Amrhein, Valentin; Areshenkoff, Corson N.; Barrera-Causil, Carlos; Beh, Eric J.; Bilgiç, Yusuf; Bono, Roser; Bradley, Michael T.; Briggs, William M.; Cepeda-Freyre, Héctor A.; Chaigneau, Sergio E.; Ciocca, Daniel R.; Correa, Juan C.; Cousineau, Denis; Boer, Michiel R. de; Dhar, Subhra Sankar; Dolgov, Igor; Gómez‐Benito, Juana; Grendár, Marian; Grice, James W.; Guerrero-Gimenez, Martin E.; Gutiérrez, Andrés; Huedo‐Medina, Tania B.; Jaffé, Klaus; Janyan, Armina; Karimnezhad, Ali; Korner‐Nievergelt, Fränzi; Kosugi, Koji; Lachmair, Martin; Ledesma, Rubén Daniel; Limongi, Roberto; Liuzza, Marco Tullio; Lombardo, Rosaria; Marks, Michael J.; Meinlschmidt, Gunther; Nalborczyk, Ladislas; Nguyen, Hung T.; Ospina, Raydonal; Perezgonzalez, J; Pfister, Roland; Rahona, Juan José; Rodríguez-Medina, David Alberto; Romão, Xavier; Ruiz-Fernández, Susana; Suarez, Isabel; Tegethoff, Marion; Tejo, Mauricio; Schoot, Rens van de; Vankov, Ivan; Velasco-Forero, Santiago; Wang, Tonghui; Yamada, Yuki; Zoppino, Felipe Carlos Martín; Marmolejo‐Ramos, Fernando

doi:10.7287/peerj.preprints.3411v1

Cited by 4 publications

(10 citation statements)

References 23 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Statistics is often more misleading than useful (17,18). Comparing the randomly generated results with the empirically obtained frequency distribution produces probability values of rejecting the null hypothesis close to absolute 0.…”

Section: Methodsmentioning

confidence: 97%

Quantifying the prevalence of assortative mating in a human population

Jaffé

2019

Preprint

Self Cite

View full text Add to dashboard Cite

For the first time, empirical evidence allowed to construct the frequency distribution an index related to the degree of genetic relatedness between the parents from about 500000 humans living in the UK. The results show that a large proportion of the population is not the product of parents choosing a mate randomly. Assortative mating leading to offspring, that occurs between genetic related individuals, is very common. High degrees of genetic relatedness, i.e. extreme inbreeding, is avoided. The evidence shows that assortative is highly prevalent in this large population sample. This result suggests that assuming random mating, as widely done in population genetic studies, is not rational.

show abstract

Section: Methodsmentioning

confidence: 97%

Quantifying the prevalence of assortative mating in a human population

Jaffé

2019

Preprint

Self Cite

View full text Add to dashboard Cite

show abstract

“…“I find it especially troubling”—she continues—“to spoze an error statistician…ought to use a Bayes Factor as the future gold standard for measuring his error statistical tool…even though Bayes Factors don’t control or measure error probabilities” (2017b). Furthermore, Mayo pinpoints the (old) fallacy of transposing the conditional, whereby the (error) probability of a test is confused with the (posterior) probability of a belief (also Trafimow et al , 2017 ). And despite “60 years (sic) old…demonstrations [showing] that with reasonable tests and reasonable prior probabilities, the disparity vanishes…they still mean different things” (2017c).…”

Section: Counter-argumentsmentioning

confidence: 99%

Retract p < 0.005 and propose using JASP, instead

Perezgonzalez

Frías-Navarro

2018

F1000Res

Self Cite

View full text Add to dashboard Cite

Seeking to address the lack of research reproducibility in science, including psychology and the life sciences, a pragmatic solution has been raised recently: to use a stricter < 0.005 standard for statistical significance when p claiming evidence of new discoveries. Notwithstanding its potential impact, the proposal has motivated a large mass of authors to dispute it from different philosophical and methodological angles. This article reflects on the original argument and the consequent counterarguments, and concludes with a simpler and better-suited alternative that the authors of the proposal knew about and, perhaps, should have made from their Jeffresian perspective: to use a Bayes factors analysis in parallel (e.g., via JASP) in order to learn more about frequentist error statistics and about Bayesian prior and posterior beliefs without having to mix inconsistent research philosophies. Keywords ArgumentSeeking to address the lack of research reproducibility due to the high rate of false positives in the literature, Benjamin et al. (2017a); Benjamin et al. (2017b) propose a pragmatic solution which "aligns with the training undertaken by many researchers, and might quickly achieve broad acceptance" (also Savehn, 2017): to use a stricter p < 0.005 standard for statistical significance when claiming evidence of new discoveries.The proposal is subject to several constrains in its application: (1) to claims of discovery of new effects (thus, not necessarily to replication studies); (2) when using null hypothesis significance testing (arguably Fisher's approach, perhaps even Neyman-Pearson's, but excluding other p-value-generating approaches such as resampling); (3) in fields with too flexible standards (namely 5% or above); (4) when the prior odds of alternative-to-null hypothesis is in the range 1-to-5, to 1-to-40 (stricter standards are required with lower odds); (5) for researcher's consumption (thus, not a standard for journal rejection, although "journals can help transition to the new statistical significance threshold"; also, "journals editors and funding institutions could easily enforce the proposal", Wagenmakers, 2017; and "its implementation only requires journal editors to agree on the new threshold", Machery, 2017); (6) while still keeping findings with probability up to 5% as suggestive (and meriting publication if so "properly labelled"); (7) despite many of the proponents believing that the proposal is nonsense, anyway (that is, it is a quick fix, not a credible one; also Ioannidis in Easwaran, 2017; Resnick, 2017; Wagenmakers, 2017; Wagenmakers & Gronau, 2017). Amendments from Version 1Minor changes incorporating reviewers' recommendations:The legend in Figure 1 now defines the acronyms in the figure.• [2] A new reference to Perezgonzalez (2015) now implies that the pseudoscientific label attached to the NHST element ( Figure 1) follows from the rhetoric in such reference.• [3] A second note clarifies that JASP also allows to use Cauchy, Normal and t-distributions as informed prio...

show abstract

“…Lakens et al (2017) add lack of experimental redundancy, logical traps, research opacity, and poor accounting of sources of error, as well as the risks of reduced generalisability and research breadth were Benjamin et al ’s proposal to succeed. Methodological concerns were also raised by Amrhein & Greenland (2017); Black (2017); Byrd (2017); Chapman (2017); Crane (2017); Ferreira & Henderson (2017); Greenland (2017); Hamlin (2017); Kong (2017); Lew (2017); Llewelyn (2017); Martin (2017); McShane et al (2017); Passin (2017); Steltenpohl (2017); Trafimow et al (2017); Young (2017); Zollman (2017); and Morey (2017). Some researchers even propose the use of preregistration as a way of minimizing above problems ( Hamlin, 2017; Llewelyn, 2017; van der Zee, 2017)…”

Section: Counter-argumentsmentioning

confidence: 99%

“…Addressing replication directly, Chapman (2017) and Trafimow et al (2017) point out that the problem with replication is not too many false positives but insufficient power. Krueger (2017; also McShane et al , 2017, Trafimow et al , 2017) chides Benjamin et al for the incoherence of considering replication as order-dependent and inverting the exploratory-confirmatory nature of replication by proposing to make the former more difficult to achieve and the latter more liberal.…”

Section: Counter-argumentsmentioning

confidence: 99%

“…Krueger (2017; also McShane et al , 2017, Trafimow et al , 2017) chides Benjamin et al for the incoherence of considering replication as order-dependent and inverting the exploratory-confirmatory nature of replication by proposing to make the former more difficult to achieve and the latter more liberal. He further chides them on whether they would disallow their own past findings at the 5% threshold.…”

Section: Counter-argumentsmentioning

confidence: 99%

See 1 more Smart Citation