2018
DOI: 10.1525/collabra.108

Testing Significance Testing

Abstract: The practice of Significance Testing (ST) remains widespread in psychological science despite continual criticism of its flaws and abuses. Using simulation experiments, we address four concerns about ST and for two of these we compare ST's performance with prominent alternatives. We find the following: First, the p values delivered by ST predict the posterior probability of the tested hypothesis well under many research conditions. Second, low p values support inductive inferences because they are most likely …

Citations: cited by 8 publications (7 citation statements)
References: 65 publications
“…We used non-parametric tests for the hypothesis tests when assumptions for parametric tests were not met, and, where appropriate, performed Bayesian hypothesis evaluation. All parametric, nonparametric, and Bayesian analyses supported the same substantive conclusions, a congruence we anticipated (Krueger & Heck, 2018).…”
Section: Results (supporting)
confidence: 68%
“…Both p values and Bayes factors can be calculated from the t-statistic and the sample size, so it is expected that they would be related. In these simulations, there was a near-perfect linear relationship between the (log of the) Bayes factors and the (log of the) p values, as has been shown previously (Benjamin et al, 2017; Krueger & Heck, 2018; Wetzels et al, 2011). Equivalency in AUCs between Bayes factors and p values generalized to other scenarios as well including one-sample t-tests and correlations (see Figure 8).…”
Section: Bayes Factor >3 (supporting)
confidence: 79%
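
The point in the statement above, that both quantities are functions of the t-statistic and the sample size, can be illustrated with a minimal sketch. It assumes a one-sample t-test and uses the BIC approximation to the Bayes factor, which is a stand-in chosen for simplicity and not necessarily the Bayes factor used in the citing paper or in Krueger & Heck (2018). The function names (`t_pdf`, `two_sided_p`, `bic_bf10`) and the sample size n = 50 are hypothetical choices for illustration.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p value via Simpson's rule on the density over [0, |t|].

    Adequate for moderate |t|; very small p values lose precision here.
    """
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def bic_bf10(t: float, n: int) -> float:
    """BIC approximation to the Bayes factor for H1 over H0 (assumed form,
    one-sample t-test): BF10 = (1 + t^2 / (n - 1))^(n / 2) / sqrt(n)."""
    return (1.0 + t * t / (n - 1)) ** (n / 2) / math.sqrt(n)


if __name__ == "__main__":
    n = 50  # hypothetical sample size
    for t in (1.0, 2.0, 3.0, 4.0):
        p = two_sided_p(t, n - 1)
        bf = bic_bf10(t, n)
        print(f"t={t:.1f}  log10(p)={math.log10(p):7.3f}  "
              f"log10(BF10)={math.log10(bf):7.3f}")
```

For a fixed n, both functions are monotone in |t|, so a smaller p value always pairs with a larger approximate Bayes factor; how close the log-log relationship is to linear depends on the range of t and on the Bayes factor actually used.
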
“…Put another way, the discriminability of p values and Bayes factors are high in situations for which real effects are likely and in situations for which real effects are unlikely. Obviously, more p values and Bayes factors reach thresholds for significance when there are more significant effects, so "significant" effects are more for 'safe' studies than 'risky' studies (Krueger & Heck, 2018). Nevertheless, the diagnosticity of the p value (and of Bayes factor) is high regardless of the likelihood of finding a real effect.…”
Section: Equation (mentioning)
confidence: 99%
“…We used two approaches to hypothesis testing (Primoceri et al, 2021) with the general expectation that they would yield converging conclusions (Krueger & Heck, 2018).…”
Section: Results (mentioning)
confidence: 99%