2015
DOI: 10.1002/smj.2459

Scientific apophenia in strategic management research: Significance tests & mistaken inference

Abstract: Research summary: This article uses distributional matching and posterior predictive checks to estimate the extent of false and inflated findings in empirical research on strategic management. Based on a sample of 300 papers in top outlets for research on strategic management, we estimate that if each study were repeated, 24–40 percent of significant coefficients would become insignificant at the five percent level. Our best guess is that for about half of these, the true coefficient is very close to 0. The re…
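The abstract's headline estimate — that a sizable share of significant coefficients would turn insignificant on exact replication — can be illustrated with a toy simulation. This is a hypothetical sketch, not the authors' distributional-matching method: the 50/50 mixture of null and true effects, the N(2, 1) effect distribution, and the unit-normal sampling noise are all illustrative assumptions.

```python
# Illustrative sketch (assumed numbers, not the article's method): how often a
# "significant" estimate fails to reach p < 0.05 in an independent replication.
import numpy as np

rng = np.random.default_rng(0)
n_studies = 100_000
# Assumption: half of tested effects are truly null, half are drawn from N(2, 1).
true_effect = np.where(rng.random(n_studies) < 0.5,
                       0.0,
                       rng.normal(2.0, 1.0, n_studies))
z1 = true_effect + rng.normal(size=n_studies)   # original study's z-statistic
z2 = true_effect + rng.normal(size=n_studies)   # independent replication
sig1 = np.abs(z1) > 1.96                        # significant at the 5% level
flip = np.mean(np.abs(z2[sig1]) <= 1.96)        # significant -> insignificant
print(f"Share of significant results that fail to replicate: {flip:.0%}")
```

Under these particular assumptions the flip rate lands in the same rough region the abstract reports, driven by two forces: significant nulls almost always fail to replicate, and selecting on |z| > 1.96 inflates the original estimates of real effects.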

Cited by 119 publications (109 citation statements)
References 19 publications (39 reference statements)
“…They found that average effect sizes were considerably smaller than originally reported. Closer to home, Goldfarb and King (2016) assessed a sample of 300 published studies. They estimated that 24-40% of the studies could not be replicated.…”
Section: Research Rigor Revisited
confidence: 99%
“…Despite its many flaws, null hypothesis significance testing (NHST) continues to be the choice of researchers in management and organization studies (Bettis, Ethiraj, Gambardella, Helfat, & Mitchell, 2016; Meyer et al., 2017). In NHST, the tenability of a null hypothesis (i.e., no effect or relation) is primarily judged based on the observed p value associated with the test of the hypothesis, and values smaller than 0.05 are often judged as providing sufficient evidence to reject it (Bettis et al., 2016; Goldfarb & King, 2016). Of the many problems associated with this interpretation of p values, the most pernicious is that it motivates researchers to engage in a practice called “p-hacking” and to report “crippled” p values (see below) (Aguinis, Werner, Abbott, Angert, Park, & Kohlhausen, 2010; Banks, Rogelberg et al., 2016).…”
Section: Reporting of p Values
confidence: 99%
“…For example, consider a researcher who interprets p = 0.0499 as sufficient evidence for rejecting the null hypothesis, and p = 0.0510 as evidence that the null hypothesis should be retained, and believes that journals are more likely to look favorably on rejected null hypotheses. This researcher will be highly motivated to “p-hack,” that is, find some way, such as adding control variables or eliminating outliers, to reduce the p value below the 0.05 threshold (Aguinis et al., 2010; Goldfarb & King, 2016; Starbuck, 2016; Waldman & Lilienfeld, 2016). Similarly, this researcher will be motivated to report p values using cutoffs (e.g., p < 0.05), rather than report the actual p value (0.0510).…”
Section: Reporting of p Values
confidence: 99%
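The specification-search incentive described in the quote above can be made concrete with a toy simulation (a hypothetical sketch, not drawn from the article or the citing papers): an analyst who tries ten unrelated null outcome variables and "reports" the first that clears p < 0.05 faces a realized false-positive rate far above the nominal 5 percent.

```python
# Hypothetical sketch: inflation of the false-positive rate under a
# specification search, when every tested relationship is truly null.
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_outcomes, n = 5_000, 10, 50
crit = 1.96  # two-sided 5% critical value (normal approximation to the t-test)

hits = 0
for _ in range(n_trials):
    x = rng.normal(size=n)
    # Try up to 10 null outcomes; "report" the first significant correlation.
    for _ in range(n_outcomes):
        y = rng.normal(size=n)
        r = np.corrcoef(x, y)[0, 1]
        z = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)  # t-statistic, t ~ z at n = 50
        if abs(z) > crit:
            hits += 1
            break
rate = hits / n_trials
print(f"Nominal 5% test, realized false-positive rate after search: {rate:.0%}")
```

With ten independent tries at a roughly 5 percent per-test error rate, the chance of at least one "finding" is about 1 − 0.95^10 ≈ 40 percent, which is what the simulation recovers.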
“…Based on 300 articles in prominent strategic management journals, Goldfarb and King (2015) estimated conservatively that about 25-40% of the published claims of statistical significance are actually false. Such audits strongly suggest that researchers or editors do not publish studies that report null-findings (Kepes et al, 2012).…”
Section: Three Important Types of Little Lies
confidence: 99%