Redefine statistical significance

Benjamin, Daniel J.; Berger, James O.; Johannesson, Magnus; Nosek, Brian A.; Wagenmakers, Eric-Jan; Berk, Richard A.; Bollen, Kenneth A.; Brembs, Björn; Brown, Lawrence D.; Camerer, Colin F.; Cesarini, David; Chambers, Christopher D.; Clyde, Merlise A.; Cook, Thomas D.; De Boeck, Paul; Dienes, Zoltán; Dreber, Anna; Easwaran, Kenny; Efferson, Charles; Fehr, Ernst; Fidler, Fiona; Field, Andy P.; Forster, Malcolm R.; George, Edward I.; Gonzalez, Richard; Goodman, Steven N.; Green, Edwin; Green, Donald P.; Greenwald, Anthony G.; Hadfield, Jarrod D.; Hedges, Larry V.; Held, Leonhard; Ho, Teck H.; Hoijtink, Herbert; Hruschka, Daniel J.; Imai, Kosuke; Imbens, Guido W.; Ioannidis, John P. A.; Jeon, Minjeong; Jones, James Holland; Kirchler, Michael; Laibson, David; List, John A.; Little, Roderick J. A.; Lupia, Arthur; Machery, Édouard; Maxwell, Scott E.; McCarthy, Michael A.; Moore, Don A.; Morgan, Stephen L.; Munafò, Marcus R.; Nakagawa, Shinichi; Nyhan, Brendan; Parker, Timothy; Pericchi, Luis R.; Perugini, Marco; Rouder, Jeff; Rousseau, Judith; Savalei, Victoria; Schönbrodt, Felix D.; Sellke, Thomas; Sinclair, Betsy; Tingley, Dustin; Van Zandt, Trisha; Vazire, Simine; Watts, Duncan J.; Winship, Christopher; Wolpert, Robert L.; Xie, Yu; Young, Cristobal; Zinman, Jonathan; Johnson, Valen E.

doi:10.1038/s41562-017-0189-z

Cited by 2,099 publications (1,650 citation statements)

References 15 publications (11 reference statements)

Citation statement types: Supporting; Mentioning (1,590); Contrasting; Unclassified


“…As seen in suggested that the p-value threshold for statistically significant findings should be lowered from 0.05 to 0.005 for new discoveries [30]. In a replication context it would be relevant to apply this stricter threshold to meta-analytic results.…”

Section: -25 (mentioning)

confidence: 99%

Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015

Camerer, Dreber, Holzmeister et al. 2018

Preprint

Self Cite


“…[8][9][10][11][12] If journals start requiring a lower threshold for significance, the power of all experiments will be further reduced, exacerbating the above problems. By way of example, suppose we are conducting a two group experiment with independent samples in each group.…”

Section: Introduction (mentioning)

confidence: 99%

Four simple ways to increase power without increasing the sample size

Lazic 2017

Preprint


Underpowered experiments have three problems: the probability of a false positive result is higher, true effects are harder to detect, and the true effects that are detected tend to have inflated effect sizes. Many biology experiments are underpowered and recent calls to change the traditional 0.05 significance threshold to a more stringent value of 0.005 will further reduce the power of the average experiment. Increasing power by increasing the sample size is often the only option considered, but more samples increases costs, makes the experiment harder to conduct, and is contrary to the 3Rs principles for animal research. We show how the design of an experiment and some analytical decisions can have a surprisingly large effect on power.
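The effect of a stricter threshold on power can be illustrated with a rough calculation. The sketch below is not taken from the cited preprint: it assumes a two-sided, two-sample z-test with equal group sizes and unit variance (a normal approximation to the t-test), and the function name, effect size and group size are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes equal group sizes and unit variance in both groups
    (a normal approximation, not an exact t-test calculation).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)               # two-sided critical value
    shift = effect_size * (n_per_group / 2) ** 0.5  # expected z statistic under H1
    # Probability that the statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical design: standardized effect of 0.5 with 64 samples per group.
power_05 = two_sample_power(0.5, 64, alpha=0.05)
power_005 = two_sample_power(0.5, 64, alpha=0.005)
print(f"alpha=0.05:  power = {power_05:.2f}")
print(f"alpha=0.005: power = {power_005:.2f}")
```

Under these assumptions, the same design drops from roughly 80% power at 0.05 to about 50% at 0.005, which is the kind of loss the abstract describes.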


“…From the perspective of frequentists, P-values only provide information pertaining to whether a null hypothesis about the extremity of an observed distribution can be rejected; they do not say anything about whether and how strongly evidence found from a specific study supports a hypothesis. Furthermore, as the current debates indicated, conventional P-value thresholds widely used in the field, particularly, p < .05, could only support very weak or even could not support the presence of positive evidence (Benjamin et al, 2018). Instead, BFs show us the strength of evidence; directly BF thresholds used in the field can also be considered as better thresholds to make practical decisions about accepting a specific hypothesis based on evidence (Kass & Raftery, 1995).…”

Section: Running Head: Utilizing Bayesian Statistics 18 (mentioning)

confidence: 99%

“…Although recent debates about the frequentist perspective in the field of quantitative methods have intensified concerns regarding how to collect and test data properly (Benjamin et al, 2018), the majority of studies in the fields related to moral education have tended to use such a perspective. We have been used to employing the methodology of frequentist, such as P-values, in empirical studies of moral education.…”

Section: Introduction (mentioning)

confidence: 99%

Why do we need to employ Bayesian statistics and how can we employ it in studies of moral education?: With practical guidelines to use JASP for educators and researchers

Han, Park, Thoma 2018

Journal of Moral Education


In this paper, we discuss the benefits of and how to utilize Bayesian statistics in studies of moral education. To demonstrate concrete examples of the applications of Bayesian statistics to studies of moral education, we reanalyzed two datasets previously collected: one small dataset collected from a moral educational intervention experiment, and one big dataset from a large-scale Defining Issues Test-2 survey. Results suggest that Bayesian analysis of datasets collected from moral educational studies can provide additional useful statistical information, particularly that associated with the strength of evidence supporting alternative hypotheses, which has not been provided by the classical frequentist approach focusing on P-values. Finally, we introduce several practical guidelines pertaining to how to utilize Bayesian statistics, including the utilization of newly developed free statistical software, Jeffrey's Amazing Statistics Program (JASP), and thresholding based on Bayes Factors, to scholars in the field of moral education.
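One way to see why p < .05 is described as weak evidence is to convert a P-value into an upper bound on the Bayes factor. The sketch below is not from this abstract: it assumes the calibration attributed to Sellke, Bayarri and Berger, under which the Bayes factor in favour of the alternative is at most 1 / (-e p ln p) for p < 1/e. The function name is hypothetical.

```python
import math


def bayes_factor_upper_bound(p):
    """Upper bound on the Bayes factor favouring the alternative hypothesis.

    Uses the assumed calibration 1 / (-e * p * ln p), which is only
    defined for 0 < p < 1/e.
    """
    if not 0 < p < 1 / math.e:
        raise ValueError("p must lie strictly between 0 and 1/e")
    return 1 / (-math.e * p * math.log(p))


bound_05 = bayes_factor_upper_bound(0.05)
bound_005 = bayes_factor_upper_bound(0.005)
print(f"p = 0.05:  Bayes factor at most {bound_05:.1f}")
print(f"p = 0.005: Bayes factor at most {bound_005:.1f}")
```

Under this bound, p = 0.05 corresponds to a Bayes factor of at most about 2.5, while p = 0.005 allows up to about 14. On commonly used Bayes factor scales the former is weak evidence, which matches the point made in the citation statements above.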

