We propose to change the default P-value threshold for statistical significance from 0.05 to 0.005 for claims of new discoveries.
The lack of reproducibility of scientific studies has caused growing concern over the credibility of claims of new discoveries based on 'statistically significant' findings. There has been much progress toward documenting and addressing several causes of this lack of reproducibility (for example, multiple testing, P-hacking, publication bias and under-powered studies). However, we believe that a leading cause of non-reproducibility has not yet been adequately addressed: statistical standards of evidence for claiming new discoveries in many fields of science are simply too low. Associating statistically significant findings with P < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural and reporting problems.
For fields where the threshold for defining statistical significance for new discoveries is P < 0.05, we propose a change to P < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields. Results that would currently be called significant but do not meet the new threshold should instead be called suggestive. While statisticians have known the relative weakness of using P ≈ 0.05 as a threshold for discovery and the proposal to lower it to 0.005 is not new [1,2], a critical mass of researchers now endorse this change.
We restrict our recommendation to claims of discovery of new effects. We do not address the appropriate threshold for confirmatory or contradictory replications of existing claims. We also do not advocate changes to discovery thresholds in fields that have already adopted more stringent standards (for example, genomics and high-energy physics research; see the 'Potential objections' section below).
We also restrict our recommendation to studies that conduct null hypothesis significance tests.
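The claim that P < 0.05 yields a high rate of false positives, even without any procedural problems, can be illustrated with a short calculation. The sketch below is not taken from the text above: the function name `false_positive_risk`, the prior odds of 1:10 that a tested effect is real, and the power of 0.8 are all illustrative assumptions. It also holds power fixed across both thresholds, which is a simplification, because at a fixed sample size a stricter threshold lowers power.

```python
def false_positive_risk(alpha: float, power: float, prior_true: float) -> float:
    """Share of 'significant' results that are false positives.

    alpha      -- significance threshold (probability of rejecting a true null)
    power      -- probability of detecting an effect that really exists
    prior_true -- assumed fraction of tested hypotheses that are real effects
    """
    if not (0 < alpha < 1 and 0 < power <= 1 and 0 < prior_true < 1):
        raise ValueError("alpha and prior_true must lie in (0, 1); power in (0, 1]")
    false_positives = alpha * (1 - prior_true)  # true nulls that pass the threshold
    true_positives = power * prior_true         # real effects that pass the threshold
    return false_positives / (false_positives + true_positives)


# Illustrative assumptions only (not values taken from the text above).
PRIOR_TRUE = 1 / 11  # prior odds of 1:10 that a tested effect is real
POWER = 0.8          # held constant for simplicity

for alpha in (0.05, 0.005):
    risk = false_positive_risk(alpha, POWER, PRIOR_TRUE)
    print(f"threshold {alpha}: false positive risk = {risk:.1%}")
```

Under these assumed inputs, roughly 38% of results significant at 0.05 would be false positives, against roughly 6% at 0.005. Different priors or power give different figures, so the numbers illustrate the direction of the effect rather than its size in any particular field.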
We have diverse views about how best to improve reproducibility, and many of us believe that other ways of summarizing the data, such as Bayes factors or other posterior summaries based on clearly articulated model assumptions, are preferable to P values. However, changing the P-value threshold is simple, aligns with the training undertaken by many researchers, and might quickly achieve broad acceptance.
"We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005."
Interpersonal communication presents a methodological challenge and a research opportunity for researchers involved in field experiments. The challenge is that communication among subjects blurs the line between treatment and control conditions. When treatment effects are transmitted from subject to subject, the stable unit treatment value assumption (SUTVA) is violated, and comparison of treatment and control outcomes may provide a biased assessment of the treatment’s causal influence. Social scientists are increasingly interested in the substantive phenomena that lead to SUTVA violations, such as communication in advance of an election. Experimental designs that gauge SUTVA violations provide useful insights into the extent and influence of interpersonal communication. This article illustrates the value of one such design, a multilevel experiment in which treatments are randomly assigned to individuals and varying proportions of their neighbors. After describing the theoretical and statistical underpinnings of this design, we apply it to a large‐scale voter‐mobilization experiment conducted in Chicago during a special election in 2009 using social‐pressure mailings that highlight individual electoral participation. We find some evidence of within‐household spillovers but no evidence of spillovers across households. We conclude by discussing how multilevel designs might be employed in other substantive domains, such as the study of deterrence and policy diffusion.
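A multilevel design of the kind described above can be sketched as a two-stage randomization: each group is first assigned a treatment share at random, and that share of its members is then treated at random. This is a generic illustration, not the authors' actual assignment procedure. The function name `assign_multilevel`, the household labels, and the candidate shares are all hypothetical.

```python
import random


def assign_multilevel(groups, shares, seed=0):
    """Two-stage random assignment.

    groups -- dict mapping a group id to a list of unit ids
    shares -- candidate treatment proportions, one drawn per group
    Returns a dict mapping unit id -> (group id, group share, treated flag).
    """
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    assignment = {}
    for group_id, members in groups.items():
        share = rng.choice(shares)                # stage 1: group-level share
        n_treated = round(share * len(members))
        treated = set(rng.sample(members, n_treated))  # stage 2: units within group
        for unit in members:
            assignment[unit] = (group_id, share, unit in treated)
    return assignment


# Hypothetical example data: three households on one block.
households = {
    "hh1": ["a1", "a2"],
    "hh2": ["b1", "b2", "b3", "b4"],
    "hh3": ["c1", "c2"],
}
plan = assign_multilevel(households, shares=[0.0, 0.5, 1.0], seed=42)
for unit, (group_id, share, is_treated) in sorted(plan.items()):
    print(unit, group_id, share, is_treated)
```

Comparing untreated units in groups with a high treatment share against untreated units in groups with a zero share is what lets a design like this gauge spillovers, because the two sets differ only in how many of their neighbors were treated.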
Investigations of American politics have increasingly turned to analyses of political networks to understand public opinion, voting behavior, the diffusion of policy ideas, bill sponsorship in the legislature, interest group coalitions and influence, party factions, institutional development, and other empirical phenomena. While the association between political networks and political behavior is well established, clear causal inferences are often difficult to make. This article consists of five independent essays that address practical problems in making causal inferences from studies of political networks. They consider egocentric studies of national probability samples, sociocentric studies of political communities, measurement error in elite surveys, field experiments on networks, and triangulating on causal processes.
Identifying causal effects attributable to network membership is a key challenge in empirical studies of social networks. In this article, we examine the consequences of endogeneity for inferences about the effects of networks on network members' behavior. Using the House office lottery (in which newly elected members select their office spaces in a randomly chosen order) as an instrumental variable to estimate the causal impact of legislative networks on roll call behavior and cosponsorship decisions in the 105th–112th Houses, we find no evidence that office proximity affects patterns of legislative behavior. These results contrast with decades of congressional scholarship and recent empirical studies. Our analysis demonstrates the importance of accounting for selection processes and omitted variables in estimating the causal impact of networks.
To what extent did the extensive flooding caused by Hurricane Katrina affect voter participation in the 2006 mayoral election? This article uses voting record data from 20 election cycles, GIS-coded flood-depth data, and census data to examine the voting behavior of registered voters in New Orleans before and after Hurricane Katrina. We use a variety of statistical techniques, primarily propensity score matching methods, to examine how flooding affected mayoral turnout. We find that flooding decreased participation, but registered voters who experienced more than 6 ft of flooding were more likely to participate in the election than those who experienced less flooding. This finding confirms that increasing the cost of voting decreases turnout and suggests several mechanisms motivating an expressive component of voting behavior. Our results indicate there is a complex relationship between participation and the costs and benefits of turnout. Our findings about the characteristics of the voters who participated in the mayoral election provide insights into the scope of change for the political landscape of New Orleans.