When developing new headache treatments or when discovering relationships among important variables, it is often necessary to infer characteristics of a large population from a sample of observed data. For example, when testing a new headache treatment vs placebo in a sample of 100 individuals, do the observed differences in the sample provide high confidence that the treatment will also work in the population? Because samples contain random sampling error, researchers require a set of methods to help decide whether any observed treatment difference, or observed relationship, is simply due to chance. Statistical inference refers to those methods that allow the estimation of population properties (eg, a treatment effect) from observed samples. Very often, this process takes the form of formal hypothesis testing. Although there are many ways a researcher could investigate a hypothesis, by far the most common in medical research is some form of statistical hypothesis testing.

Statistical hypothesis testing is a set of methods for statistical inference that has a fascinating and contentious history (see Lenhard[1]). A famous debate raged for decades among the early creators of these methods about the proper application of the emerging technique that would eventually become the most popular tool for statistical inference.[2] The methods most commonly used today are a blend of the "significance test" developed by Fisher[3] and the "hypothesis test" developed by Neyman and Pearson.[4] Although modern application of statistical hypothesis testing has evolved over time, perhaps tending toward the approaches advocated by Neyman and Pearson,[2] a thorough understanding of the principles of significance-based statistical hypothesis testing is crucial for investigators, consumers of research, and even for the growing number of individuals who wish to abandon the use of any hypothesis testing based on these principles.[5]

This editorial is the next in the Journal's methods and statistics primer series.[6][7][8][9][10][11] In this installment, we introduce the concept of significance-based statistical hypothesis testing and the use of this form of statistical inference in headache research. We also describe common problems encountered when applying and reporting findings related to this form of hypothesis testing.
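
The sampling-error point in the opening example (a treatment vs placebo comparison in 100 individuals) can be illustrated with a small simulation. The sketch below is not taken from the editorial; it assumes a hypothetical outcome (monthly headache days, normally distributed with a mean of 10 and a standard deviation of 4) that is identical in both groups, so every observed difference between groups arises from chance alone. The function names and all numeric settings are illustrative assumptions.

```python
import random
import statistics


def simulate_null_differences(n_per_group=50, n_trials=2000, seed=0):
    """Simulate trials in which treatment and placebo have NO true difference.

    Assumed (hypothetical) outcome: monthly headache days, drawn from the
    same normal distribution (mean 10, SD 4) in both groups.
    Returns the observed mean difference (treatment - placebo) per trial.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    diffs = []
    for _ in range(n_trials):
        treatment = [rng.gauss(10, 4) for _ in range(n_per_group)]
        placebo = [rng.gauss(10, 4) for _ in range(n_per_group)]
        diffs.append(statistics.mean(treatment) - statistics.mean(placebo))
    return diffs


def fraction_at_least(diffs, threshold):
    """Share of simulated trials whose absolute difference reaches the threshold."""
    return sum(abs(d) >= threshold for d in diffs) / len(diffs)


diffs = simulate_null_differences()
print(f"Largest chance difference: {max(abs(d) for d in diffs):.2f} days")
print(f"Share of trials with |difference| >= 1 day: {fraction_at_least(diffs, 1.0):.3f}")
```

Under these assumed settings, the standard error of the difference is about 0.8 days, so roughly one in five simulated trials should show a difference of a full headache day or more even though the treatment does nothing. Statistical inference exists to quantify exactly this kind of chance variation.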
DEFINING THE ISSUE

Most people who have read a headache research article have seen the signs that significance-based hypothesis testing has been conducted. The use of P values (eg, P < .05), the term "statistically significant," and the array of statistical tests (eg, ANOVA, t-tests) all convey that investigators are testing a hypothesis to make an inference about some population. To understand what these terms indicate, it is important to grasp that all such hypothesis tests are attempts to refute, rather than prove, something.[12] This reasoning is at first counterintuitive but becomes clearer when the idea of the null hypothesis is fully understood.

The Null Hypothesis (H0).-Significance-based hypothe...