New goodness-of-fit indices are introduced for dichotomous item response theory (IRT) models. These indices are based on the likelihoods of number-correct scores derived from the IRT model, and they provide a direct comparison of the modeled and observed frequencies for correct and incorrect responses for each number-correct score. The behavior of Pearson's $\chi^2$ ($S$-$X^2$) and the likelihood ratio $G^2$ ($S$-$G^2$) was assessed in a simulation study and compared with two fit indices similar to those currently in use ($Q_1$-$X^2$ and $Q_1$-$G^2$). The simulations included three conditions in which the simulating and fitting models were identical and three conditions involving model misspecification. $S$-$X^2$ performed well, with Type I error rates close to the expected .05 and .01 levels. Performance of this index improved with increased test length. $S$-$G^2$ tended to reject the null hypothesis too often, as did $Q_1$-$X^2$ and $Q_1$-$G^2$. The power of $S$-$X^2$ appeared to be similar for all test lengths, but varied depending on the type of model misspecification.

Index terms: chi-square distribution, dichotomous items, goodness-of-fit, item fit, item response theory (item fit), likelihood ratio statistic, Pearson statistic.

Item response theory (IRT) is a collection of modeling techniques for the analysis of items, tests, and persons. An IRT model for dichotomous item responses generally specifies that the probability of response pattern $\mathbf{x}$ is

$$P(\mathbf{x}) = \int \prod_{i} T_i(\theta)^{x_i}\,[1 - T_i(\theta)]^{1 - x_i}\,\phi(\theta)\,d\theta,$$

where $\mathbf{x}$ is the response vector with elements $x_i \in \{0, 1\}$, $T_i(\theta)$ is the probability of a correct response on item $i$ as a function of the trait $\theta$, and $\phi(\theta)$ is the population distribution for $\theta$. The three-parameter logistic model (3PLM),

$$T_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp[-a_i(\theta - b_i)]},$$

where $a_i$ is the slope parameter, $b_i$ is the location parameter, and $c_i$ is the lower asymptote parameter, is commonly used for multiple-choice items (see Lord, 1980). In some situations, the two-parameter logistic model (2PLM) is used. This model is equivalent to the 3PLM with $c_i = 0$. The one-parameter logistic model (1PLM) additionally restricts $a_i = a$ for all items. IRT applications are based on estimating model parameters ($a_i$, $b_i$, and $c_i$). These estimates are usually obtained by maximizing the likelihood
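The model definitions above can be illustrated with a minimal Python sketch. It evaluates the 3PLM response function and approximates the marginal probability of a response pattern by numerical quadrature. The function names (`three_pl`, `pattern_probability`), the standard normal choice for the population distribution, the equally spaced grid, and the item parameter values are all assumptions made for illustration; none of them come from the article.

```python
import math


def three_pl(theta, a, b, c=0.0):
    """Probability of a correct response under the 3PLM.

    Setting c = 0 gives the 2PLM; using a common slope a for all
    items as well gives the 1PLM.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def pattern_probability(x, items, n_points=61, lo=-6.0, hi=6.0):
    """Approximate the marginal probability of response pattern x.

    x     : list of 0/1 item responses
    items : list of (a, b, c) tuples, one per item (hypothetical values)

    Assumption: the population distribution is standard normal and is
    replaced by normalized weights on an equally spaced grid.
    """
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + k * step for k in range(n_points)]
    weights = [math.exp(-0.5 * t * t) for t in nodes]
    total_weight = sum(weights)

    prob = 0.0
    for t, w in zip(nodes, weights):
        # Conditional likelihood of the pattern at this trait value.
        like = 1.0
        for xi, (a, b, c) in zip(x, items):
            p = three_pl(t, a, b, c)
            like *= p if xi == 1 else (1.0 - p)
        prob += like * w / total_weight
    return prob


# Hypothetical three-item test.
example_items = [(1.0, -0.5, 0.2), (1.5, 0.0, 0.2), (0.8, 1.0, 0.25)]
print(pattern_probability([1, 1, 0], example_items))
```

Because the quadrature weights are normalized, the approximated probabilities of all possible response patterns sum to 1, which is a convenient sanity check on any implementation of this kind.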
Williams, Jones, and Tukey (1999) showed that a sequential approach to controlling the false discovery rate in multiple comparisons, due to Benjamini and Hochberg (1995), yields much greater power than the widely used Bonferroni technique that limits the familywise Type I error rate. The Benjamini-Hochberg (B-H) procedure has since been adopted for use in reporting results from the National Assessment of Educational Progress (NAEP), as well as in other research applications. This short note illustrates that the B-H procedure is extremely simple to implement using widely available spreadsheet software. Given its easy implementation, it is feasible to include the B-H procedure in introductory instruction in inferential statistics, augmenting or replacing the Bonferroni technique.
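To make the simplicity of the procedure concrete, here is a minimal Python sketch of the standard Benjamini-Hochberg step-up rule: sort the p-values, find the largest rank k whose p-value is at most (k/m)q, and reject the k hypotheses with the smallest p-values. The function name `benjamini_hochberg`, the parameter name `q`, and the example p-values are illustrative assumptions; the spreadsheet layout described in the note is not reproduced in this excerpt.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Standard step-up rule controlling the false discovery rate at q:
    with m p-values sorted in ascending order, find the largest rank k
    such that p_(k) <= (k / m) * q, then reject the k smallest.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank that meets its critical value.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from four comparisons.
print(benjamini_hochberg([0.04, 0.045, 0.03, 0.049], q=0.05))
```

In this hypothetical example the largest p-value (0.049) falls below its critical value of 0.05, so the step-up rule rejects all four hypotheses, whereas a Bonferroni cutoff of 0.05/4 = 0.0125 would reject none of them.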
Statistical methods designed for categorical data were used to perform confirmatory factor analyses and item response theory (IRT) analyses of the Fear of Negative Evaluation scale (FNE; D. Watson & R. Friend, 1969) and the Brief FNE (BFNE; M. R. Leary, 1983). Results suggested that a 2-factor model fit the data better for both the FNE and the BFNE, although the evidence was less strong for the FNE. The IRT analyses indicated that although both measures had items with good discrimination, the FNE items discriminated only at lower levels of the underlying construct, whereas the BFNE items discriminated across a wider range. Convergent validity analyses indicated that the straightforwardly worded items on each scale had significantly stronger relationships with theoretically related measures than did the reverse-worded items. On the basis of all analyses, use of the straightforwardly worded BFNE factor is recommended for the assessment of fear of negative evaluation.