Fit of the model to the data is important if the benefits of item response theory (IRT) are to be obtained. In this study, the authors compared model selection results using the likelihood ratio test, two information-based criteria, and two Bayesian methods. An example illustrated the potential for inconsistency in model selection depending on which of the indices was used. Results from a simulation study indicated that the inconsistencies among the indices were common but that model selection was relatively accurate for longer tests administered to larger samples of examinees. The cross-validation log-likelihood (CVLL) appeared to work the best of the five indices for the conditions simulated in this study.

Item response theory (IRT) consists of a family of mathematical models designed to describe the performance of examinees on test items. Selection of an appropriate IRT model given a data set is based, in part, on model-data fit, which is necessary if the benefits of IRT are to be obtained, and also on the degree of model complexity. A model more complicated than necessary violates the principle of parsimony. Essentially, one should select the simplest model that still explains the data well. A model chosen according to this principle has less chance of introducing inconsistencies, ambiguities, and redundancies. The objective in model selection is to choose a model that not only provides sound fit to the data but also generalizes to predictions of future or different data.

When unidimensional IRT models are nested, selection can be done using a likelihood ratio (LR) test to compare the relative fit of the models. The LR test statistic, G², is a chi-square-based statistic calculated as the difference of the deviances of the two compared models.
The difference for two models is itself distributed as a chi-square and so can be subjected to significance tests to determine which model fits better (Anderson, 1973; Baker & Kim, 2004; Bock & Aitkin, 1981).

When IRT models are not nested, an alternative approach is to investigate model selection using information-based statistics such as Akaike's information criterion (AIC; Akaike, 1974) or Schwarz's Bayesian information criterion (BIC; Schwarz, 1978). Although significance tests are not possible with these statistics, they do provide estimates of the relative differences between solutions. These statistics are appropriate when maximum likelihood estimates of model parameters are obtained. As Lin and Dayton (1997), Lord (1975), and Sahu (2002) note, however, asymptotic estimates of item parameters may not always be available, in which case neither AIC nor BIC is appropriate. For such situations, Bayesian parameter estimation can sometimes be an effective alternative. Bayesian estimates of model parameters are often obtained using Markov chain Monte Carlo (MCMC) estimation.
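The G², AIC, and BIC comparisons described above can be sketched as follows. This is a minimal illustration, not code from the study; the function name and the example log-likelihoods are hypothetical.

```python
import math

def compare_models(loglik_simple, k_simple, loglik_complex, k_complex, n):
    """Compare two IRT models fitted to the same response data.

    loglik_*: maximized log-likelihoods; k_*: free-parameter counts;
    n: number of examinees (enters the BIC penalty).
    """
    # G2 is the difference of the two deviances (-2 log L); for nested
    # models it is referred to a chi-square distribution with df equal
    # to the difference in parameter counts.
    g2 = -2.0 * (loglik_simple - loglik_complex)
    df = k_complex - k_simple

    # Information criteria: smaller is better; both penalize complexity,
    # BIC more heavily as n grows.
    aic = {"simple": -2.0 * loglik_simple + 2.0 * k_simple,
           "complex": -2.0 * loglik_complex + 2.0 * k_complex}
    bic = {"simple": -2.0 * loglik_simple + k_simple * math.log(n),
           "complex": -2.0 * loglik_complex + k_complex * math.log(n)}
    return g2, df, aic, bic
```

With the hypothetical values `compare_models(-10500.0, 40, -10450.0, 80, 1000)`, G² = 100 on 40 df, AIC prefers the complex model while BIC prefers the simple one: exactly the kind of inconsistency among indices the study examines.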
The utility of Orlando and Thissen's (2000, 2003) S-X² fit index was extended to the model-fit analysis of the graded response model (GRM). The performance of a modified S-X² in assessing item fit of the GRM was investigated in light of empirical Type I error rates and power with a simulation study having various conditions typically encountered in applied testing situations. The results show that the Type I error rates of S-X² were controlled adequately around the nominal alpha. The power of the S-X² statistic was much lower when the source of misfit was multidimensionality than when it was due to discrepancy from the true GRM curves. Once the data size increased sufficiently, however, appropriate power was obtained regardless of the source of the item misfit. In summary, the generalized S-X² appears to be a promising index for investigating item fit for polytomous items in educational and psychological assessments.
Orlando and Thissen's S-X² item fit index has performed better than traditional item fit statistics such as Yen's Q1 and McKinley and Mills' G2 for dichotomous item response theory (IRT) models. This study extends the utility of S-X² to polytomous IRT models, including the generalized partial credit model, partial credit model, and rating scale model. The performance of the generalized S-X² in assessing item model fit was studied in terms of empirical Type I error rates and power and compared to G2. The results suggest that the generalized S-X² is promising for polytomous items in educational and psychological testing programs.
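As background for the two abstracts above: for a dichotomous item, S-X² compares, within groups of examinees formed by summed score, the observed proportion answering the item correctly with the model-predicted proportion. A minimal sketch, assuming the group counts and predicted proportions have already been computed (the predictions require the Lord-Wingersky summed-score recursion, not reproduced here); the function and variable names are illustrative, not from the papers.

```python
def s_x2(n_k, o_k, e_k):
    """Pearson-type S-X2 statistic for one dichotomous item.

    For each summed-score group k:
      n_k: number of examinees in the group,
      o_k: observed proportion answering the item correctly,
      e_k: model-predicted proportion correct (computed elsewhere from
           the fitted IRT model via the summed-score recursion).
    """
    return sum(n * (o - e) ** 2 / (e * (1.0 - e))
               for n, o, e in zip(n_k, o_k, e_k))
```

The statistic is referred to a chi-square distribution with degrees of freedom equal to the number of score groups minus the number of estimated item parameters; sparse score groups are collapsed before computation.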
This paper compares three methods of item calibration that are frequently used for linking item parameters to a base scale: concurrent calibration, separate calibration with linking, and fixed item parameter calibration (FIPC). Concurrent and separate calibrations were implemented using BILOG-MG. The Stocking and Lord (1983) characteristic curve method of parameter linking was used in conjunction with separate calibration. FIPC was implemented using both BILOG-MG and PARSCALE because the method is carried out differently by the two programs: both use multiple EM cycles, but BILOG-MG does not update the prior ability distribution during FIPC calibration, whereas PARSCALE updates the prior ability distribution multiple times. The methods were compared using simulations based on actual testing program data, and results were evaluated in terms of recovery of the underlying ability distributions, the item characteristic curves, and the test characteristic curves. Factors manipulated in the simulations were sample size, ability distributions, and numbers of common (or fixed) items. The results for concurrent calibration and separate calibration with linking were comparable, and both methods showed good recovery results for all conditions. Between the two fixed item parameter calibration procedures, only the appropriate use of PARSCALE consistently provided item parameter linking results similar to those of the other two methods.
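The Stocking and Lord characteristic curve method mentioned above chooses a slope A and intercept B for the scale transformation θ* = Aθ + B so that the test characteristic curve (TCC) of the transformed common items matches the base-scale TCC as closely as possible. Below is a rough sketch for the 2PL case, using a coarse grid search in place of the numerical optimizer a real implementation would use; all names and values are illustrative.

```python
import math

def p2pl(theta, a, b):
    """2PL item response function (logistic metric, D = 1.7)."""
    return 1.0 / (1.0 + math.exp(-1.7 * a * (theta - b)))

def stocking_lord(base_items, new_items, thetas):
    """Find slope A and intercept B placing new-form common-item
    parameters on the base scale (theta* = A*theta + B) by minimizing
    the squared difference between test characteristic curves over a
    grid of theta points.  Coarse grid search for illustration only.
    base_items / new_items: lists of (a, b) for the common items.
    """
    tcc_base = [sum(p2pl(t, a, b) for a, b in base_items) for t in thetas]
    best_a, best_b, best_loss = 1.0, 0.0, float("inf")
    for A in (x / 100.0 for x in range(50, 201)):        # A in [0.5, 2.0]
        for B in (x / 100.0 for x in range(-200, 201)):  # B in [-2.0, 2.0]
            loss = 0.0
            for t, tb in zip(thetas, tcc_base):
                # Rescaled item parameters: a* = a / A, b* = A*b + B
                tn = sum(p2pl(t, a / A, A * b + B) for a, b in new_items)
                loss += (tb - tn) ** 2
            if loss < best_loss:
                best_a, best_b, best_loss = A, B, loss
    return best_a, best_b
```

If the new-form parameters are an exact rescaling of the base-form parameters, the search recovers the generating A and B up to the grid resolution.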
This study examines the utility of four indices for use in model selection with nested and nonnested polytomous item response theory (IRT) models: a cross-validation index and three information-based indices. Four commonly used polytomous IRT models are considered: the graded response model, the generalized partial credit model, the partial credit model, and the rating scale model. In a simulation study, comparisons among the four indices suggest that model selection is dependent to some extent on the particular conditions simulated. Overall, the Bayesian information criterion index appears to be most accurate in selecting the correct polytomous IRT model. Results are presented from analysis of a real data set to illustrate the use of the four indices for selecting an appropriate model.
We analyzed the population genetic structure and demographic history of 20 Lymantria dispar populations from Far East Asia using microsatellite loci and mitochondrial genes. In the microsatellite analysis, the genetic distances based on pairwise FST values ranged from 0.0087 to 0.1171. A NeighborNet network based on pairwise FST genetic distances showed that the 20 regional populations were divided into five groups. Bayesian clustering analysis (K = 3) demonstrated the same groupings. The populations in the Korean Peninsula and adjacent regions, in particular, showed a mixed genetic pattern. In the mitochondrial genetic analysis based on 98 haplotypes, the median-joining network exhibited a star shape that was focused on three high-frequency haplotypes (Haplotype 1: central Korea and adjacent regions, Group 1; Haplotype 37: southern Korea, Group 2; and Haplotype 90: Hokkaido area, Group 3) connected by low-frequency haplotypes. The mismatch distribution for each of the three groups was unimodal. In the neutrality tests, Tajima's D and Fu's Fs were negative. We can thus infer that the Far East Asian populations of L. dispar underwent a sudden population expansion. Based on the age expansion parameter, the expansion time was inferred to be approximately 53,652 years before present (ybp) for Group 1, approximately 65,043 ybp for Group 2, and approximately 76,086 ybp for Group 3. We propose that the mixed genetic pattern of the inland populations of Far East Asia is due to these expansions and that the inland populations of the region should be treated as valid subspecies that are distinguishable from other subspecies by genetic traits.
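Expansion times like those reported above derive from the standard mismatch-distribution relation t = τ/(2u), where τ is the age expansion parameter and u is the mutation rate per sequence per generation. A sketch with made-up input values; the study's actual τ, mutation rate, and generation time are not given in the abstract.

```python
def expansion_time_years(tau, mu_per_site, seq_len, gen_time_years):
    """Time since a sudden expansion from the mismatch-distribution age
    parameter: t (in generations) = tau / (2u), where u is the mutation
    rate per sequence per generation (per-site rate times sequence
    length); converted to years via the generation time.  All inputs in
    the example below are hypothetical, not the study's estimates.
    """
    u = mu_per_site * seq_len
    t_generations = tau / (2.0 * u)
    return t_generations * gen_time_years
```

For example, τ = 3.0 with a per-site rate of 1e-8, a 1,500 bp sequence, and one generation per year gives an expansion roughly 100,000 years before present.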
DNA barcoding and morphological analyses of Korean Lymantria (Erebidae, Lepidoptera) were conducted for quarantine inspection. In DNA barcoding, specimens identified as Lymantria dispar through quarantine inspection were distinguished as three species: L. dispar asiatica, L. albescens, and L. xylina. Lymantria monacha, known as a single species in Korea, was revealed to comprise three species: L. monacha, L. minomonis, and L. sugii. At the subspecies level, L. dispar dispar formed a single cluster, whereas L. d. asiatica and L. d. japonica formed a cluster containing both subspecies. In morphological re-examination of the DNA barcoding results, L. dispar was distinguished from L. albescens by wing pattern and from L. xylina by the papillae anales. L. monacha and its related species were difficult to distinguish from one another by wing pattern but were easily separated by comparison of genitalia. Therefore, DNA barcoding provided accurate identification at the species level, but at the subspecies level only geographically distant taxa were discriminated from the others. These results provide a taxonomic outline of the Korean Lymantria fauna and may be used as an identification reference for Lymantria species during quarantine inspection.