Polytomous item response theory (IRT) models such as the graded response model (GRM) and the generalized partial credit model (GPCM) are increasingly used to inform instrument design and validation in social and educational contexts where rating scales are common. However, the performance of these models has not been fully investigated or compared under survey-specific conditions such as short test length, small sample size, and missing data. The purpose of the current simulation study is to inform the literature and guide the implementation of the GRM and GPCM under these conditions. For item parameter estimation, results suggest a sample size of at least 300 and/or an instrument length of at least five items for both models. The performance of the GPCM is stable across instrument lengths, while that of the GRM improves notably as instrument length increases. For person parameters, the GRM yields more accurate estimates when the proportion of missing data is small, whereas the GPCM is favored when missingness is substantial. Further, comparing the GRM and GPCM on the basis of test information is not recommended. Relative model fit indices (AIC, BIC, log-likelihood) may lack power when the sample size is below 300 and the instrument is shorter than five items. A synthesis of the patterns in the results, along with recommendations for implementing polytomous IRT models, is presented and discussed.
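For reference, the category response functions of the two models compared above can be sketched as follows. This is illustrative Python, not the estimation code used in the study; the parameter values in the usage example are hypothetical.

```python
import math

def grm_probs(theta, a, bs):
    # Graded response model (GRM): boundary probabilities
    # P(X >= k) = logistic(a * (theta - b_k)); category probabilities
    # are differences of adjacent boundaries. bs must be increasing.
    star = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs] + [0.0]
    return [star[k] - star[k + 1] for k in range(len(bs) + 1)]

def gpcm_probs(theta, a, deltas):
    # Generalized partial credit model (GPCM): P(X = k) is proportional to
    # exp(sum_{j<=k} a * (theta - delta_j)), with the empty sum = 0 for k = 0.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + a * (theta - d))
    denom = sum(math.exp(c) for c in cum)
    return [math.exp(c) / denom for c in cum]

# Hypothetical item: discrimination 1.2, three thresholds -> four categories.
grm_p = grm_probs(0.0, 1.2, [-1.0, 0.0, 1.0])
gpcm_p = gpcm_probs(0.0, 1.2, [-1.0, 0.0, 1.0])
```

The key structural difference is visible here: the GRM defines category probabilities as differences of cumulative logistic curves, whereas the GPCM builds them from adjacent-category comparisons, which is one reason their information functions are not directly comparable.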
The Academic Resilience Scale (ARS) was developed to measure resilience factors in educational contexts. However, it remains unclear whether the scale can be used to obtain a unidimensional academic resilience score or only multidimensional academic resilience factor scores. How a scale is scored affects the validity of inferences based on its scores in research and practice. This study uses confirmatory factor analysis (CFA) and ancillary bifactor measures to examine the dimensionality of the scale. There was insufficient support for using the scale to obtain a unidimensional academic resilience score; rather, the scale should be treated only as a measure of multiple dimensions of academic resilience.
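Two of the ancillary bifactor measures commonly used for such a dimensionality decision, omega hierarchical and explained common variance (ECV), can be sketched from a standardized bifactor solution. This is an illustrative computation under an assumed bifactor structure (each item loads on the general factor and exactly one group factor); the loadings in the usage example are hypothetical, not the ARS results.

```python
def bifactor_indices(loadings):
    # loadings: one (general, group_id, specific) tuple per item,
    # from a standardized bifactor solution.
    sum_g = sum(g for g, _, _ in loadings)
    group_sums = {}
    for _, gid, s in loadings:
        group_sums[gid] = group_sums.get(gid, 0.0) + s
    # Uniqueness per item = 1 - general^2 - specific^2 (standardized).
    uniq = sum(1.0 - g * g - s * s for g, _, s in loadings)
    total_var = sum_g ** 2 + sum(v ** 2 for v in group_sums.values()) + uniq
    # Omega hierarchical: share of total-score variance due to the general factor.
    omega_h = sum_g ** 2 / total_var
    # ECV: share of common variance explained by the general factor.
    ecv = sum(g * g for g, _, _ in loadings) / sum(
        g * g + s * s for g, _, s in loadings)
    return omega_h, ecv

# Hypothetical six-item scale with two group factors.
items = [(0.7, 0, 0.3)] * 3 + [(0.7, 1, 0.3)] * 3
omega_h, ecv = bifactor_indices(items)
```

High values of both indices would support unidimensional scoring; the study reports that support of this kind was lacking for the ARS.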
Item parameter recovery in the compensatory multidimensional graded response model (MGRM) under simple and complex structures with rating-scale item response data was examined. A simulation study investigated factors that influence the precision of item parameter estimation, including sample size, intercorrelation between the dimensions, and test length, for the MGRM under balanced and unbalanced complex structures as well as the simple structure. Item responses for the MGRM were generated and analyzed across conditions using the R package mirt. Bias and root mean square error (RMSE) were used to evaluate item parameter recovery. Results suggested that item parameter estimation was more accurate under the balanced complex structure than under the unbalanced complex or simple structures, especially when the test length was 40 items and the sample size was large. Further, the mean bias and RMSE in the recovery of item threshold estimates along the two dimensions were consistent across all conditions for both balanced and unbalanced complex structures.
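The recovery criteria used in simulation studies of this kind can be sketched as follows; this mirrors the standard definitions of mean bias and RMSE and is not the study's actual code. The generating and estimated values in the usage example are hypothetical.

```python
def bias_and_rmse(true_vals, est_vals):
    # Mean bias and root mean square error of parameter estimates
    # relative to the generating (true) values, across items or replications.
    n = len(true_vals)
    diffs = [e - t for t, e in zip(true_vals, est_vals)]
    bias = sum(diffs) / n          # signed: positive means overestimation
    rmse = (sum(d * d for d in diffs) / n) ** 0.5  # overall estimation error
    return bias, rmse

# Hypothetical generating thresholds vs. estimates from one replication.
bias, rmse = bias_and_rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
```

Bias captures systematic over- or under-estimation, while RMSE combines bias and sampling variability into a single accuracy measure, which is why both are typically reported.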