2019
DOI: 10.1111/jedm.12248

IRT Approaches to Modeling Scores on Mixed‐Format Tests

Abstract: This article considers psychometric properties of composite raw scores and transformed scale scores on mixed‐format tests that consist of a mixture of multiple‐choice and free‐response items. Test scores on several mixed‐format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and classification consistency and accuracy under three item response theory (IRT) frameworks: unidimensional IRT (UIRT), simple structure multidimensional IRT (SS‐MIRT), and bifactor MIRT (BF‐MIRT).
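Conditional standard errors of measurement (CSEM) of the kind evaluated in the abstract are usually derived from the conditional distribution of the summed score given θ. The sketch below covers only the dichotomous UIRT case, using the Lord–Wingersky recursion with hypothetical 2PL item parameters (the values of `a` and `b` are illustrative, not from the article); the mixed-format setting would extend the recursion with polytomous score categories.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL correct-response probability for each item at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def raw_score_dist(theta, a, b):
    """Lord-Wingersky recursion: conditional distribution of the summed
    score X given theta for locally independent dichotomous items."""
    p = p_2pl(theta, a, b)
    dist = np.array([1.0])             # P(X = 0) before any item is added
    for pj in p:
        new = np.zeros(dist.size + 1)
        new[:-1] += dist * (1.0 - pj)  # item answered incorrectly
        new[1:] += dist * pj           # item answered correctly
        dist = new
    return dist                        # dist[x] = P(X = x | theta)

def csem(theta, a, b):
    """Conditional SEM of the raw score: sqrt(Var(X | theta))."""
    dist = raw_score_dist(theta, a, b)
    x = np.arange(dist.size)
    mean = np.sum(x * dist)
    return np.sqrt(np.sum((x - mean) ** 2 * dist))

# Hypothetical item parameters for a four-item illustration.
a = np.array([1.0, 1.2, 0.8, 1.5])
b = np.array([-0.5, 0.0, 0.5, 1.0])
print(csem(0.0, a, b))
```

The recursion yields the full conditional score distribution, which is also what transformed scale scores and classification consistency/accuracy indices require; the raw-score CSEM alone could be obtained more directly as the square root of the summed Bernoulli variances.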

Cited by 8 publications (9 citation statements). References: 43 publications.
“…The findings showed that the bi-factor MIRT model offered a better solution for Grade 1 in terms of model fit as indicated by the lower −2LL value, and model parsimony as indicated by the lower AIC, AICc, BIC, and SABIC values. This result is consistent with previous bi-factor MIRT applications in language assessment (Cai & Kunnan, 2018; Lee et al., 2019; Min & He, 2014), which also reported relatively better model–data fit of the bi-factor MIRT model over unidimensional IRT models and/or multidimensional generalizations of unidimensional IRT models. However, for Grades 2–12, the fit indices and model parsimony indicators tended to lead to different conclusions.…”
Section: Discussion (supporting)
confidence: 92%
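All of the fit indices named in this excerpt are simple functions of the model's maximized log-likelihood, its number of free parameters, and the sample size. A minimal sketch of the standard formulas (the numeric inputs below are hypothetical, not values from the cited studies):

```python
import numpy as np

def fit_indices(loglik, k, n):
    """Model-comparison indices from a maximized log-likelihood.

    loglik : maximized log-likelihood of the fitted IRT model
    k      : number of free parameters
    n      : sample size (number of examinees)
    """
    neg2ll = -2.0 * loglik
    aic = neg2ll + 2.0 * k
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = neg2ll + k * np.log(n)
    sabic = neg2ll + k * np.log((n + 2.0) / 24.0)   # sample-size-adjusted BIC
    return {"-2LL": neg2ll, "AIC": aic, "AICc": aicc,
            "BIC": bic, "SABIC": sabic}

# Hypothetical comparison: a bi-factor MIRT model (more parameters)
# versus a unidimensional IRT model fit to the same data.
print(fit_indices(loglik=-10450.3, k=120, n=2000))
print(fit_indices(loglik=-10590.8, k=60, n=2000))
```

Lower values favor a model; AIC-type and BIC-type indices can point in different directions because BIC penalizes extra parameters more heavily as the sample grows, which is one way grade-level conclusions can diverge.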
“…First, as the proposed methods were developed and applied for dichotomous items only, future research would involve applying them to cases with polytomous or mixed‐format items. These methods could also be extended to more complex IRT models such as multidimensional models (e.g., Kim et al., 2020; Lee et al., 2020). Second, a future simulation study would involve study factors that were not considered in the present study, such as extremely small samples (e.g., Kolen, 2020; Peabody, 2020), vertical scaling, and so on.…”
Section: Summary and Discussion (mentioning)
confidence: 99%
“…Unlike UNT, true‐score equating under the SS‐MIRT framework also requires a multivariate true‐score distribution (see Equation ). Three potential methods appeared in the literature (Lee et al., 2020) to approximate the ability distribution of θ: (a) a quadrature distribution (D‐method), (b) Monte‐Carlo simulation (M‐method), and (c) individual latent‐trait estimates $\widehat{\bm{\theta}}$ (P‐method). In Lee et al.…”
Section: Real Data Illustration (mentioning)
confidence: 99%
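As a rough illustration of the D- and M-methods named in this excerpt, the sketch below approximates the latent distribution of a two-dimensional simple-structure MIRT model with a quadrature grid and with Monte Carlo draws; all item parameters, the latent correlation, and the helper names (`true_score`, `dim`) are hypothetical. The P-method is omitted because it needs individual $\widehat{\bm{\theta}}$ estimates from real calibration data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simple-structure MIRT setup: two correlated dimensions,
# each item loading on exactly one dimension (2PL form).
a = np.array([1.2, 0.9, 1.1, 1.0])    # discriminations
b = np.array([-0.3, 0.4, 0.0, 0.8])   # difficulties
dim = np.array([0, 0, 1, 1])          # dimension measured by each item
corr = 0.7                            # assumed latent correlation
cov = np.array([[1.0, corr], [corr, 1.0]])

def true_score(theta):
    """Expected summed score for an (n, 2) array of theta vectors."""
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, dim] - b)))
    return p.sum(axis=1)

# M-method: Monte Carlo draws from the assumed bivariate normal.
theta_m = rng.multivariate_normal(np.zeros(2), cov, size=100_000)
ts_m = true_score(theta_m)            # simulated true-score distribution

# D-method: quadrature grid with bivariate normal weights.
grid = np.linspace(-4, 4, 41)
t1, t2 = np.meshgrid(grid, grid)
pts = np.column_stack([t1.ravel(), t2.ravel()])
inv = np.linalg.inv(cov)
w = np.exp(-0.5 * np.einsum("ni,ij,nj->n", pts, inv, pts))
w /= w.sum()                          # normalized quadrature weights
ts_d_mean = np.sum(true_score(pts) * w)

# The two approximations of the mean true score should closely agree.
print(ts_m.mean(), ts_d_mean)
```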
“…In Lee et al. (2020), these three methods were discussed as a means to analyze psychometric properties using MIRT models. However, they have not been used or introduced in the equating literature yet.…”
Section: Real Data Illustration (mentioning)
confidence: 99%