2002
DOI: 10.1177/0146621602026001001

Obtaining a Common Scale for Item Response Theory Item Parameters Using Separate Versus Concurrent Estimation in the Common-Item Equating Design

Abstract: Item response theory item parameters can be estimated using data from a common-item equating design either separately for each form or concurrently across forms. This paper reports the results of a simulation study of separate versus concurrent item parameter estimation. Using simulated data from a test with 60 dichotomous items, four factors were considered: (a) estimation program (MULTILOG versus BILOG-MG), (b) sample size per form (3,000 versus 1,000), (c) number of common items (20 versus 10), and (d) equi…

Cited by 166 publications (238 citation statements)
References 7 publications
“…Characteristic curve methods generally performed better than moment methods with regard to preserve both equity properties. This result is consistent with what has been found in past studies which compared characteristic curve methods with moment methods for the dichotomous IRT models (Hanson & Beguin, 2002; Kim & Cohen, 1992; Ogasawara, 2002) and mixture IRT models (Kim, 2004; Kim & Lee, 2004; Kim & Lee, 2006). The characteristic curve methods require an iterative multivariate search procedure but moment methods require simple summary statistics (Kim, 2004).…”
Section: Discussion (supporting)
confidence: 89%
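
The statement above contrasts the two families of linking methods: moment methods need only summary statistics of the common-item parameter estimates. A minimal Python sketch of the mean/sigma and mean/mean methods, using their standard textbook definitions, illustrates this. It is not code from the cited papers; the function names and the idea of passing plain lists of estimates are assumptions made for illustration.

```python
from statistics import mean, pstdev

# Linking convention assumed here: theta_to = A * theta_from + B, so that
# b_to = A * b_from + B and a_to = a_from / A for each common item.

def mean_sigma(b_from, b_to):
    """Mean/sigma method: slope from the ratio of difficulty SDs."""
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def mean_mean(a_from, a_to, b_from, b_to):
    """Mean/mean method: slope from the ratio of mean discriminations."""
    A = mean(a_from) / mean(a_to)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def rescale(a, b, A, B):
    """Place (a, b) estimates from the 'from' scale onto the 'to' scale."""
    return [ai / A for ai in a], [A * bi + B for bi in b]

# Hypothetical common-item estimates, constructed so that the true
# transformation is A = 1.25, B = -0.5 (no estimation error).
a_from = [0.8, 1.0, 1.2, 1.5]
b_from = [-1.0, 0.0, 0.5, 1.5]
a_to = [ai / 1.25 for ai in a_from]
b_to = [1.25 * bi - 0.5 for bi in b_from]

A_ms, B_ms = mean_sigma(b_from, b_to)
A_mm, B_mm = mean_mean(a_from, a_to, b_from, b_to)
print(A_ms, B_ms, A_mm, B_mm)
```

With error-free inputs both methods recover the same constants. Characteristic curve methods instead choose A and B by numerically minimizing a discrepancy between item or test characteristic curves, which is the iterative search the statement refers to.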
“…It is known that characteristic curve scale linking methods produce more accurate results than the moment methods (Baker & Al-Karni, 1991; Hanson & Beguin, 2002; Kim & Cohen, 1992; Kim & Kolen, 2006; Kim & Lee, 2004; Ogasawara, 2001). Could we generalize this evident to mixed-format test equating?…”
Section: The Purpose and Significance Of The Study (mentioning)
confidence: 99%
“…This case could be interpreted in the following way: the tests were efficient to differentiate those at mean levels, but they became distanced from equation when the number of errors increased down-line. The research findings were found to be parallel to the ones in the literature (Cohen and Kim, 1998; Hanson and Beguin, 2002; Kim and Kolen, 2006; Hung, Wu and Chen, 1991; Way and Tang, 1991; Karkee and Wright, 2004; Kaskowitz and De Ayala, 2001; Kim and Lee, 2004; Kim and Lee, 2006; Kim and Kolen, 2004; Kim and Song, 2004). The research results have shown that the lowest equation error could be obtained from characteristic curve methods.…”
Section: Findings Discussion and Results (supporting)
confidence: 77%
“…In this second sample there was a more balanced percentage of men and women (47% and 53%), with a mean and median age of 21.32 and 20. The main reasons for conducting the study with the second sample were: 1) the students' affinity for the visual and spatial concepts of the TAF, given the content of their degree programs, 2) balancing the percentages by gender, and 3) overcoming the limitations of the first study, conducted with the Psychology sample, namely, a) the problem of speed affecting the responses, since in the first study the correlation between total time and total score was significant at the 1% level, and b) a sample size adequate for a Three-Parameter Logistic Model (ML3P), since n > 1000 is recommended (Hanson & Beguin, 2002; Yen, 1987).…”
Section: Method, Participants (unclassified)