1994
DOI: 10.1177/014662169401800203

A Simulation Study of Methods for Assessing Differential Item Functioning in Computerized Adaptive Tests

Abstract: Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel method of differential item functioning (DIF) analysis in computerized adaptive tests (CATs). Each simulated examinee received 25 items from a 75-item pool. A three-parameter logistic item response theory (IRT) model was assumed, and examinees were matched on expected true scores based on their CAT responses and estimated item parameters. The CAT-based DIF statistics were found to be highly correlated with DIF s…
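
As a rough illustration of the quantities the abstract names, the sketch below computes a three-parameter logistic (3PL) response probability, an expected true score over an item pool, and the Mantel-Haenszel D-DIF statistic from stratified 2x2 tables. It is not the authors' implementation: the function names, the scaling constant D = 1.7, and the assumption that examinees have already been grouped into matching strata (for example, by binning expected true scores) are illustrative choices made here.

```python
import math


def p_3pl(theta, a, b, c, D=1.7):
    """3PL probability of a correct response.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    D = 1.7 is the conventional scaling constant (an assumption here).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def expected_true_score(theta, items):
    """Expected number-correct score on a pool; items is a list of (a, b, c)."""
    return sum(p_3pl(theta, a, b, c) for a, b, c in items)


def mh_d_dif(tables):
    """Mantel-Haenszel D-DIF from one 2x2 table per matching stratum.

    Each table is (ref_right, ref_wrong, focal_right, focal_wrong).
    Returns -2.35 * ln(alpha_MH), where alpha_MH is the MH common odds ratio.
    Negative values indicate an item that is harder for the focal group.
    """
    num = 0.0
    den = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in tables:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty strata
        num += ref_right * focal_wrong / n
        den += ref_wrong * focal_right / n
    if num <= 0.0 or den <= 0.0:
        raise ValueError("MH odds ratio is undefined for these tables")
    return -2.35 * math.log(num / den)
```

In a CAT setting the stratum for each examinee would come from `expected_true_score` evaluated at the ability estimate obtained from the adaptive test; how finely to bin those scores is a design choice not specified in the text above.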

Citations: cited by 43 publications (55 citation statements)
References: 15 publications (17 reference statements)

“…In addition they found that pretest DIF statistics were generally well behaved, but the MH DIF statistics tended to have larger standard errors for the pretest items than for the CAT items. Zwick et al (1994) addressed the effect of using alternative matching methods for pretest items. Using a more elegant matching procedure did not lead to a reduction of the MH standard errors and produced DIF measures that were nearly identical to those from the earlier study.…”
Section: Subsequent Developments With the Mantel-Haenszel (MH) Approach (mentioning)
confidence: 99%
“…Wainer (1993) provided an IRT-based effect size of amount of DIF that is based on the STAND weighting system that allows one to weight difference in the item response functions (IRF) in a manner that is proportional to the density of the ability distribution. Zwick et al (1994) and Zwick et al (1995) applied the Rasch model to data simulated according to the 3PL model. They found that the DIF statistics based on the Rasch model were highly correlated with the DIF values associated with the generated data, but that they tended to be smaller in magnitude.…”
Section: Item Response Theory (IRT) (mentioning)
confidence: 99%
“…In addition they found that pretest DIF statistics were generally well behaved, but the MH DIF statistics tended to have larger standard errors for the pretest items than for the CAT items. Zwick, Thayer, and Wingersky (1994) addressed the effect of using alternative matching methods for pretest items. Using a more elegant matching procedure did not lead to a reduction of the MH standard errors and produced DIF measures that were nearly identical to those from the earlier study.…”
Section: Subsequent Developments With the Mantel-Haenszel (MH) Approach (mentioning)
confidence: 99%
“…Wainer (1993) provided an IRT-based effect size of amount of DIF that is based on the STAND weighting system that allows one to weight difference in the item response functions (IRF) in a manner that is proportional to the density of the ability distribution. Zwick et al (1994) and Zwick, Thayer, and Wingersky (1995) Thissen et al (1993). Zwick (1989, 1990) demonstrated that the null definition of DIF for the MH procedure (and hence STAND and other procedures employing observed scores as matching variables) and the null hypothesis based on IRT are different because the latter compares item response curves, which in essence condition on unobserved ability.…”
Section: Item Response Theory (IRT) (mentioning)
confidence: 99%
“…The matching items were subsets of the 75 items used in the simulation conducted by Zwick, Thayer, and Wingersky (1994). (The selection of item parameter values for this earlier study was based on analyses of actual test data.)…”
Section: Matching Items (mentioning)
confidence: 99%