Abstract: One way to support the validity of a test is to check that all of its items function similarly across different groups of examinees. Differential item functioning (DIF) occurs when individuals of equal ability from different groups obtain different results on the same test item. Several methods based on Item Response Theory (IRT) and Classical Test Theory (CTT), each with its own advantages and limitations, are available for identifying items that show DIF. This study aims to compare the performances of five methods…
“…Few published studies have used both (Atalay Kabasakal et al., 2014; Atar, 2007; Dainis, 2008; Erdem Keklik, 2012; Finch & French, 2007). Although MH, LR, Raju's Area Measures, and Lord's χ² techniques are frequently utilized in the literature, to date, there has been little research comparing the Type I error and statistical power of all four techniques at once (Basman, 2023; Sunbul & Omur Sunbul, 2016). In addition, since Type I error can be considered a misidentification of DIF within the scope of item bias, and statistical power reflects the performance of the techniques, this study aimed to investigate the results of the MH, LR, Raju's Area Measures, and Lord's χ² techniques under different conditions and to compare them with each other in terms of their Type I error and statistical power ratios.…”
The main purpose of this study is to examine the Type I error and statistical power ratios of Differential Item Functioning (DIF) techniques based on different theories under different conditions. For this purpose, a simulation study was conducted using the Mantel-Haenszel (MH), Logistic Regression (LR), Lord's χ², and Raju's Area Measures techniques. In the simulation-based research model, the two-parameter item response model, the groups' ability distributions, and the DIF type were fixed conditions, while sample size (1800, 3000), sample size ratio (0.50, 1), test length (20, 80), and rate of DIF-containing items (0, 0.05, 0.10) were manipulated conditions. The total number of conditions is 24 (2 × 2 × 2 × 3), and statistical analysis was performed in the R software. The current study found that the Type I error rates in all conditions were higher than the nominal error level. It was also demonstrated that MH had the highest error rate while Raju's Area Measures had the lowest. At the same time, MH produced the highest statistical power rates. The analysis of the Type I error and statistical power findings showed that techniques based on both theories performed better with a sample size of 1800. Furthermore, the increase in sample size affected the CTT-based techniques more than the IRT-based ones. The findings also demonstrated that the techniques' Type I error rates were lower, and their statistical power rates higher, under conditions where the test length was 80 and the sample sizes were unequal.
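The fully crossed design described above can be sketched by enumerating the manipulated factors; this is only an illustration of how the 24 simulation cells arise from the factor levels given in the abstract, not code from the study itself (which was carried out in R).

```python
from itertools import product

# Manipulated conditions as listed in the abstract.
sample_sizes = [1800, 3000]        # total number of examinees
size_ratios = [0.50, 1.0]          # focal-to-reference group size ratio
test_lengths = [20, 80]            # number of items
dif_rates = [0.0, 0.05, 0.10]      # proportion of DIF-containing items

# The fully crossed design: 2 x 2 x 2 x 3 = 24 cells.
conditions = list(product(sample_sizes, size_ratios, test_lengths, dif_rates))
print(len(conditions))  # 24
```

Each of the 24 cells would then be replicated, with DIF detection run on every replication to estimate Type I error (flagging a DIF-free item) and power (flagging a true DIF item).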
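As a concrete illustration of the CTT-based approach compared in the study, the Mantel-Haenszel technique estimates a common odds ratio for a studied item across ability strata (usually formed from total scores). The sketch below uses the standard textbook formulation with made-up counts; the stratum data and the ETS delta transformation are illustrative, not taken from the study.

```python
import math

# Hypothetical 2x2xK table for one item. Each stratum k holds
# (A, B, C, D) = reference-correct, reference-incorrect,
#                focal-correct, focal-incorrect counts.
strata = [
    (40, 10, 30, 20),  # counts are invented for illustration
    (35, 15, 25, 25),
    (30, 20, 20, 30),
]

# MH common odds ratio: sum(A*D/N) / sum(B*C/N) over strata.
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
alpha_mh = num / den

# ETS delta scale: negative values favor the reference group.
delta_mh = -2.35 * math.log(alpha_mh)

print(round(alpha_mh, 3), round(delta_mh, 3))
```

An odds ratio of 1 (delta of 0) indicates no DIF; in practice the MH chi-square statistic and classification rules (e.g., the ETS A/B/C categories) are used to flag items.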