2012
DOI: 10.1080/15305058.2011.617475

Methodologies for Investigating Item- and Test-Level Measurement Equivalence in International Large-Scale Assessments

Cited by 25 publications
(22 citation statements)
References 24 publications
“…Previous research demonstrated considerable measurement incomparability between countries in international assessments (Ercikan, Roth, & Asil, in press; Kankaraš & Moores, 2013; Oliveri, Ercikan, & Zumbo, 2013; Oliveri et al., 2012). This incomparability existed even between countries administering tests in the same language (Ercikan & McCreith, 2002; Ercikan et al., in press; Roth et al., 2013) and between language groups within countries (Ercikan et al., 2014; Kankaraš & Moores, 2013; Oliveri et al., 2012).…”
Section: Differential Item Functioning Analyses
Citation type: mentioning
confidence: 89%
“…This incomparability existed even between countries administering tests in the same language (Ercikan & McCreith, 2002; Ercikan et al., in press; Roth et al., 2013) and between language groups within countries (Ercikan et al., 2014; Kankaraš & Moores, 2013; Oliveri et al., 2012). It is important to identify whether item scores are comparable across groups since, if item scores are not comparable, the creation of a single scale score intended to represent all groups is not appropriate.…”
Section: Differential Item Functioning Analyses
Citation type: mentioning
confidence: 93%
“…Both techniques require large samples, but even larger samples are generally needed for IRT analyses. It was mentioned earlier that IRT methods can estimate both DIF at the item level and DTF at the level of cumulative responses over items (e.g., Oliveri et al., 2012). I am aware of no similar capability in MGCFA.…”
Section: Structural Equation Modelling Techniques For Analysing Test
Citation type: mentioning
confidence: 97%
“…There are also IRT-based methods that estimate the degree of differential test functioning (DTF), or whether total scores based on cumulative responses over a set of items relate to the underlying (latent) dimension in the same [way] over different populations. Oliveri, Olson, Ercikan, and Zumbo (2012) described the application of parametric and nonparametric forms of IRT and the technique of ordinal logistic regression to simultaneously estimate DIF and DTF over samples of English- versus French-speaking students who completed an objective measure of problem solving. They found that differential functioning at the item level was not generally detected by analysis at the test level.…”
Section: Standard Statistical Techniques For Analysing Test Bias
Citation type: mentioning
confidence: 99%
“…The psychometric property that typically must hold for scores to be comparable is known as measurement invariance (Meredith 1993), absence of differential item functioning (Hambleton et al. 1991; Mellenbergh 1994; Swaminathan & Rogers 1990), or lack of bias (Lord 1980). Regardless of the term used, the literature on scale score equivalence in large-scale achievement tests has received considerable attention (e.g., Ercikan 2002; Hambleton 2002; Oliveri et al. 2012). Many of these investigations have focused on pairwise comparisons of countries Oliveri 2012), the latter of which uses both empirical and simulated data.…”
Section: Introduction
Citation type: mentioning
confidence: 99%