2011
DOI: 10.1080/08957347.2011.607063

Do Different Approaches to Examining Construct Comparability in Multilanguage Assessments Lead to Similar Conclusions?

Abstract: In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test level (examination of test data structure, reliability comparisons, and test characteristic curves) and the item level (differential item functioning, item parameter correlations, and linguistic c…
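
The test characteristic curve comparison named in the abstract can be made concrete with the standard IRT quantities involved. Below is a generic sketch under a two-parameter logistic model; this is a textbook formulation, not necessarily the exact model calibrated in the paper.

```latex
% Item response function under a generic 2PL model (assumption: the
% paper's actual calibration model is not shown in the excerpt above)
P_i(\theta) = \frac{1}{1 + \exp\!\left[-a_i(\theta - b_i)\right]}
% Test characteristic curve: expected number-correct score at ability \theta
T(\theta) = \sum_{i=1}^{n} P_i(\theta)
% Test-level comparability check: estimate T^{E}(\theta) and T^{F}(\theta)
% from the English and French calibrations and compare the two curves.
```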

Cited by 36 publications (24 citation statements, published 2013–2024)
References 27 publications

“…It might also contribute to challenges in identifying sources of DIF documented in previous research (Ercikan et al., 2010; Oliveri & Ercikan, 2011). Multiple studies have been conducted with gender, ethnic, and language comparison groups to improve DIF detection and the identification of its sources by creating more homogeneous groups in measurement comparability research.…”
mentioning, confidence: 95%

“…The identification of differential item functioning (DIF) may signal that an item is measuring construct-irrelevant factors such as differential familiarity with item types, formats, or vocabulary knowledge for one or more of the comparison groups (Ercikan, Gierl, McCreith, Puhan, & Koh, 2004; Ercikan & Lyons-Thomas, 2013; Oliveri & Ercikan, 2011). Accurate DIF detection is central to making claims regarding whether an item should be used in an assessment or whether modification is required in order to reduce or eliminate construct-irrelevant variance across comparison groups.…”
mentioning, confidence: 98%

“…Items are typically flagged for DIF if response probabilities for examinees at the same ability levels depend on group membership. As different methods for identifying DIF may not give identical results, the use of more than one method is recommended, to allow for the corroboration of DIF status for the items analyzed (Ercikan and McCreith, 2002; Oliveri and Ercikan, 2011). In this research, an IRT-based approach and logistic/ordinal logistic regression approaches were used.…”
Section: Differential Item Functioning Analysis
mentioning, confidence: 99%
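
The excerpt above names the two method families used to corroborate DIF status. As a concrete illustration of the logistic-regression family (the nested-model scheme of Swaminathan and Rogers, 1990), here is a minimal sketch in Python; the simulated responses and the column names total, group, and item are hypothetical and are not drawn from the paper's PISA data.

```python
# Hedged sketch: logistic-regression DIF screening via nested models.
# Assumptions (not from the paper): a 0/1 item response, a matching
# score as the ability proxy (in practice, a rest score excluding the
# studied item), and a 0/1 language-group indicator.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "total": rng.normal(0.0, 1.0, n),   # standardized matching score
    "group": rng.integers(0, 2, n),     # 0/1 language group (illustrative)
})
# Simulate a uniform-DIF item: same slope, group shifts difficulty.
eta = 0.8 * df["total"] - 0.5 * df["group"]
df["item"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

# Nested logits: ability only; + group (uniform DIF);
# + interaction (non-uniform DIF).
m1 = smf.logit("item ~ total", df).fit(disp=0)
m2 = smf.logit("item ~ total + group", df).fit(disp=0)
m3 = smf.logit("item ~ total * group", df).fit(disp=0)

# Likelihood-ratio tests between adjacent models flag each DIF type.
lr_u = 2 * (m2.llf - m1.llf)  # uniform DIF
lr_n = 2 * (m3.llf - m2.llf)  # non-uniform DIF
print(f"uniform DIF:     LR = {lr_u:.2f}, p = {stats.chi2.sf(lr_u, 1):.4f}")
print(f"non-uniform DIF: LR = {lr_n:.2f}, p = {stats.chi2.sf(lr_n, 1):.4f}")
```

Items flagged this way would then be cross-checked against the IRT-based results and inspected substantively (e.g., linguistic review), which is how the corroboration across methods described in the excerpt typically proceeds.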
“…It is incumbent on countries with multiple official languages to take reasonable steps to ensure that linguistic groups are given the opportunity to perceive and respond to tests in the same way (Fairbairn and Fox, 2009; Rogers et al., 2010; Marotta et al., 2015). Yet, research conducted in Canada comparing the French and English versions of LSAs has found that 18–60% of items function differentially for the two groups (Gierl et al., 1999; Gierl, 2000; Ercikan and McCreith, 2002; Ercikan et al., 2004b; Oliveri and Ercikan, 2011; Marotta et al., 2015).…”
mentioning, confidence: 99%

“…To illustrate, reading competencies may develop at different speeds and in different ways across languages with differing alphabets (Spielberger, Moscoso, & Brunner, 2005). Additional factors that might lead to the development of tests with limited comparability include natural variation in difficulty, commonality, or contextual meaning of vocabulary and differential sentence length or complexity (Hambleton, Merenda, & Spielberger, 2005; Oliveri & Ercikan, 2011; Solano-Flores, Backhoff, & Contreras-Niño, 2009). Several guidelines have been developed to address comparability issues arising in test adaptation.…”
mentioning, confidence: 98%