2000
DOI: 10.1016/s0346-251x(99)00059-7

Rater reliability in language assessment: the bug of all bears

Cited by 25 publications (14 citation statements)
References 11 publications
“…Essentially, the rhetorical, ideational, and language foci correspond closely to weighting patterns. In qualitative studies on weighting patterns similar to Cumming (1990), comments elicited from raters are systematically coded and frequencies of different comments are compared to infer the relevant weights attached to different scoring criteria, where a higher frequency is associated with a heavier weight (Gamaroff, 2000; Vaughan, 1991).…”
Section: Raters' Weighting Patterns
Citation type: mentioning
confidence: 99%
“…While early studies (e.g., Cumming, 1990; Gamaroff, 2000; Vaughan, 1991) normally treated raters as a homogeneous group, more recent studies on writing assessment started to address systematic differences between raters in weighting patterns. For example, Erdosy (2004) explored how four raters of the Test of English as a Foreign Language (TOEFL) essays constructed scoring criteria without a scoring rubric, with a focus on the frequencies of two general categories: rhetorical-ideational focus and language focus.…”
Section: Rater Classification
Citation type: mentioning
confidence: 99%
“…One of the most frequent criticisms of studies that attempt to measure writing skills concerns the reliability of the assessments, since language is always subject to interpretation. Apart from the quality of the writing, multiple sources of error can contribute to score variability, among them differences in judgment between raters, ambiguity in the scoring criteria, and variations in the conditions under which scoring takes place [15][16][17].…”
Section: Investigación (Research)
Citation type: unclassified
“…However, fluctuations in scores associated with rater factors are extensive (Huot 1990; Lumley and McNamara 1995; Weigle 1998; Gamaroff 2000; Kondo-Brown 2002; Amengual 2003; 2004). Furthermore, raters are recognised to be one of the main sources of measurement error in assessing a candidate's performance (Milanovic et al 1996; Herrera 2001).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Since trying to reconcile raters' subjectivity with objective precision seems extremely difficult to achieve, rater reliability has been defined as the greatest bugbear in assessment (Moss 1994;Gamaroff 2000).…”
Section: Introduction
Citation type: mentioning
confidence: 99%