2019
DOI: 10.1111/jedm.12228

Scoring Stability in a Large‐Scale Assessment Program: A Longitudinal Analysis of Leniency/Severity Effects

Abstract: Although much attention has been given to rater effects in rater‐mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large‐scale, multi‐state summative assessment program. Multilevel models were used to assess the overall extent of rater leniency/severity during scoring and examine the extent to which leniency/severity effects were stable…

Cited by 1 publication (2 citation statements)
References 21 publications
“…rater drift, in which raters deviate from their application of a rubric over time (see Lottridge et al., 2013; Palermo et al., 2019)…”
Section: Benefits and Challenges of Hand-Scoring
confidence: 99%
“…However, hand-scoring threatens reliability, even with well-defined and rigorous rater-training procedures (Bridgeman, 2013; Wind & Walker, 2019). These reliability threats include:

- halo effect, in which a single dimension of writing quality (e.g., conventions) improperly sways a rater's judgment (see A. C. Johnson et al., 2017)
- rater leniency and severity, in which a human rater systematically assigns higher or lower scores than warranted (see Wind, 2018, 2020)
- narrowing of the scoring range (i.e., rater centrality), in which raters overly rely on the middle categories of the scoring range (see Wind, 2018, 2020)
- rater drift, in which raters deviate from their application of a rubric over time (see Lottridge et al., 2013; Palermo et al., 2019)

One reason for rater effects is the format and subsequent interpretation of the scoring rubric (Lottridge et al., 2013). Rubrics that require raters to consider and weight many different aspects of writing quality are subject to inconsistent interpretation and application.…”
Section: Benefits and Challenges of Hand-Scoring
confidence: 99%
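The leniency/severity effect quoted above can be made concrete with a small simulation. The sketch below is a minimal illustration, not the study's method: it uses a simple mean-signed-deviation index in plain Python rather than the multilevel models the paper fits, and all rater names, offsets, and scale values are hypothetical. Each simulated rater applies a fixed leniency (+) or severity (−) offset plus noise on a 1–6 rubric scale, and the index recovers roughly that offset per rater.

```python
import random

def simulate_scores(true_scores, offsets, noise_sd=0.5, seed=0):
    """Simulate rater-assigned scores: each rater adds a fixed
    leniency(+)/severity(-) offset plus Gaussian noise, rounded and
    clipped to a 1-6 rubric scale."""
    rng = random.Random(seed)
    scores = {}
    for rater, offset in offsets.items():
        scores[rater] = [
            min(6, max(1, round(t + offset + rng.gauss(0, noise_sd))))
            for t in true_scores
        ]
    return scores

def severity_index(scores, true_scores):
    """Mean signed deviation from the true score for each rater:
    positive = lenient, negative = severe."""
    n = len(true_scores)
    return {
        rater: sum(s - t for s, t in zip(assigned, true_scores)) / n
        for rater, assigned in scores.items()
    }

# Hypothetical example: 200 essays, three raters with known offsets.
rng = random.Random(1)
true_scores = [rng.choice([2, 3, 3, 4, 4, 5]) for _ in range(200)]
offsets = {"lenient": +1.0, "neutral": 0.0, "severe": -1.0}
scores = simulate_scores(true_scores, offsets)
index = severity_index(scores, true_scores)
```

With enough essays, the recovered index orders the raters by their true offsets; a production analysis would instead model rater as a random effect (e.g., a crossed multilevel model), which also handles unbalanced rater-by-essay designs.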