“…However, hand-scoring threatens reliability, even with well-defined and rigorous rater-training procedures (Bridgeman, 2013; Wind & Walker, 2019). These reliability threats include: the halo effect, in which a single dimension of writing quality (e.g., conventions) improperly sways a rater's judgment (see A. C. Johnson et al., 2017); rater leniency and severity, in which a human rater systematically assigns higher or lower scores than warranted (see Wind, 2018, 2020); narrowing of the scoring range (i.e., rater centrality), in which raters overly rely on the middle categories of the scoring range (see Wind, 2018, 2020); and rater drift, in which raters deviate from their application of a rubric over time (see Lottridge et al., 2013; Palermo et al., 2019). One reason for rater effects is the format and subsequent interpretation of the scoring rubric (Lottridge et al., 2013). Rubrics that require raters to consider and weight many different aspects of writing quality are subject to inconsistent interpretation and application.…”