2019
DOI: 10.1111/jedm.12228

Scoring Stability in a Large‐Scale Assessment Program: A Longitudinal Analysis of Leniency/Severity Effects

Abstract: Although much attention has been given to rater effects in rater‐mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large‐scale, multi‐state summative assessment program. Multilevel models were used to assess the overall extent of rater leniency/severity during scoring and examine the extent to which leniency/severity effects were stable…

Cited by 1 publication (2 citation statements)
References 21 publications
“…rater drift, in which raters deviate from their application of a rubric over time (see Lottridge et al., 2013; Palermo et al., 2019)…”
Section: Benefits and Challenges of Hand-Scoring
confidence: 99%
“…However, hand-scoring threatens reliability, even with well-defined and rigorous rater-training procedures (Bridgeman, 2013; Wind & Walker, 2019). These reliability threats include:

- halo effect, in which a single dimension of writing quality (e.g., conventions) improperly sways a rater's judgment (see A. C. Johnson et al., 2017)
- rater leniency and severity, in which a human rater systematically assigns higher or lower scores than warranted (see Wind, 2018, 2020)
- narrowing of the scoring range (i.e., rater centrality), in which raters overly rely on the middle categories of the scoring range (see Wind, 2018, 2020)
- rater drift, in which raters deviate from their application of a rubric over time (see Lottridge et al., 2013; Palermo et al., 2019)

One reason for rater effects is the format and subsequent interpretation of the scoring rubric (Lottridge et al., 2013). Rubrics that require raters to consider and weight many different aspects of writing quality are subject to inconsistent interpretation and application.…”
Section: Benefits and Challenges of Hand-Scoring
confidence: 99%
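The leniency/severity effect quoted above can be made concrete with a small simulation. The sketch below is a minimal illustration, not the study's method: it uses a simple mean-signed-deviation index in plain Python rather than the multilevel models the paper fits, and all rater names, offsets, and scale values are hypothetical. Each simulated rater applies a fixed leniency (+) or severity (−) offset plus noise on a 1–6 rubric scale, and the index recovers roughly that offset per rater.

```python
import random

def simulate_scores(true_scores, offsets, noise_sd=0.5, seed=0):
    """Simulate rater-assigned scores: each rater adds a fixed
    leniency(+)/severity(-) offset plus Gaussian noise, rounded and
    clipped to a 1-6 rubric scale."""
    rng = random.Random(seed)
    scores = {}
    for rater, offset in offsets.items():
        scores[rater] = [
            min(6, max(1, round(t + offset + rng.gauss(0, noise_sd))))
            for t in true_scores
        ]
    return scores

def severity_index(scores, true_scores):
    """Mean signed deviation from the true score for each rater:
    positive = lenient, negative = severe."""
    n = len(true_scores)
    return {
        rater: sum(s - t for s, t in zip(assigned, true_scores)) / n
        for rater, assigned in scores.items()
    }

# Hypothetical example: 200 essays, three raters with known offsets.
rng = random.Random(1)
true_scores = [rng.choice([2, 3, 3, 4, 4, 5]) for _ in range(200)]
offsets = {"lenient": +1.0, "neutral": 0.0, "severe": -1.0}
scores = simulate_scores(true_scores, offsets)
index = severity_index(scores, true_scores)
```

With enough essays, the recovered index orders the raters by their true offsets; a production analysis would instead model rater as a random effect (e.g., a crossed multilevel model), which also handles unbalanced rater-by-essay designs.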