Although each of the perspectives discussed in this paper advances our understanding of assessor cognition and its impact on WBA, each has its limitations. Following a discussion of areas of concordance and discordance across the perspectives, we propose a coexistent view in which researchers and practitioners draw on aspects of all three perspectives with the goal of advancing assessment quality and ultimately improving patient care.
Assessors' scores in performance assessments are known to be highly variable. Attempts to improve them through training or changes to rating format have achieved minimal gains, and the mechanisms that contribute to variability in assessors' scoring remain unclear. This study investigated these mechanisms. We used a qualitative approach to study assessors' judgements whilst they observed the same simulated videoed performances of junior doctors obtaining clinical histories. Assessors commented concurrently and retrospectively on the performances, provided scores, and took part in follow-up interviews. Data were analysed using principles of grounded theory. We developed three themes that help to explain how variability arises: Differential Salience (assessors attended to, or valued, different aspects of the performances to different degrees); Criterion Uncertainty (assessors' criteria were differently constructed, uncertain, and influenced by recently seen exemplars); and Information Integration (assessors described the valence of their comments in their own unique narrative terms, usually forming global impressions). Our results, whilst not precluding the operation of established biases, describe mechanisms by which assessors' judgements become meaningfully different or unique. They have theoretical relevance to understanding the formative educational messages that performance assessments provide, and they give insight relevant to assessor training, to assessors' ability to be observationally "objective", and to the educational value of narrative comments (in contrast to numerical ratings).
CONTEXT A recent study has suggested that assessors judge performance comparatively rather than against fixed standards: ratings assigned to borderline trainees were found to be biased by previously seen candidates' performances. We extended that programme of investigation by examining these effects across a range of performance levels. We also investigated whether confidence in the rating assigned predicts susceptibility to manipulation, and whether prompting consideration of typical performance lessens the influence of recent experience. DISCUSSION Assessors' judgements showed contrast effects at both good and borderline performance levels. The findings suggest that assessors use normative rather than criterion-referenced decision making while judging, and that the norms referenced are weakly represented in memory and easily influenced. Confidence ratings suggested a lack of insight into this phenomenon. Raters' judgements can therefore be substantially influenced in ways that are unfair to candidates.