Increasingly, academic institutions are required to improve the validity of the assessment process; unfortunately, this is often at the expense of reliability. In medical schools (such as Leeds), standardized tests of clinical skills, such as Objective Structured Clinical Examinations (OSCEs), are widely used to assess clinical competence at both the undergraduate and postgraduate levels. However, the issue of setting the pass mark, or passing standard, for such examinations remains contentious. The arrangements for particular OSCE assessment activities usually involve many different assessors and have practical aspects that cannot be exactly duplicated across the cohort. These complexities therefore raise issues with respect to the robustness of the comparative student grading mechanism.

This article addresses one aspect affecting the reliability of assessment, namely the effects of assessor training on the awarding of student marks. It also investigates gender interactions between assessors, trained and untrained, and students. The findings, which are based on a detailed analysis of final-year OSCE marks, indicate that untrained assessors award higher marks than trained assessors, and that a gender interaction exists; more specifically, the use of untrained assessors tends to benefit female students over male students.

The tension between reliability and validity has been particularly important in the field of medical education for a number of years, with medical students in the latter stages of their courses often being required to demonstrate competence in a variety of simulated clinical activities with different patients, in front of different assessors, in different hospitals and on different days. The complex nature of the OSCE arrangements raises serious questions as to the robustness of setting the pass/fail boundary and, to a lesser extent, the honours boundaries.
Overview
What is already known on this subject
There are differing opinions as to the effects of assessor training on the quality of judgements in the area of criterion-based assessment: either that it makes no overall difference to the degree of severity in the marking, or that it leads to increasing levels of stringency.
What this study adds
This is a large study based on 207 students and 108 assessors. It quantifies the overall effects of assessor training with reference to the potential impact on student pass rates; it