Higher education institutions globally face decreasing government funding and heightened accountability. They are increasingly required to justify the use of public funding for teaching and to provide information for national and international comparisons of quality. High-stakes summative student assessment data have been, and continue to be, critical for meeting both accountability and quality requirements. However, concerns have been raised about the dominance of summative assessment being used for multiple purposes beyond certifying student achievement. For example, this dominance marginalises the role of formative assessment in providing feedback to guide students' future learning. Such concerns have driven assessment reforms. One of these reforms, competency-based assessment (CBA), is promoted as an alternative to high-stakes examinations because it may serve both summative and formative purposes.

In medical education, an example of CBA is the Objective Structured Clinical Examination (OSCE), which has been implemented globally for both summative and formative purposes. The OSCE was originally designed for a small cohort of students (around 100) undertaking highly structured, discrete clinical tasks in a series of timed stations. A major challenge of the OSCE is achieving consistency in examiner judgements of student performance, as the assessment format relies on multiple examiners. In this PhD study, the OSCE under investigation was a high-stakes exit examination for large cohorts of final-year students (over 350) enrolled in a graduate-entry four-year Bachelor of Medicine/Bachelor of Surgery (MBBS) program at one Australian research-intensive university. A question arises as to whether examiners in this context can deliver consistent and reliable judgements. This is the issue addressed in this thesis.
The overarching purpose of this study was to provide new insights into the consistency of examiner judgements in this high-stakes assessment and to explore the possible impact of structured feedback on changing examiner marking behaviour. The four specific aims were to: develop a deeper understanding of the factors that influence examiner judgements of medical students' performance in clinical examinations (OSCEs); evaluate the impact of providing examiners with structured feedback on their subsequent judgement behaviours; explore the factors that affect the effectiveness of structured feedback in changing examiner marking behaviour; and explore examiners' proposed training strategies that may help increase the consistency of their judgements in OSCEs. A mixed-methods case study approach was adopted to collect both quantitative and qualitative data. Quantitative data included the examiners' scores awarded to students in the Year 1 and Year 2 OSCEs. After completing the OSCE in Year 1, the examiners received a structured feedback report about their marking behaviour pr...