2014
DOI: 10.1007/s10459-014-9547-z

Examiners and content and site: Oh My! A national organization’s investigation of score variation in large-scale performance assessments

Abstract: Examiner effects and content specificity are two well-known sources of construct-irrelevant variance that present great challenges in performance-based assessments. National medical organizations that are responsible for large-scale performance-based assessments face an additional challenge, as they are responsible for administering qualification examinations to physician candidates at several locations and institutions. This study explores the impact of site location as a source of score variation in a l…

Cited by 26 publications (26 citation statements)
References 22 publications (34 reference statements)
“…Moreover, adjusting for examiner variance altered the pass/fail decisions of 11% of students in their study. Other studies have suggested that examiners' scores may be biased depending on the timing of OSCEs (Hope & Cameron, 2015;McLaughlin, Ainslie, Coderre, Wright, & Violato, 2009), the performance of other candidates (Yeates, Moreau, & Eva, 2015) or by different geographical locations (Sebok, Roy, Klinger, & De Champlain, 2015). Consequently, it is not sufficient to simply conduct an OSCE, and believe that the resulting scores are a fair representation of students' performance given the known influences of construct irrelevant variance.…”
Section: Introduction
confidence: 99%
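
The 11% figure above illustrates why examiner adjustment matters in practice. As a minimal sketch of the underlying idea (not the cited studies' actual method, and using entirely hypothetical data and cut score), each examiner's stringency can be estimated as the deviation of their mean awarded score from the grand mean, removed from their scores, and the pass/fail decisions re-checked:

```python
# A minimal sketch (not the cited studies' actual method): a simple linear
# adjustment for examiner stringency, applied to hypothetical OSCE scores.
import numpy as np

rng = np.random.default_rng(0)

n_examiners, n_students = 10, 200
stringency = rng.normal(0, 5, n_examiners)            # hawk/dove effects, in score points
ability = rng.normal(70, 8, n_students)               # "true" student performance
examiner_of = rng.integers(0, n_examiners, n_students)
raw = ability + stringency[examiner_of] + rng.normal(0, 3, n_students)

# Estimate each examiner's stringency as the deviation of their mean
# awarded score from the grand mean, then remove it from their scores.
# (With non-random allocation this confounds examiner and cohort effects.)
grand_mean = raw.mean()
est = np.zeros(n_examiners)
for j in range(n_examiners):
    mask = examiner_of == j
    if mask.any():                                    # skip examiners with no students
        est[j] = raw[mask].mean() - grand_mean
adjusted = raw - est[examiner_of]

cut = 65.0                                            # hypothetical cut score
flipped = np.mean((raw >= cut) != (adjusted >= cut))
print(f"Pass/fail decision changed for {flipped:.1%} of students")
```

With random allocation of students to examiners this crude estimate is serviceable; with non-random allocation it confounds examiner stringency with the ability of the students an examiner happened to see, which is why the cited work relies on more careful modelling.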
“…Although examiner differences exist and can contribute to construct-irrelevant variance within an examination, their variability as a group remains relatively consistent over time. 23 Based on the generalizability study, the comparatively similar estimates of the error variances due to examiner effects between the two years led us to believe that no systematic differences in examiners existed across cohorts.…”
Section: Discussion
confidence: 98%
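
The comparison above rests on a generalizability (G) study, which partitions observed-score variance into components for persons, examiners, and residual error. A minimal sketch for a fully crossed person-by-examiner design, using simulated (hypothetical) scores and classical expected-mean-square solutions, could look like this:

```python
# A minimal G-study sketch for a fully crossed person x examiner design,
# on simulated (hypothetical) scores; real examination designs are often
# nested and analysed with dedicated G-theory software.
import numpy as np

rng = np.random.default_rng(1)
n_p, n_r = 30, 6                                  # candidates x examiners
X = (rng.normal(0, 2.0, (n_p, 1))                 # person effects
     + rng.normal(0, 1.0, (1, n_r))               # examiner effects
     + rng.normal(0, 1.5, (n_p, n_r)))            # interaction/residual

grand = X.mean()
p_mean = X.mean(axis=1)
r_mean = X.mean(axis=0)

ms_p = n_r * np.sum((p_mean - grand) ** 2) / (n_p - 1)
ms_r = n_p * np.sum((r_mean - grand) ** 2) / (n_r - 1)
ms_e = (np.sum((X - p_mean[:, None] - r_mean[None, :] + grand) ** 2)
        / ((n_p - 1) * (n_r - 1)))

# Expected-mean-square solutions for the variance components
var_e = ms_e                                      # sigma^2(pr,e)
var_p = max((ms_p - ms_e) / n_r, 0.0)             # sigma^2(p)
var_r = max((ms_r - ms_e) / n_p, 0.0)             # sigma^2(r): examiner effect

# Relative G coefficient for a mean over n_r examiners
g_rel = var_p / (var_p + var_e / n_r)
print(f"var(p)={var_p:.2f} var(r)={var_r:.2f} var(res)={var_e:.2f} G={g_rel:.2f}")
```

Running the same decomposition on each cohort's data and comparing the examiner component (var_r) across years is the kind of check the quoted passage describes.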
“…Examples include use of item fit analysis to test for variability in item difficulty across schools or styles of test administration [29,30] and for evidence of unidimensionality in the scores forthcoming from a test of clinical competence. [31] Further examples include testing for rater leniency or harshness as determinants of student performance.…”
Section: Rasch Analysis In Contemporary Medical Education
confidence: 99%
“…[31] Further examples include testing for rater leniency or harshness as determinants of student performance. [32,33] In some of the above cases, [30] …”
Section: Rasch Analysis In Contemporary Medical Education
confidence: 99%
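
To make the Rasch machinery in the passages above concrete: a dichotomous Rasch model places persons and items on a common logit scale, and information-weighted (infit) statistics flag items whose responses depart from model expectations; a rater facet (many-facet Rasch) extends the same logit to examiner leniency or harshness. A minimal joint-maximum-likelihood sketch on simulated (hypothetical) responses, not a production analysis, follows:

```python
# A minimal dichotomous Rasch sketch via joint maximum likelihood on
# simulated (hypothetical) responses, with per-item infit statistics.
# A rater facet (many-facet Rasch, for leniency/harshness) would add a
# third term to the logit; dedicated software is used in practice.
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 200, 12
theta_true = rng.normal(0, 1, n_persons)          # person abilities (logits)
beta_true = rng.normal(0, 1, n_items)             # item difficulties (logits)
prob = 1 / (1 + np.exp(-(theta_true[:, None] - beta_true[None, :])))
X = (rng.random((n_persons, n_items)) < prob).astype(float)

# JML cannot estimate perfect or zero scores; drop them.
keep = (X.sum(axis=1) > 0) & (X.sum(axis=1) < n_items)
X = X[keep]

theta = np.zeros(X.shape[0])
beta = np.zeros(n_items)
for _ in range(100):                              # alternating Newton steps
    E = 1 / (1 + np.exp(-(theta[:, None] - beta[None, :])))
    W = E * (1 - E)
    theta += (X - E).sum(axis=1) / W.sum(axis=1)
    E = 1 / (1 + np.exp(-(theta[:, None] - beta[None, :])))
    W = E * (1 - E)
    beta -= (X - E).sum(axis=0) / W.sum(axis=0)
    beta -= beta.mean()                           # fix the scale origin

# Infit mean-square: information-weighted squared residuals per item;
# values far from 1 flag misfit (a common screen is roughly 0.7-1.3).
E = 1 / (1 + np.exp(-(theta[:, None] - beta[None, :])))
W = E * (1 - E)
infit = ((X - E) ** 2).sum(axis=0) / W.sum(axis=0)
print("item infit MSQ:", np.round(infit, 2))
```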