Background: Conceptualising the Borderline candidate is one of the most difficult tasks in standard setting, yet it is central to the process. Here we describe a method for retrospectively calculating the score of Borderline candidates from the Facility of assessment items (the percentage of candidates answering each item correctly) for the cohort as a whole.
Methods: We previously explored the performance of candidates within an academic year in one UK medical school, covering 26 separate assessments. Each assessment had previously been standard set by either Angoff or Borderline Regression methods. In this study, we identified Borderline candidates by reviewing their performance within a particular test that was not part of the previously published material. A student was classed as 'Borderline' if their score was within 1 Standard Error of Measurement above or below the pass cut score. We plotted the item scores of the Borderline candidates identified in this way against Facility for the whole cohort and fitted a curve to the resulting distribution. We also describe a simple method of repeating this process for any cohort of students.
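The steps in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names, the 0/1 response-matrix layout, and the choice of a log-linear least-squares fit are all hypothetical, and the Standard Error of Measurement is taken as a given input because the abstract does not say how it was computed.

```python
import math

def borderline_indices(total_scores, cut_score, sem):
    """Indices of candidates whose total lies within 1 SEM of the cut score."""
    return [i for i, s in enumerate(total_scores) if abs(s - cut_score) <= sem]

def item_facilities(responses, candidates):
    """Percentage of the given candidates answering each item correctly.

    Assumed layout: responses[c][j] is 1 if candidate c got item j right, else 0.
    """
    n_items = len(responses[0])
    return [100.0 * sum(responses[c][j] for c in candidates) / len(candidates)
            for j in range(n_items)]

def fit_exponential(xs, ys):
    """Fit y = C * exp(F * x) by ordinary least squares on ln(y).

    This log-linear fit is an assumption of the sketch; items with y <= 0
    are skipped because their logarithm is undefined.
    """
    pts = [(x, math.log(y)) for x, y in zip(xs, ys) if y > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_ly = sum(ly for _, ly in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in pts)
    F = sxy / sxx
    C = math.exp(mean_ly - F * mean_x)
    return C, F
```

In use, x would be `item_facilities(responses, range(len(responses)))` for the whole cohort and y would be `item_facilities(responses, borderline_indices(...))` for the Borderline group, with `fit_exponential(x, y)` returning the constants C and F.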
Results: For an ideal cohort of candidates, Borderline candidate scores should intercept the self-plot of all candidate scores at two places: at a Facility of 100% and at a Facility of 20%. These correspond to all candidates getting the item correct and all candidates guessing the outcome, respectively. We observed a strong curvilinear relationship between the item scores of Borderline candidates and those of the whole cohort. This relationship was well described by an exponential of the form y ≈ C•exp(F•x), where y is the Facility of Borderline candidates on an item, x is the observed item Facility of the whole cohort, and C and F are constants. In our previous study we found that C and F had similar values under different conditions. With ideal values for C and F of 12.3 and 0.021, the curve intercepts the self-plot of item Facilities very close to 100% and 20%. In this study, we again observed values close to these ideal values: C = 10.06 and F = 0.0231. Differentiating the ideal curve of the difference between all candidates and Borderline candidates gives the item Facility at which discrimination between the cohort and the Borderline candidates is at a maximum, and hence where the assessment ought to be most sensitive. This value is approximately 64.5%.
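The figures in the Results can be checked with a short calculation. The sketch below assumes the difference curve is d(x) = x - C•exp(F•x), that is, cohort Facility minus Borderline Facility. Setting d'(x) = 1 - C•F•exp(F•x) to zero gives x = -ln(C•F)/F, and since d''(x) is negative everywhere this is a maximum. The function names are illustrative only.

```python
import math

def borderline_facility(x, C=12.3, F=0.021):
    """Predicted Borderline Facility (%) for an item of cohort Facility x (%)."""
    return C * math.exp(F * x)

def max_sensitivity_facility(C=12.3, F=0.021):
    """Cohort Facility (%) maximising d(x) = x - C*exp(F*x).

    Solves d'(x) = 1 - C*F*exp(F*x) = 0, giving x = -ln(C*F) / F.
    """
    return -math.log(C * F) / F

if __name__ == "__main__":
    # Ideal constants from the abstract: the curve should lie near the
    # self-plot (y = x) at cohort Facilities of 100% and 20%.
    for x in (100, 20):
        print(x, round(borderline_facility(x), 1))
    # Facility of maximum separation for the ideal and the observed constants.
    print(round(max_sensitivity_facility(), 1))
    print(round(max_sensitivity_facility(C=10.06, F=0.0231), 1))
```

With the ideal constants the maximum falls at roughly 64.5%, in agreement with the value quoted above. The observed constants can be passed in the same way to see how far the fitted curve moves that point.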
Conclusions: This approach can be used to standard-set assessments in their entirety when they are low stakes or norm referenced, in preference to Cohen methods. While Cohen methods are based on the performance of one candidate (or a very small number of candidates), this exponential method is based on all candidates and all items and is therefore more robust. In high stakes assessments, it can be used to correct values where the Facility is very different from the standard-set value, and its use in this context for the UK General Medical Council proposed national exam. It could also be used to stand...