“…The research on factors that influence judges' ratings during training and the stages of standard setting deals almost exclusively with item judgments and defining minimal competence (Melican & Mills, 1986, 1987; Mills, Melican, & Ahluwalia, 1991; Norcini, Shea, & Kanya, 1988; Plake, Melican, & Mills, 1991; Pulakos et al., 1989; Smith & Smith, 1988). Despite the limited focus of these studies, the criteria proposed by Reid (1991) for evaluating training effectiveness based on their findings can be generalized to the more recent standard-setting methods: (a) Judgments should be stable over time, (b) judgments should be consistent with item and test score performance, and (c) judgments should reflect realistic expectations.…”