REVIEW OF EDUCATIONAL RESEARCH

In the dozen years since Glaser's (1963) seminal article on criterion-referenced testing, the acceptance of the concept of mastery as an educational and, hence, evaluation goal has grown tremendously. A large number of articles have been published, curriculum programs have been devised that employ criterion-referenced testing, and yet writers still feel it necessary to define what a criterion-referenced test is. Furthermore, the various published definitions are by no means equivalent. One also observes a shift in the interests and background of the authors of papers over this period. In the Sixties, writers were primarily advocating the adoption of criterion-referenced testing from an educational or philosophical point of view in spite of the reservations of the classical measurement theorists, whereas in the Seventies a new generation of measurement specialists has begun to be involved, and the papers are much more mathematical. A number of mathematically-based techniques for deciding on cutting scores and related issues such as test length have been published (for a particularly good review see Millman, 1973), but the authors' conceptualizations of the educational task facing the learners have not always been clear. The purpose of this paper is to investigate the mastery models underpinning the techniques being proposed and to review the procedures suggested for setting the pass-fail point.

The adoption of the criterion-referenced approach to evaluation very quickly raises two measurement issues that have relatively less importance in norm-referenced testing. These can be broadly stated as the issue of the definition of mastery, and the issue of a priori standards: closely intertwined but different problems. Rigorous exploration of these to date has been quite minimal. Perhaps the development of content aspects has been of greater urgency. However, this area is receiving increasing attention. Measurement specialists have turned their attention to criterion-referenced measurement, introducing the use of decision theory and Bayesian statistics.

The evolving models are alike in requiring tight specification of content areas. Objectives are to be written in sufficient detail so that the form and content of the measurements are implicit in the educational objective...

The comments of Professor Frederick B. Davis, University of Pennsylvania Graduate School of Education, and Dean D. Dax Taylor, Southern Illinois University School of Medicine, on an earlier draft of this paper are gratefully acknowledged.
On 26 October 1974, 3356 diplomates of the American Board of Internal Medicine (ABIM) took a 1-day written examination for recertification consisting of multiple-choice, matching, and true-false questions derived from the American College of Physicians' Medical Knowledge Self-Assessment Program III and the ABIM Certifying Examination pool. The passing score was set by using a normative standard applied to a reference group of internists practicing general internal medicine who had had 2 or more years of residency training completed between the years 1949 and 1958. The passing score represented approximately 63% correct answers. The failure rate for the total number of examinees was 4.3%. Mean score of examinees showed an inverse relation with age but relatively slight differences when analyzed according to the degree of subspecialization, practice setting, hospital affiliation, or size of patient community.
This study compares physician performance on the Computer-Aided Simulation of the Clinical Encounter (CASE) with peer ratings and performance on multiple choice questions (MCQs) and patient management problems (PMPs). CASE is a simulation of the clinical encounter in which the computer plays the role of the patient and the physician elicits information by entering "natural language" questions into a computer terminal. Results indicate that all formats are equally valid, although MCQs are the most reliable method of assessment per unit of testing time, followed by PMPs and CASE, in that order. All methods measure the same or very highly correlated aspects of competence.
We investigated the performance of two groups of graduates of foreign medical schools on the 1975 and 1976 certification examinations of the American Board of Internal Medicine. Nearly all their postdoctoral residency training was obtained in the United States. The performance of the alien graduates (most of those in this study were born in Asia and Southeast Asia) was much lower than that of graduates of United States medical schools. United States citizens who studied medicine abroad performed no better than alien graduates from foreign medical schools. Approximately half the foreign graduates born in the United States studied in Italy, and 10% in Switzerland, Mexico and Belgium. There were no significant differences in performance associated with the type of postdoctoral training (university, university-affiliated, community or other) undertaken in the United States. A significant inverse relation was observed between the interval from completion of training to first examination and the examination performance.