There has been a long-standing complaint about civil service test results. Too often, highly regarded candidates do not score well enough to be reached for appointment, and poorly regarded candidates do score well and block the appointments of the best candidates. Efforts have been made in recent years to expand selection beyond the "rule of three" and to provide for more flexible lateral movement. These efforts, however, do not address the underlying need for fixing the product.

This paper discusses the use of job performance assessment as a component of civil service promotion examinations. It will demonstrate how performance assessment can be used to assess the quality of existing tests and to provide information on whether changes in tests improve the results, and if so, by how much. This results-oriented methodology is at the forefront of measuring the performance of civil service examinations. More importantly, this paper will show that, when used as an examination component, performance assessments dramatically increase the validity and utility of civil service examinations. Moreover, annual performance evaluations are too often not completed or are simply pro forma; performance assessment provides assurance that careful, objective assessment of staff's job performance is conducted periodically, sending the message that job performance matters.

This paper will also make the case that testing for competencies (knowledge, skills and abilities) is not sufficient for predicting how well candidates will perform when promoted. Two additional factors must also be incorporated into the selection plan: work behaviors and work results. Applying the concept of "merit and fitness," this paper proposes that testing for competencies meets the requirement for considering "fitness," but "meritorious service" can only be assessed by measuring work behaviors and work results via performance assessment. Although expensive in terms of the staff time and effort required to complete the examination process, performance assessments are worth the added expense because of the greater value of the final product.

Rather than questioning the quality of existing civil service competency tests, this paper contends that these examinations are better than traditional measurements suggest. But even the best competency tests are insufficient to get the job done. Performance assessment must also be included in the selection plan to get the best results. Rather than marking the end of competency testing, this paper identifies …
There has been an extensive debate within the human resources industry over the merits of oral tests versus assembled, multiple-choice written tests. Written tests, claim their proponents, are more objective, more reliable, and better able to rank people on their competencies. For larger candidate populations, written tests are also much less expensive to administer. Oral test proponents contend that, whatever their shortcomings, one can learn a great deal about candidates' communication skills, interpersonal skills, and ability to discuss a topic in depth during face-to-face discussions. For smaller candidate populations, oral tests are much closer in cost to written tests, and they may even be somewhat less expensive to administer and score.

This article examines the inter-rater reliability of thirteen separate oral tests administered in 1996 by the New York State Department of Civil Service using statistical-based rating sheets (SBRS) as the measurement instrument. The article first seeks to determine whether oral test inter-rater reliability can be effectively measured with little interruption to the oral test process; it next examines the degree to which oral tests have inter-rater reliability, and hence a claim to objectivity. It then looks at the effects of examiner collaboration on the initial or original judgments of the examiners.

BACKGROUND

Inter-Rater Reliability in Oral Tests

Reliability is a term used in testing to describe the consistency of test results. While a test may have very favorable results once, what is the probability that these results will be repeated the next time the test is held? There are three types of reliability: inter-rater reliability (To what extent are the raters in agreement on the candidates' test performance?), the internal reliability of the measurement instrument (How effective is the measurement instrument in capturing useful information about different candidates' strengths and weaknesses?), and the reliability of the test materials (How likely are two candidates to produce the same results if they have essentially the same attributes? How likely is the same candidate to produce the same test results during a subsequent holding?). This article focuses on inter-rater reliability, but also provides statistical data on the internal reliability of the measurement instrument. Unless significantly flawed, a measurement instrument with more items will generally produce a higher internal reliability coefficient than one with fewer items.

Kane (1992) discusses three levels of assessment: multiple-choice items, simulations (e.g., oral tests), and actual practice (work) situations. He weighed the merits of each in terms of three inferences. The first inference is evaluation of the examination results: are there correct answers, or will different raters disagree on a candidate's performance? The second inference is generalization of the examination results. Since the examination represents just a sampl...
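To make the notion of an inter-rater reliability coefficient concrete, the sketch below computes a Pearson correlation between two examiners' independent scores for the same candidates. This is only an illustration: the candidate scores are invented, and a Pearson correlation is just one of several statistics (intraclass correlation, Cohen's kappa, and others) that can be used for this purpose; it is not necessarily the measure used in the study described here.

```python
# Illustrative sketch: inter-rater reliability expressed as the correlation
# between two examiners' independent scores for the same candidates.
# The ratings below are hypothetical example data, not results from the study.

from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical ratings (0-100 scale) assigned independently by two examiners
# to the same eight candidates.
rater_1 = [72, 85, 90, 65, 78, 88, 70, 95]
rater_2 = [70, 88, 92, 60, 80, 85, 74, 93]

r = pearson_r(rater_1, rater_2)
print(f"Inter-rater reliability (Pearson r): {r:.2f}")
# Values near 1.0 indicate the two raters rank and score candidates very
# similarly; values near 0 indicate little agreement between them.
```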