This study investigates aspects of validity reflected in a large and diverse sample of published measures used in educational and psychological testing contexts. The current edition of the Mental Measurements Yearbook served as the data source for this study. The validity aspects investigated included the perspective on validity represented, the number and kinds of sources of validity evidence provided, the overall evaluation of the favorability of the test, and whether these factors varied as a function of the type of test. Findings reveal that validity information is not routinely provided in terms of modern validity theory, some sources of validity evidence (e.g., consequential) are essentially ignored in validity reports, and the favorability of judgments about a test is more strongly related to the number of validity sources provided than to the perspective on validity taken or other factors. The article concludes with implications for extending and refining current validity theory and validation practice.
This module describes some common standard‐setting procedures used to derive performance levels for achievement tests in education, licensure, and certification. Upon completing the module, readers will be able to: describe what standard setting is; understand why standard setting is necessary; recognize some of the purposes of standard setting; calculate cut scores using various methods; and identify elements to be considered when evaluating standard‐setting procedures. A self‐test and annotated bibliography are provided at the end of the module. Teaching aids to accompany the module are available through NCME.
A sample of 143 midwestern elementary and secondary school teachers from a variety of practice settings responded to a survey and provided comments regarding their assessment practices. The purpose of the survey was to collect background (demographic) information on the teachers and information on several assessment-related practices, including frequency with which teachers assign routine class assignments, types of marks used to report student performance, frequency and grading of major assignments and tests, source of classroom tests, kinds of marks used, methods used to combine marks, meaning of grades, teachers' knowledge and perceptions regarding district grading policies, and teachers' awareness of the grading policies of their peers. Interviews with the teachers provided additional insights into their practices. Results indicated that teachers' assessment practices were highly variable and unpredictable from characteristics such as practice setting, gender, years of experience, grade level, or familiarity with assessment policies in their school district. Teachers generally claim to consider and incorporate a variety of objective and subjective factors when assigning grades on assignments, assessments, and report cards, synthesizing diverse kinds of information about achievement in ways that tend to maximize the likelihood that students will achieve high grades. Only about one half of the teachers surveyed indicated that they were aware of their districts' policies on grading; most were not aware of the assessment practices of their colleagues. Many teachers seemed to have individual assessment policies that reflected their own individualistic values and beliefs about teaching. Recommendations for making grades more meaningful ways of communicating about student performance are suggested.
Requests for reprints should be sent to Gregory J. Cizek, Department of Educational Psychology, Research, and Foundations, 350 Snyder Hall, University of Toledo, Toledo, OH 43606-3390.
Much of the recent renewed interest in educational assessment has been targeted toward two aspects: (a) large-scale testing and its uses and influences on teaching and learning, and (b) investigations of alternate assessment formats. These concerns are related: They both focus on information gathering. As Airasian (1994) argued, nearly all of the assessment-related activities in which teachers engage can be conceived broadly as information gathering. Airasian defined assessment as "the process of collecting, synthesizing, and interpreting information to aid in [educational] decision-making" (p. 5). In contrast with the recent attention to information gathering, comparatively little attention has been given to information reporting. This component is exemplified by the assigning of grades, marks, or summative evaluations of student performance. The study reported in this article continues a line of research into the marks teachers assign to students' academic performances. S...
The concept of validity has suffered because the term has been used to refer to 2 incompatible concerns: the degree of support for specified interpretations of test scores (i.e., intended score meaning) and the degree of support for specified applications (i.e., intended test uses). This article has 3 purposes: (a) to provide a brief summary of current validity theory, (b) to illustrate the incompatibility of incorporating score meaning and score use into a single concept, and (c) to propose and describe a framework that both accommodates and differentiates validation of test score inferences and justification of test use.