Brian Gong scite author profile

Local assessment systems are being marketed as formative, benchmark, predictive, and a host of other terms. Many so‐called formative assessments are not at all similar to the types of assessments and strategies studied by Black and Wiliam (1998) but instead are interim assessments. In this article, we clarify the definition and uses of interim assessments and argue that they can be an important piece of a comprehensive assessment system that includes formative, interim, and summative assessments. Interim assessments are given on a larger scale than formative assessments, have less flexibility, and are aggregated to the school or district level to help inform policy. Interim assessments are driven by their purposes, which fall into three categories: instructional, evaluative, or predictive. Our intent is to provide a specific definition for these “interim assessments” and to develop a framework that district and state leaders can use to evaluate these systems for purchase or development. The discussion lays out some concerns with the current state of these assessments as well as hopes for future directions and suggestions for further research.


Students' perceptions and designs of simple control systems

Mioduser, Venezky, Gong (1996). Computers in Human Behavior


Alternate Assessments as One Measure of Teacher Effectiveness

Kearns, Kleinert, Thurlow, et al. (2015). Research and Practice for Persons with Severe Disabilities


Elementary and Secondary Education Act (ESEA) flexibility requires states to develop and implement teacher effectiveness measures that consider student assessment results, including assessment results for students with disabilities participating in general and alternate assessments. We describe how alternate assessment results for students with significant cognitive disabilities could appropriately be used in teacher effectiveness measures. In addition, we discuss the unique parameters faced by teachers serving students with significant cognitive disabilities that may warrant a multiple measures approach to evaluating teacher effectiveness. Using one of the two national initiatives presently developing alternate assessments based on the Common Core State Standards as an example, we describe how these new assessments might be applied in measuring teacher effectiveness. Finally, we offer implications for both policy makers and practitioners in measuring teacher effectiveness for teachers serving students with significant cognitive disabilities participating in alternate assessments.


Are the Standards for Educational and Psychological Testing Relevant to State and Local Assessment Programs?

Diaz-Bilello, Patelis, Marion, et al. (2014). Educational Measurement


Agreement Between Expert System and Human Ratings of Constructed‐responses to Computer Science Problems

Bennett, Gong, Kershaw, et al. (1988). ETS Research Report Series


If computers can be programmed to score complex constructed response items, substantial savings in selected ETS programs might be realized and the development of mastery assessment systems that incorporate “real‐world” tasks might be facilitated. This study investigated the extent of agreement between MicroPROUST, a prototype microcomputer‐based expert scoring system, and human readers for two Advanced Placement Computer Science free‐response items. To assess agreement, a balanced incomplete block design was used with two groups of four readers grading 43 student solutions to the first problem and 45 solutions to the second. Readers assigned numeric grades and diagnostic comments in separate readings. Results showed MicroPROUST to be unable to grade a significant portion of solutions, but to perform impressively on those solutions it could analyze. For one problem, MicroPROUST assigned grades and diagnostic comments similar to those assigned by readers. For the other problem, MicroPROUST's agreement with readers on grades was lower than the agreement of readers among themselves, its grades were higher, and it gave fewer comments, particularly on structure and style. The extent of disagreement on grades, however, was small and much of the disagreement disappeared when papers were rescored discounting style. MicroPROUST's interchangeability with human readers on one problem suggests that there are conditions under which automated scoring of complex constructed‐responses might be implemented by ETS.



Moving Toward a Comprehensive Assessment System: A Framework for Considering Interim Assessments
