1997
DOI: 10.1111/j.1745-3984.1997.tb00512.x

Evaluating an Automatically Scorable, Open‐Ended Response Type for Measuring Mathematical Reasoning in Computer‐Adaptive Tests

Abstract: The first generation of computer‐based tests depends largely on multiple‐choice items and constructed‐response questions that can be scored through literal matches with a key. This study evaluated scoring accuracy and item functioning for an open‐ended response type where correct answers, posed as mathematical expressions, can take many different surface forms. Items were administered to 1,864 participants in field trials of a new admissions test for quantitatively oriented graduate programs. Results showed au…
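
The scoring engine evaluated in the paper is not described in this excerpt, but the underlying idea, accepting an answer in any algebraically equivalent surface form rather than requiring a literal match with a key, can be sketched in a few lines. The sketch below is an illustration only, not the system the study evaluated; it assumes Python with the third-party SymPy library, and the function name score_expression is hypothetical.

from sympy import simplify, sympify
from sympy.core.sympify import SympifyError

def score_expression(response: str, key: str) -> bool:
    # Score the response as correct if it is mathematically equivalent
    # to the keyed answer, regardless of how it is written.
    try:
        difference = sympify(response) - sympify(key)
    except SympifyError:
        return False  # unparsable input is scored as incorrect
    return simplify(difference) == 0

# Different surface forms of the same answer all score as correct:
print(score_expression("2*(x + 3)", "2*x + 6"))         # True
print(score_expression("(x - 1)*(x + 1)", "x**2 - 1"))  # True
print(score_expression("x + 2", "2*x"))                 # False

Checking that the simplified difference is zero sidesteps enumerating every acceptable surface form of the key, which is the practical obstacle that literal key matching runs into.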

Cited by 45 publications (26 citation statements)
References 7 publications
“…The inherent variability of open-ended solutions, and lack of defined evaluation criteria for design makes automatically assessing open-ended work challenging (Bennett et al 1997). In addition, automated systems frequently cannot capture the semantic meaning of answers, which limits the feedback that they can provide to help students improve (Bennett 1998;Hearst 2000).…”
Section: The Promise of Peer Assessment
Citation type: mentioning · Confidence: 99%
“…Appearing in this decade were ETS's first attempts at automated scoring, including of computer science subroutines (Braun et al 1990), architectural designs (Bejar 1991), mathematical step-by-step solutions and expressions (Bennett et al 1997;Sebrechts et al 1991), short-text responses (Kaplan 1992), and essays (Kaplan et al 1995). By the middle of the decade, the work on scoring architectural designs had been implemented operationally as part of the National Council of Architectural Registration Board's Architect Registration Examination (Bejar and Braun 1999).…”
Section: Constructed-Response Formats and Performance Assessment
Citation type: mentioning · Confidence: 99%
“…The scoring accuracy for constructed-response items is generally lower than for multiple-choice items, but some in mathematics can be scored quite accurately, even compared to multiple-choice. For example, Bennett, Steffen, Singley, Morley, and Jacquemin (1997) found very high accuracy rates for the mathematical expressions (ME) response type when users entered expressions on the computer.…”
Section: Item Type Considerations
Citation type: mentioning · Confidence: 99%