Three open-ended response types that could broaden the conception of mathematical problem solving used in computerized admissions tests are described: mathematical expression (ME), generating examples (GE), and graphical modeling (GM). ME presents single-best-answer problems that call for an algebraic formalism; a correct rendition can take an infinite number of surface forms. GE presents loosely structured problems that can have many good answers, which may take the form of a value, letter pattern, expression, equation, or list. GM asks the examinee to represent a given situation by plotting points on a grid; these items can have a single best answer or multiple correct answers. For each of the three basic types, sample items are provided, the examinee interfaces and approaches to automated scoring are described, and research results are reported. It is illustrated how ME, GE, and GM can be combined to form extended constructed-response problems, and a description is offered of how item classes might be used as a basis for creating production-ready scoring keys.

Index terms: automated scoring, computer-based testing, constructed response, mathematics performance assessment.

The traditional paper-and-pencil (P&P), multiple-choice (MC) item format consists of a static stimulus followed by a series of response options. This format has served testing programs well for many years because its compactness allows for great breadth of coverage: many items can be administered in a short period. It is also cost efficient because it can be machine scored.

However, this traditional format cannot effectively measure some constructs, in particular those that require either a dynamic stimulus (e.g., listening comprehension) or a complex response (e.g., writing an essay, composing a computer program, producing a building design). To handle constructs requiring dynamic stimuli, large-scale testing programs have typically combined P&P with video or audio tape, producing a serviceable but expensive and administratively cumbersome assessment. To accommodate constructs calling for complex responses, testing programs have increasingly employed performance tasks, which can also be uneconomical because of the need for human scoring.

Computerized testing has brought with it the potential for "new" assessment tasks. Some of these tasks might be more efficiently delivered versions of tasks used in traditional testing programs; others might be truly new, in that they measure constructs that could not be measured by P&P MC tests.

These new tasks can be divided into three classes. In the first class, the stimulus is dynamic. An operational example is the Listening section of the Test of English as a Foreign Language (Educational Testing Service, 1999). Digitally recorded audio and context-setting photos are presented, followed by MC questions. One advantage of this digital presentation is the consistency in quality with which the same audio stimulus can be delivered from one examinee to the next.
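To make concrete the scoring challenge that ME items pose (a correct algebraic response can take an infinite number of surface forms), the sketch below tests whether an examinee's expression is equivalent to a key by simplifying their difference. This is a minimal illustration assuming Python's sympy library; it is not the scoring approach described in this paper, and the function name score_me_response is hypothetical.

```python
# Illustrative sketch: scoring an ME response by symbolic equivalence.
# Assumes the sympy library; this is not the paper's scoring engine,
# only one way the surface-form problem can be handled.
from sympy import simplify
from sympy.parsing.sympy_parser import parse_expr

def score_me_response(response: str, key: str) -> bool:
    """Return True if the examinee's expression is algebraically
    equivalent to the key, regardless of surface form."""
    try:
        diff = parse_expr(response) - parse_expr(key)
    except Exception:
        return False  # an unparseable response scores as incorrect
    return simplify(diff) == 0

# Equivalent surface forms of the same expression all score correct:
print(score_me_response("(x + 1)**2", "x**2 + 2*x + 1"))  # True
print(score_me_response("2*(x + 1)", "x**2 + 2*x + 1"))   # False
```

A production-ready scoring key would presumably need more than bare equivalence checking, for example rejecting responses that merely restate the problem or that violate a required form; the item classes discussed later in the paper bear on how such keys are built.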