Although many publications on the use of computers in language teaching and second language acquisition have included sections on computer- and web-based language testing, few if any have been devoted exclusively to this subject. The book under review covers a range of interrelated issues, from broad discussion topics, such as the benefits of and controversies surrounding computer-assisted language testing (CALT), to very practical matters, such as how to work with WebCT. Researchers and students in the field will appreciate its large number of well-informed sources, its bibliography, and its proposals for further study.
Scores on speaking tests are used as evidence of both learner language ability and the second language acquisition process, and most speaking tests include scoring rubrics to help ensure that ratings are reliable and reflect a theoretical construct of speaking ability. Nevertheless, it may be that similar ratings on a speaking test in fact represent qualitatively different learner performances. Such a situation would mean that interpretations of ability or acquisition process based on such test scores may not be valid. The purpose of this article is to investigate the hypothesis that similar quantitative scores on a speaking test represent qualitatively different performances. The results of the study raise a number of issues for further investigation.
In this article, we propose to follow up on the most recent ARAL survey article on trends in computer-based second language assessment (Jamieson, 2005) and review developments in the use of technology in the creation, delivery, and scoring of language tests. We will discuss the promise and threats associated with computer-based language testing, including the language construct in relation to computer-based delivery and response technologies; computer-based authoring options; current developments; scoring, feedback, and reporting systems; and validation issues.
Typically, in the assessment of Language for Specific Purposes (LSP), test content and methods are derived from an analysis of the target language use (TLU) situation. However, the criteria by which performances are judged are seldom derived from the same source. In this article, I argue that LSP assessment criteria should be derived from an analysis of the TLU situation, using the concept of indigenous assessment criteria (Jacoby, 1998). These criteria are defined as those used by subject specialists in assessing the communicative performances of both novices and colleagues in academic, professional, and vocational fields. Performance assessment practices are part of any professional culture, from formal, gatekeeping examination procedures to informal, ongoing evaluation built into everyday interaction. I suggest a procedure for deriving assessment criteria from an analysis of the TLU situation and explore problems associated with doing so, recommending a ‘weak’ indigenous assessment hypothesis to assist in the development of LSP test assessment criteria and to guide interpretations of test performance.