This article considers how the nature of interaction may best be represented in the second language (L2) construct. The starting point is Bachman's model of communicative language ability which, it is argued, incorporates interaction from an individual-focused cognitive perspective. The alternative view advocated here is that individual ability and contextual facets interact in ways that change them both. Thus, an 'ability-in language user-in context' view is to be preferred over Bachman's 'ability-in language user' representation. Acceptance of this new approach entails a local, context-bound view of language ability, which is difficult to reconcile with the tester's need for score generalizability. The way forward is to recognize that, while some contexts activate stable ability features, others produce more variable performance from learners. Thus, the focus of both theory formulation and empirical research should be on how to account for inconsistent performance in particular contexts from a social interactional perspective.
The purpose of this study is to derive the criteria/dimensions underlying learners' L2 oral ability scores across three tests: an oral interview, a narration and a read-aloud. A stimulus tape of 18 speech samples was presented to three native speaker rater groups for evaluation. The rater groups included teachers of Arabic as a foreign language in the USA, nonteaching Arabs residing in the USA for at least one year and nonteaching Arabs living in their home country (Lebanon). Each of the raters provided a holistic score for every speech sample. Holistic scores were analysed using the INDSCAL multidimensional scaling model. Results showed that the nonmetric three-dimensional solution provided a good fit to the data. Both regression and speech sample analyses were employed to identify those dimensions. Additionally, subject weights indicated that the three rater groups were emphasizing the three dimensions differentially, thus demonstrating that native speaker groups with varied backgrounds perceive the L2 oral construct differently. The study contends that researchers might need to reconsider employing generic component scales. A research approach that derives scales empirically according to the given tests and audiences, and according to the purpose of assessment, is recommended. Finally, replicating this study using other languages, L2 oral ability levels, tests and rater groups is suggested.

I Theoretical background

L2 oral testing increasingly calls for more performance-based tests. Performance-based tests require students to produce complex responses integrating various skills and knowledge and to apply their target language skills to life-like situations. Such tests typically employ more than one test method and call for human raters' judgement. Consequently, these two factors, the test method and the rater, have become integral components of performance-based tests that influence test scores.
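The scaling analysis described in the abstract above can be illustrated with a minimal sketch. This is not the study's INDSCAL analysis or its data: it substitutes scikit-learn's nonmetric MDS for the INDSCAL model, and the 18 speech-sample scores from three rater groups are simulated purely for illustration.

```python
# Hypothetical sketch of deriving dimensions from holistic rater scores
# via nonmetric multidimensional scaling. sklearn's MDS stands in for
# the INDSCAL model used in the study; all data here are simulated.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)

# Simulated holistic scores: 18 speech samples x 3 rater groups
scores = rng.uniform(1, 5, size=(18, 3))

# Dissimilarity between speech samples based on their score profiles
dissim = squareform(pdist(scores, metric="euclidean"))

# Nonmetric three-dimensional solution
mds = MDS(n_components=3, metric=False,
          dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)

print(coords.shape)   # (18, 3): each sample placed in a 3-D space
print(mds.stress_)    # stress value, the solution's badness of fit
```

In the study itself, the recovered dimensions were interpreted via regression and speech-sample analyses, and INDSCAL's subject weights, which this sketch does not compute, showed how strongly each rater group emphasized each dimension.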
Educational policies such as Race to the Top in the USA affirm a central role for testing systems in government-driven reform efforts. Such reform policies are often referred to as the global education reform movement (GERM). Changes observed with the GERM style of testing demand socially engaged validity theories that include consequential research. The article revisits the Standards and Kane’s interpretive argument (IA) and argues that the role envisioned for consequences remains impoverished. Guided by theory of action, the article presents a validity framework, which targets policy-driven assessments and incorporates a social role for consequences. The framework proposes a coherent system that makes explicit the interconnections among policy ambitions, testing functions, and the levels/sectors that are affected. The article calls for integrating consequences into technical quality documentation, demands a more realistic delineation of stakeholders and their roles, and compels engagement in policy research.
The article reviews the usefulness of several models of proficiency that have influenced second language testing in the last two decades. The review indicates that several factors contribute to the lack of congruence between models and test construction, and makes a case for distinguishing between theoretical models, which attempt to represent the proficiency construct in various contexts, and operational assessment frameworks, which depict the construct in particular contexts. Additionally, the article underscores the significance of an empirical, contextualized and structured approach to the development of assessment frameworks.
The paper investigates whether there is a shared perception of speaking proficiency among raters from different English-speaking countries. More specifically, this study examines whether there is a significant difference among English language learning (ELL) teachers residing in Australia, Canada, the UK, and the USA when rating speech samples of international English language students. Teachers were asked to rate samples from international students who took the Test of Spoken English (TSE), the oral component of TOEFL. Building on previous research which has demonstrated that different tasks and rater groups affect results obtained from learner performance on oral tests, this project investigated both rating variation as a result of country of origin and variation due to TSE task effects. Multivariate analyses were used to analyze the ratings, and effect sizes are reported.

The present study investigates whether teachers from different English-speaking countries, all of whom are native speakers, share similar perceptions in terms of their rating of the oral performance of English as a second language (ESL) test takers. The impetus behind this question is the ever-increasing efforts by major language testing organizations to market their tests in countries or with populations for which they were originally not intended and/or for which appropriate research has not been carried out to justify such wide use. This study is the first investigation in a research agenda that seeks to address issues regarding the comparability and characteristics of ESL speaking test ratings awarded by different international rater groups. In the present investigation, the focus is on the oral test of the Test of English as a Foreign Language (TOEFL) program, i.e., the Test of Spoken English (TSE).
Spolsky (1995) argues that two organizations have dominated the large-scale English language testing industry throughout the world: the University of Cambridge ESOL Examinations (previously the University of Cambridge Local Examinations Syndicate, or UCLES) and the Educational Testing Service (ETS). These two organizations have different orientations, ideologies, and approaches to measuring non-native speakers' English language ability (see Chalhoub-Deville and Turner, 2000). For example, tests produced by the University of Cambridge ESOL Examinations have tended to be more achievement based, with strong connections to English language teaching syllabi. ETS, on the other hand, has distanced itself from any specific instructional program/textbook.
Many researchers and practitioners maintain that ACTFL's efforts to improve instructional practices and promote proficiency assessments tied to descriptors of what learners can do in real life have contributed significantly to second language teaching and testing. Similar endeavors in the area of research, however, are critically needed. Focusing on the oral proficiency interview (OPI), this article argues that ACTFL has a responsibility to its stakeholders to initiate a research program that generates a coherent combination of logical and empirical evidence to support its OPI interpretations and practices. The article highlights a number of high‐priority areas—including delimiting purposes, examining interview discourse, documenting rater/interlocutor behavior, explicating the native speaker criterion, and investigating the OPI's impact on language pedagogy—that should be incorporated into the research agenda.