This study analyzed the contribution of lexical factors to native-speaking raters’ assessments of comprehensibility and nativeness in second language (L2) speech. To reduce non-lexical sources of bias, 10 naïve L1 English raters evaluated transcribed speech samples from 97 L2 English learners across two tasks (picture description and TOEFL integrated). The 194 transcripts were then analyzed with statistical software (e.g., Coh-Metrix, VocabProfile) for 29 variables spanning various lexical dimensions. For the picture description task, the lexical correlates of the two constructs diverged, with distinct lexical measures tied to comprehensibility and to nativeness. In the TOEFL integrated task, comprehensibility and nativeness were largely indistinguishable, sharing an identical set of lexical correlates that covered the dimensions of diversity and range. Findings are discussed in relation to the acquisition, assessment, and teaching of lexical properties in L2 speech.
Formulaic sequences (FSs), or prefabricated multi-word structures (e.g. on the other hand), are often difficult to identify objectively, and current corpus-driven methods yield structurally incomplete, overlapping, or overly extended structures of questionable psychological validity and pedagogical usefulness. To address these limitations, this study evaluated transitional probability as a potential metric to improve the identification of FSs by presenting 100 four-word sequences from the British National Corpus, varying in transitional probabilities between words, to native and non-native speakers of English (N = 293) in a sequence completion task (e.g. for the sake__). Results revealed that the application of transitional probability reduces many of the problems associated with current approaches to FS identification and can produce lists of FSs that are more functionally salient and psychologically valid.
Keywords: formulaic sequences, formulaic language, lexical bundles, n-grams, corpus-driven research
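As a rough illustration of the metric, transitional probability between adjacent words can be estimated from bigram counts. The abstract does not specify how the study computed it, so the sketch below assumes the common forward definition, P(next word | current word) = count(current, next) / count(current). The function name `transitional_probabilities` and the toy corpus are hypothetical and invented for illustration; they are not drawn from the British National Corpus or from the study's materials.

```python
from collections import Counter


def transitional_probabilities(tokens):
    """Estimate forward transitional probability P(next | current) per adjacent pair.

    Assumption: forward TP from raw bigram counts; the study may have used
    a different direction or estimation method.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Count each word only in positions where it has a successor,
    # so the probabilities for a given first word sum to 1.
    firsts = Counter(tokens[:-1])
    return {(w1, w2): n / firsts[w1] for (w1, w2), n in bigrams.items()}


# Toy corpus, invented for illustration only (not BNC data).
corpus = (
    "for the sake of argument and for the sake of clarity "
    "we ask for the record"
).split()
tp = transitional_probabilities(corpus)

# Inspect the word-to-word transitions inside one four-word sequence.
sequence = ["for", "the", "sake", "of"]
for w1, w2 in zip(sequence, sequence[1:]):
    print(f"P({w2} | {w1}) = {tp[(w1, w2)]:.2f}")
```

In this toy corpus "the" is followed by "sake" in two of its three occurrences, so that transition scores lower than "for" to "the" or "sake" to "of". A sequence whose internal transitions are all high is, under this assumption, a stronger candidate for a formulaic sequence than one containing a weak link.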