Frequency is often the only variable considered when researchers or teachers develop vocabulary materials for second language (L2) learners. However, researchers have also found that many other variables affect vocabulary acquisition. In this study, we explored the relationship between L2 vocabulary acquisition and a variety of lexical characteristics using vocabulary recognition test data from L2 English learners. Conducting best subsets multiple regression analysis to explore all possible combinations of variables, we produced a best-fitting model of vocabulary difficulty consisting of six variables (R² = .37). The fact that many variables significantly contributed to the regression model and that a large amount of variance remained yet unexplained by the frequency variable considered in this study indicates that much more than frequency alone affects the likelihood that learners will learn certain L2 words.
This study tests three measures of lexical diversity (LD), each using five operationalizations of word types. The measures include MTLD (measure of textual lexical diversity), MTLD-W (moving average MTLD with wrap-around measurement), and MATTR (moving average type-token ratio). Each of these measures is tested with types operationalized as orthographic forms, lemmas using automated POS tags, lemmas using manually corrected POS tags, flemmas (list-based lemmas that do not distinguish between parts of speech), and word families. These measures are applied to 60 narrative texts written in English by adolescent native speakers of English (n = 13), Finnish (n = 31), and Swedish (n = 16). Each individual LD measure is evaluated in relation to how well it correlates with the mean LD ratings of 55 human raters whose inter-rater reliability was exceedingly high (Cronbach’s alpha = .980). The overall results show that the three measures are comparable but two of the operationalizations of types produce mixed results across measures.
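As a rough illustration of one of the measures named above, MATTR can be sketched as the mean type-token ratio over every window of fixed length that slides through the text. The sketch below is not the study's implementation: the function name `mattr`, the default window of 50 tokens, the `normalize` hook, and the tiny `LEMMAS` lookup are all assumptions introduced here to show how the operationalization of "type" (orthographic form versus lemma) can change the score.

```python
from typing import Callable, Sequence


def mattr(
    tokens: Sequence[str],
    window: int = 50,  # assumed default; the abstract does not state a window size
    normalize: Callable[[str], str] = lambda t: t,
) -> float:
    """Moving-average type-token ratio.

    `normalize` maps each token to the unit counted as a type
    (identity = orthographic forms; a lemmatizer = lemmas).
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    units = [normalize(t) for t in tokens]
    if len(units) < window:
        # Text shorter than the window: fall back to a plain type-token ratio.
        return len(set(units)) / len(units)
    ratios = [
        len(set(units[i : i + window])) / window
        for i in range(len(units) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Hypothetical lemma lookup, for illustration only.
LEMMAS = {"runs": "run", "ran": "run"}

sample = ["she", "runs", "he", "run", "they", "ran"]
orthographic_score = mattr(sample, window=6)
lemma_score = mattr(sample, window=6, normalize=lambda t: LEMMAS.get(t, t))
```

With this toy sample, counting orthographic forms treats `runs`, `run`, and `ran` as three types, while the lemma-based count collapses them into one, so the lemma-based score is lower.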
Discourse Completion Tasks (DCTs) have been one of the most popular tools in pragmatics research. Yet, many have criticized DCTs for their lack of authenticity (e.g., Culpeper, Mackey, & Taguchi, 2018; Nguyen, 2019). We propose that corpora can serve as resources in designing and evaluating DCTs. We created a DCT using advice-seeking prompts from the Q+A corpus (Baker & Egbert, 2016). Then, we administered the DCT to 33 participants. We evaluated the DCT by (1) comparing the linguistic form and the semantic content of the participants’ DCT responses (i.e., advice-giving expressions) with authentic data from the corpus; and (2) interviewing the participants about the instrument quality. Chi-square tests between DCT data and corpus data revealed no significant differences in advice-giving expressions in terms of both the overall level of directness (χ² [2, N = 660] = 6.94, p = .03, V = .10) and linguistic realization (χ² [8, N = 660] = 17.75, p = .02, V = .16), and showed a significant difference but small effect size in terms of semantic content (χ² [6, N = 512] = 30.35, p < .01, V = .24). Taken together with the interview data, our findings indicate that corpora are useful in designing DCTs.
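The effect sizes reported above can be checked from the chi-square statistics alone, since Cramér's V is sqrt(chi-square / (N * min(r - 1, c - 1))). The sketch below assumes each test compares two data sources (DCT versus corpus) across df + 1 categories; those table shapes are inferred from the reported degrees of freedom, not stated in the abstract, and the function name `cramers_v` is introduced here for illustration.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V for an n_rows x n_cols contingency table."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


# (label, chi-square, N, rows, cols); rows = 2 and cols = df + 1 are assumptions.
reported = [
    ("directness", 6.94, 660, 2, 3),
    ("linguistic realization", 17.75, 660, 2, 9),
    ("semantic content", 30.35, 512, 2, 7),
]

effect_sizes = {
    label: round(cramers_v(chi2, n, r, c), 2)
    for label, chi2, n, r, c in reported
}
```

Under these assumptions the recomputed values round to the reported V of .10, .16, and .24.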
The original meaning of words or phrases is often in dispute in Founding Era legislation, especially the US Constitution. The Corpus of Founding Era American English (COFEA) accurately provides evidence for the meaning of contested terms during the Founding Era. COFEA consists of 126,394 texts and over 136 million words. This corpus has been and is being used by legal researchers and interpreters in scholarly research as well as various courts, including the Supreme Court. This paper describes the motivation for the creation of COFEA and describes the process of designing and collecting the corpus.