Brett Hashimoto scite author profile

Egbert

2019

Language Learning

Frequency is often the only variable considered when researchers or teachers develop vocabulary materials for second language (L2) learners. However, researchers have also found that many other variables affect vocabulary acquisition. In this study, we explored the relationship between L2 vocabulary acquisition and a variety of lexical characteristics using vocabulary recognition test data from L2 English learners. Conducting best subsets multiple regression analysis to explore all possible combinations of variables, we produced a best‐fitting model of vocabulary difficulty consisting of six variables (R² = .37). The fact that many variables contributed significantly to the regression model, and that a large amount of variance remained unexplained by the frequency variable considered in this study, indicates that much more than frequency alone affects the likelihood that learners will learn certain L2 words.

Is Frequency Enough?: The Frequency Model in Vocabulary Size Testing

Language Assessment Quarterly

2021

How operationalizations of word types affect measures of lexical diversity

Jarvis

2021

IJLCR

This study tests three measures of lexical diversity (LD), each using five operationalizations of word types. The measures include MTLD (measure of textual lexical diversity), MTLD-W (moving average MTLD with wrap-around measurement), and MATTR (moving average type-token ratio). Each of these measures is tested with types operationalized as orthographic forms, lemmas using automated POS tags, lemmas using manually corrected POS tags, flemmas (list-based lemmas that do not distinguish between parts of speech), and word families. These measures are applied to 60 narrative texts written in English by adolescent native speakers of English (n = 13), Finnish (n = 31), and Swedish (n = 16). Each individual LD measure is evaluated in relation to how well it correlates with the mean LD ratings of 55 human raters whose inter-rater reliability was exceedingly high (Cronbach’s alpha = .980). The overall results show that the three measures are comparable but two of the operationalizations of types produce mixed results across measures.
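As an illustration of one of the measures named in this abstract, the following is a minimal sketch of MATTR (moving average type-token ratio). It assumes types are operationalized as orthographic forms, the simplest of the five operationalizations listed. The function name `mattr`, the default window of 50 tokens, and the fallback to a plain type-token ratio for short texts are assumptions made for this example, not details taken from the study.

```python
def mattr(tokens, window=50):
    """Mean type-token ratio over every window of `window` consecutive tokens.

    Types are counted as distinct orthographic forms (an assumption of this
    sketch); lemmas, flemmas, or word families would require mapping each
    token to its type label before calling this function.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    n = len(tokens)
    if n == 0:
        return 0.0
    if n < window:
        # Assumed fallback: plain TTR when the text is shorter than the window.
        return len(set(tokens)) / n
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(n - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat".split()
    print(mattr(sample, window=4))
```

Because every window has the same length, the score is not mechanically tied to text length the way a plain type-token ratio is.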

Using a corpus in creating and evaluating a DCT

Nelson

2020

Discourse Completion Tasks (DCTs) have been one of the most popular tools in pragmatics research. Yet, many have criticized DCTs for their lack of authenticity (e.g., Culpeper, Mackey, & Taguchi, 2018; Nguyen, 2019). We propose that corpora can serve as resources in designing and evaluating DCTs. We created a DCT using advice-seeking prompts from the Q+A corpus (Baker & Egbert, 2016). Then, we administered the DCT to 33 participants. We evaluated the DCT by (1) comparing the linguistic form and the semantic content of the participants’ DCT responses (i.e., advice-giving expressions) with authentic data from the corpus; and (2) interviewing the participants about the instrument quality. Chi-square tests between DCT data and corpus data revealed no significant differences in advice-giving expressions in terms of both the overall level of directness (χ² [2, N = 660] = 6.94, p = .03, V = .10) and linguistic realization (χ² [8, N = 660] = 17.75, p = .02, V = .16), and showed a significant difference but small effect size in terms of semantic content (χ² [6, N = 512] = 30.35, p < .01, V = .24). Taken together with the interview data, our findings indicate that corpora are useful in designing DCTs.
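The effect sizes reported in this abstract can be checked from the chi-square statistics alone. The sketch below uses the standard formula for Cramér's V, V = sqrt(χ² / (N · min(r − 1, c − 1))). It assumes each contingency table has two rows (DCT data versus corpus data), which is consistent with the reported degrees of freedom but is not stated in the abstract. The function name `cramers_v` is chosen for this example.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square statistic, sample size, and table shape."""
    k = min(n_rows - 1, n_cols - 1)
    if n <= 0 or k <= 0:
        raise ValueError("need n > 0 and at least a 2 x 2 table")
    return math.sqrt(chi2 / (n * k))


if __name__ == "__main__":
    # (label, chi-square, N, assumed rows, columns implied by df = (r-1)(c-1))
    reported = [
        ("directness", 6.94, 660, 2, 3),               # df = 2, reported V = .10
        ("linguistic realization", 17.75, 660, 2, 9),  # df = 8, reported V = .16
        ("semantic content", 30.35, 512, 2, 7),        # df = 6, reported V = .24
    ]
    for label, chi2, n, rows, cols in reported:
        print(f"{label}: V = {cramers_v(chi2, n, rows, cols):.2f}")
```

Under the two-row assumption, the three computed values round to the V statistics given in the abstract.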

Research in progress: Applied linguistics at Northern Arizona University, USA

et al. 2020

Vocabulary of American-English Size Test

Hashimoto

2021

Corpus of Founding Era American English: designing a corpus for interpreting the United States Constitution

Hashimoto

2023

Corpora

The original meaning of words or phrases is often in dispute in Founding Era legislation, especially the US Constitution. The Corpus of Founding Era American English (COFEA) accurately provides evidence for the meaning of contested terms during the Founding Era. COFEA consists of 126,394 texts and over 136 million words. This corpus has been and is being used by legal researchers and interpreters in scholarly research as well as in various courts, including the Supreme Court. This paper describes the motivation for creating COFEA and the process of designing and collecting the corpus.

A multi-measure approach for lexical diversity in writing assessments: Considerations in measurement and timing

2023