Natural language makes considerable use of recurrent formulaic patterns of words. This article triangulates the construct of formula from corpus linguistic, psycholinguistic, and educational perspectives. It describes the corpus linguistic extraction of pedagogically useful formulaic sequences for academic speech and writing. It determines English as a second language (ESL) and English for academic purposes (EAP) instructors' evaluations of their pedagogical importance. It summarizes three experiments which show that different aspects of formulaicity affect the accuracy and fluency of processing of these formulas in native speakers and in advanced L2 learners of English. The language processing tasks were selected to sample an ecologically valid range of language processing skills: spoken and written, production and comprehension. Processing in all experiments was affected by various corpus‐derived metrics: length, frequency, and mutual information (MI), but to different degrees in the different populations. For native speakers, it is predominantly the MI of the formula which determines processability; for nonnative learners of the language, it is predominantly the frequency of the formula. The implications of these findings are discussed for (a) the psycholinguistic validity of corpus‐derived formulas, (b) a model of their acquisition, (c) ESL and EAP instruction and the prioritization of which formulas to teach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.