Recently, large-scale pre-trained neural network models such as BERT have achieved many state-of-the-art results in natural language processing. Recent work has explored the linguistic capacities of these models; however, no work has focused on their ability to generalize these capacities to novel words. This type of generalization is exhibited by humans (Berko, 1958) and is intimately related to morphology: humans are in many cases able to identify inflections of novel words in the appropriate context. This morphological capacity has not been previously tested in BERT models, and it is important for morphologically rich languages, which are under-studied in the literature on BERT's linguistic capacities. In this work, we investigate this question by testing monolingual and multilingual BERT models' ability to agree in number with novel plural words in English, French, German, Spanish, and Dutch. We find that many models are unable to reliably determine the plurality of novel words, suggesting potential deficiencies in the morphological capacities of BERT models.
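The probing setup described above can be thought of as a cloze-style test: present the model with a context containing a novel (nonce) plural noun and compare its scores for the singular versus plural verb form. The sketch below is a hedged illustration only: the nonce nouns, the template, and `toy_score` are all invented stand-ins (a real probe would read the candidate-verb probabilities from a masked language model such as BERT).

```python
# Hedged sketch of a cloze-style number-agreement probe with nonce words.
# In the actual experiments a masked language model would score each
# candidate verb; `toy_score` is an invented stand-in so this example
# runs without any model.

NONCE_NOUNS = ["wug", "blick", "dax"]      # invented novel nouns
TEMPLATE = "The {noun} {verb} here."       # agreement context

def toy_score(sentence: str, subject: str, verb: str) -> float:
    """Stand-in for a masked-LM probability P(verb | context).

    A real probe would mask the verb slot and read off the model's
    probability for each candidate; this toy version simply prefers
    "are" after an -s subject and "is" otherwise.
    """
    return float((verb == "are") == subject.endswith("s"))

def plural_agreement_accuracy(nouns):
    """Fraction of nonce nouns whose plural form selects the plural verb."""
    correct = 0
    for noun in nouns:
        plural = noun + "s"                # regular English plural
        scores = {
            verb: toy_score(TEMPLATE.format(noun=plural, verb=verb), plural, verb)
            for verb in ("is", "are")
        }
        correct += scores["are"] > scores["is"]
    return correct / len(nouns)
```

A model "passes" a nonce noun when the plural verb outscores the singular one in the plural context; accuracy over many such nouns is one way to quantify the generalization the abstract describes.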
Prior studies in multilingual language modeling (e.g., Cotterell et al., 2018; Mielke et al., 2019) disagree on whether or not inflectional morphology makes languages harder to model. We attempt to resolve the disagreement and extend those studies. We compile a larger corpus of 145 Bible translations in 92 languages and a larger number of typological features. We fill in missing typological data for several languages and consider corpus-based measures of morphological complexity in addition to expert-produced typological features. We find that several morphological measures are significantly associated with higher surprisal when LSTM models are trained with BPE-segmented data. We also investigate linguistically motivated subword segmentation strategies like Morfessor and Finite-State Transducers (FSTs) and find that these segmentation strategies yield better performance and reduce the impact of a language's morphology on language modeling.
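The BPE segmentation mentioned above greedily merges the most frequent adjacent symbol pair, starting from characters. A minimal self-contained sketch of the learning step, in the style of Sennrich et al.'s algorithm (the tiny word-frequency dictionary in the usage note is invented for illustration and is not the Bible data from the study):

```python
# Minimal sketch of byte-pair-encoding (BPE) subword learning.
import re
from collections import Counter

def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return pairs

def apply_merge(pair, vocab):
    """Concatenate every occurrence of `pair` across the vocabulary."""
    # Whitespace lookarounds keep the match aligned to symbol boundaries.
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

def learn_bpe(word_freqs, num_merges):
    """Learn `num_merges` BPE merge operations from word frequencies."""
    vocab = {" ".join(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent adjacent pair
        vocab = apply_merge(best, vocab)
        merges.append(best)
    return merges, vocab
```

For example, `learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 3)` first merges `e`+`s` and then `es`+`t`, producing the subword "est". Because these merges are driven purely by frequency, they need not align with morpheme boundaries, which is exactly where linguistically motivated segmenters like Morfessor or FSTs can differ.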
Language evolution is driven by pressures for simplicity and informativity; however, the timescale on which these pressures operate is debated. Over several generations, learners’ biases for simple and informative systems can guide language evolution. Over repeated instances of dyadic communication, the principle of least effort dictates that speakers should bias systems towards simplicity and listeners towards informativity, similarly guiding language evolution. At the same time, it has been argued that learners only provide a bias for simplicity and, thus, language users must provide a bias for informativity. To what extent do languages evolve during acquisition versus use? We address this question by formally defining and investigating the communicative efficiency of acquisition trajectories. We illustrate our approach using colour-naming systems, replicating the communicative efficiency model of Zaslavsky, Kemp, Regier & Tishby (2018, PNAS) and the acquisition model of Beekhuizen & Stevenson (2018, Cogn. Sci.). We find that to the extent that language is iconic, learning alone is sufficient to shape language evolution. Regarding colour-naming systems specifically, we find that incorporating learning biases into communicative efficiency accounts might explain how speakers and listeners trade off communicative effort.
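The communicative efficiency at issue above can be made concrete. In the information-theoretic tradition that Zaslavsky et al. build on, one component of efficiency is a naming system's communicative cost: the expected surprisal of an idealized Bayesian listener who reconstructs the meaning from the word. The sketch below is an illustration under that framing only; the two-meaning toy systems in the test are invented and are far simpler than real colour-naming data.

```python
# Hedged sketch: communicative cost of a naming system as the expected
# surprisal H(M | W) of an idealized Bayesian listener.
import math

def communicative_cost(prior, speaker):
    """Expected surprisal, in bits, of a Bayesian listener.

    prior:   p(m), a list of meaning probabilities
    speaker: q(w | m), a row per meaning and a column per word
    """
    n_meanings, n_words = len(prior), len(speaker[0])
    # Joint p(m, w) and word marginal p(w)
    joint = [[prior[m] * speaker[m][w] for w in range(n_words)]
             for m in range(n_meanings)]
    p_w = [sum(joint[m][w] for m in range(n_meanings)) for w in range(n_words)]
    cost = 0.0
    for m in range(n_meanings):
        for w in range(n_words):
            if joint[m][w] > 0:
                listener = joint[m][w] / p_w[w]   # Bayes: L(m | w)
                cost += joint[m][w] * -math.log2(listener)
    return cost
```

A system that gives every meaning its own word has zero cost; a system that collapses meanings onto one word forces the listener to guess, raising the cost. Tracking this quantity along an acquisition trajectory is one way to formalize how efficient a learner's intermediate systems are.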
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.