Plurale-tantum nouns (scissors, leggings, glasses) are an example of the systematic lack of an unmarked form of a lexeme. In contrast to singulare-tantum nouns, most notably mass nouns, this systematicity is mostly restricted to individual lexemes and analogously related ones (trousers, pants, knickers). It remains an open question whether there is any functionally motivated nominal subclass that goes beyond smaller lexical fields. The main goal of this paper is to estimate whether such extreme proportions in the absence or presence of inflectional markers cause distinctly high concentrations of lexemes, i.e. nominal subclasses. First, the probabilities for a lemma to occur with plural -s were bootstrapped with replacement. Second, the bootstrapped data were split equally into 10 strata at varying inflection probabilities. Homonyms and polysemes that differ in their probability of being inflected are thus disambiguated. For each stratum, type frequencies were extrapolated by means of LNRE models. The same process was repeated for reference data sets containing verbal -ed and -ing. The bootstrapped data showed that frequency and proportion of inflection reveal clusters likely to represent different polysemes or homonyms. The type frequencies of the partially disambiguated singulare-tantum nouns turned out to be clearly distinct. However, for the plurale-tantum nouns, the extrapolated type frequencies were only marginally higher than those of the other suffixes, which are not usually thought to have a tantum-like subcategory.
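The first two steps of this procedure (bootstrapping with replacement, then stratifying by inflection probability) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the boolean token encoding, and the reading of "equally split" as ten equal-width bins are all assumptions made for illustration, and the LNRE extrapolation step is left out.

```python
import random


def bootstrap_plural_proportions(tokens, n_resamples=1000, seed=0):
    """Bootstrap the proportion of plural -s tokens for one lemma.

    `tokens` is a hypothetical encoding: one boolean per corpus token of
    the lemma, True when the token carries plural -s.
    """
    rng = random.Random(seed)
    n = len(tokens)
    proportions = []
    for _ in range(n_resamples):
        # Draw n tokens with replacement from the observed tokens.
        resample = rng.choices(tokens, k=n)
        proportions.append(sum(resample) / n)
    return proportions


def stratum_index(proportion, n_strata=10):
    """Map a proportion in [0, 1] to one of `n_strata` bins.

    Assumption: equal-width bins; a proportion of exactly 1.0 is placed
    in the last bin.
    """
    return min(int(proportion * n_strata), n_strata - 1)


if __name__ == "__main__":
    # Toy lemma: 8 plural tokens and 2 bare tokens (invented data).
    toy_tokens = [True] * 8 + [False] * 2
    props = bootstrap_plural_proportions(toy_tokens, n_resamples=5, seed=1)
    print([(p, stratum_index(p)) for p in props])
```

Under this reading, a lemma whose resampled proportions fall into clearly separate strata would be a candidate for hiding several polysemes or homonyms.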
Lexical ambiguity in the English language is abundant. Word-class ambiguity is even inherently tied to the productive process of conversion. Most lexemes are rather flexible when it comes to word class, which is facilitated by the minimal morphology that English has preserved. This study takes a multivariate quantitative approach to examine potential patterns that arise in a lexicon where verb-noun and noun-verb conversion are pervasive. The distributions of three inflectional suffixes, verbal -s, nominal -s, and -ed, are explored for their interaction with degrees of verb-noun conversion. To that end, the lexical dispersion, context-dependency, and lexical similarity between the inflected and bare forms were taken into consideration and controlled for in Generalized Additive Models for Location, Scale and Shape (GAMLSS; Stasinopoulos, M. D., R. A. Rigby, and F. De Bastiani. 2018. “GAMLSS: A Distributional Regression Approach.” Statistical Modelling 18 (3–4): 248–73). The results of a series of zero-one-inflated beta models suggest that there is a clear “uncanny valley” of lexemes that show similar proportions of verbal and nominal uses. Such lexemes have a lower proportion of inflectional uses when textual dispersion and context-dependency are controlled for. Furthermore, as soon as there is some degree of conversion, the probability that a lexeme is always encountered without inflection sharply rises. Disambiguation by means of inflection is thus unlikely to play a uniform role; rather, its role appears to depend on the inflectional distribution of a lexeme.
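For readers unfamiliar with zero-one-inflated beta models, the following minimal sketch shows the log-density of such a distribution for a proportion y in [0, 1]. It uses a generic parameterization (point masses p0 and p1, beta mean mu, beta precision phi) chosen for illustration; this is an assumption and not necessarily the parameterization used in the GAMLSS software or in the study.

```python
import math


def zoib_log_density(y, p0, p1, mu, phi):
    """Log-density of a zero-one-inflated beta distribution.

    Hypothetical parameterization for illustration:
      p0  : probability mass at y == 0 (e.g. lexeme never inflected)
      p1  : probability mass at y == 1 (e.g. lexeme always inflected)
      mu  : mean of the beta component, 0 < mu < 1
      phi : precision of the beta component, phi > 0
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie in [0, 1]")
    if y == 0.0:
        return math.log(p0)
    if y == 1.0:
        return math.log(p1)
    # Beta shape parameters from mean and precision.
    a = mu * phi
    b = (1.0 - mu) * phi
    log_beta_pdf = (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y)
        + (b - 1.0) * math.log(1.0 - y)
    )
    # Remaining mass (1 - p0 - p1) is spread over the open interval (0, 1).
    return math.log(1.0 - p0 - p1) + log_beta_pdf


if __name__ == "__main__":
    # Invented parameter values, for illustration only.
    print(zoib_log_density(0.0, p0=0.2, p1=0.1, mu=0.5, phi=2.0))
    print(zoib_log_density(0.3, p0=0.2, p1=0.1, mu=0.5, phi=2.0))
```

In this framing, the abstract's "probability that a lexeme is always encountered without inflection" corresponds to the mass at zero, which a distributional regression can model separately from the proportions strictly between 0 and 1.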