A multimodel inference approach to categorical variant choice: construction, priming and frequency effects on the choice between full and contracted forms of am, are and is. Abstract: This paper presents a multimodel inference approach to linguistic variation, expanding on prior work by Kuperman and Bresnan (2012). We argue that corpus data often present the analyst with high model selection uncertainty. This uncertainty is inevitable given that language is highly redundant: every feature is predictable from multiple other features. However, the uncertainty involved in model selection is ignored by the standard method of selecting the single best model and inferring the effects of the predictors under the assumption that the best model is true. Multimodel inference avoids committing to a single model. Rather, we make predictions based on the entire set of plausible models, with the contribution of each model weighted by its predictive value. We argue that multimodel inference is superior to model selection both for the I-language goal of inferring the mental grammars that generated the corpus and for the E-language goal of predicting characteristics of future speech samples from the community represented by the corpus. Applying multimodel inference to the classic problem of English auxiliary contraction, we show that the choice between multimodel inference and model selection matters in practice: the best model may contain predictors that are not significant when the full set of plausible models is considered, and may omit predictors that are significant when the full set is considered. We also contribute to the study of English auxiliary contraction itself: we document effects of priming, contextual predictability, and specific syntactic constructions, and provide evidence against effects of phonological context.
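The core mechanics of multimodel inference can be illustrated with a minimal sketch. The sketch below uses Akaike weights, a standard way to weight candidate models by predictive value; the model names, AIC scores, and per-model predictions are illustrative placeholders, not values from the paper.

```python
import math

# Hypothetical AIC scores for three candidate models of contraction choice
# (illustrative values, not taken from the paper).
aic = {"m_full": 1502.3, "m_no_priming": 1504.1, "m_no_frequency": 1509.8}

best = min(aic.values())
# Akaike weight of model i: exp(-delta_i / 2) normalized over all models,
# where delta_i = AIC_i - min AIC. Weights sum to 1 and estimate each
# model's relative predictive value.
raw = {m: math.exp(-(a - best) / 2) for m, a in aic.items()}
total = sum(raw.values())
weights = {m: r / total for m, r in raw.items()}

# A model-averaged prediction (e.g., probability of the contracted form in
# some context) is the weighted sum of the per-model predictions, so no
# single model is assumed to be true.
preds = {"m_full": 0.62, "m_no_priming": 0.58, "m_no_frequency": 0.65}
avg_pred = sum(weights[m] * preds[m] for m in preds)
```

A predictor's overall support can then be read off the summed weights of the models that contain it, rather than off its significance in the single best model.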
In a study of word shortening of HAVE and contraction of BE, it is found that both high transitional probability and high average context probability (low informativity) result in reduction. Previous studies have found this effect for content words; this study extends the findings to function words. Average context probability is also analyzed by construction type, showing that words are shorter in constructions with high average predictability, namely in perfect constructions for HAVE and in future and progressive constructions for BE. These findings show that in cases of grammaticalization it is not an increase in frequency that results in reduction, but a decrease in informativity.
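Informativity, as commonly operationalized, is a word's average surprisal across the contexts in which it occurs: low informativity means the word is on average highly predictable, and such words are predicted to reduce. A minimal sketch with toy bigram data (the corpus below is invented for illustration):

```python
import math
from collections import Counter

# Toy corpus of (previous_word, word) bigrams; counts are illustrative.
bigrams = (
    [("I", "am")] * 3 + [("I", "was")] +
    [("you", "are")] * 2 + [("you", "were")] +
    [("we", "are")] + [("they", "are")]
)

context_counts = Counter(prev for prev, _ in bigrams)
pair_counts = Counter(bigrams)

def informativity(word):
    # Informativity = -sum over contexts c of P(c | word) * log2 P(word | c):
    # the word's mean surprisal, weighted by how often each context hosts it.
    occurrences = [(c, n) for (c, w), n in pair_counts.items() if w == word]
    total = sum(n for _, n in occurrences)
    return -sum(
        (n / total) * math.log2(pair_counts[(c, word)] / context_counts[c])
        for c, n in occurrences
    )
```

In this toy data "are" is more predictable on average than "am" (lower informativity), so the account sketched in the abstract would predict more reduction for "are".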
The choice between pronominal and zero form for objects in the Oceanic language Vera'a is investigated quantitatively in texts from two registers with discourse topics of three different ontological class memberships. Discourse topicality is found to best predict the choice between pronoun and zero, outranking the factors of ontological class membership, antecedent form, and antecedent function. Contrary to current models of referent tracking, antecedent distance does not show any effect at all. It is concluded that (a) discourse structure and activation are not universally the most significant factors in referential choice and (b) ontological class and discourse topicality can be teased apart through appropriate text sampling, and it is the latter that is most significant. This has important implications for the grammaticalization of object agreement and the typology of differential object marking.
Historical Glottometry is a method, recently proposed by Kalyan and François (François 2014; Kalyan & François 2018), for analyzing and representing the relationships among sister languages in a language family. We present a glottometric analysis of the Sogeram language family of Papua New Guinea and, in the process, provide an evaluation of the method. We focus on three topics that we regard as problematic: how to handle the higher incidence of cross-cutting isoglosses in the Sogeram data; how best to handle lexical innovations; and what to do when the data do not allow the analyst to be sure whether a given language underwent a given innovation or not. For each topic we compare different ways of coding and calculating the data and suggest the best way forward. We conclude by proposing changes to the way glottometric data are coded and calculated and the way glottometric results are visualized. We also discuss how to incorporate Historical Glottometry into an effective historical-linguistic research workflow.