The cognitive plausibility of statistical classification models: Comparing textual and behavioral evidence

Klavan, Jane; Divjak, Dagmar

doi:10.1515/flin-2016-0014

Cited by 45 publications

(17 citation statements)

References 40 publications

“…Second, this probabilistic knowledge is derived in large part from language experience, and so is subtly, but dynamically (re)constructed throughout speakers' lives. The probabilistic nature of grammar is supported by evidence showing that the likelihood of finding a particular linguistic variant in a particular context in a corpus corresponds to the intuitions that speakers have about the acceptability of the variants (see Bresnan & Ford 2010; Klavan & Divjak 2016). Bresnan (2007: 76-84), for example, used a scalar rating task based on corpus materials (transcriptions of spoken dialogue passages) as stimuli to model subjects' responses regarding the naturalness of dative variants in context.…”

Section: Introduction (mentioning)

confidence: 99%

General introduction: A comparative perspective on probabilistic variation in grammar

Grafmiller, Szmrecsanyi, Röthlisberger, et al. (2018). Glossa: A Journal of General Linguistics.

This special collection brings together research exploring and evaluating probabilistic variation patterns from a comparative perspective, thus highlighting current work situated at the crossroads of research on usage-based theoretical linguistics, variationist linguistics, and sociolinguistics. The contributions in the collection advance our understanding of the plasticity of syntactic knowledge on the part of language users with diverse regional and/or cultural backgrounds, and demonstrate how a probabilistic approach to grammatical variation can offer insight into the scope and limits of language variation. In this general introduction to the special collection, we provide some essential background for perspective, and subsequently summarize the contributions in the collection.

“…One prolific area pertains to the discussion of how corpus-based frequency estimates relate to experimental findings, especially acceptability judgements (see Divjak 2016 for a recent overview). There has also been an exponential growth in published studies that use probabilistic statistical classification models to analyse linguistic data; see Klavan and Divjak (2016) for an overview. Still, only a small number of these studies have compared their findings with behavioral data (Roland et al 2006, Wasow and Arnold 2003).…”

Section: Introduction (mentioning)

confidence: 99%

“…Klavan and [Divjak (2016)], that without behavioral data it would be very difficult if not impossible to provide an adequate assessment of a corpus-based model. Linguistic experiments are necessary to calibrate our models - sometimes models are very accurate, and sometimes they appear to be less accurate; in order to set "upper and lower boundaries to what could be psychologically relevant" we need behavioral data to evaluate the corpus-based model (Klavan and Divjak 2016).…”

Section: Introduction (mentioning)

confidence: 99%

Are corpus-based predictions mirrored in the preferential choices and ratings of native speakers? Predicting the alternation between the Estonian adessive case and the adposition peal ‘on’

Klavan, Veismann (2017). ESUKA-JEFUL. Self-citation.

Abstract. Recent work in usage-based linguistics stresses the importance of combining corpus-based analyses with experimental studies. A number of studies have compared the performance of a corpus-based statistical model against the behaviour of native speakers in a linguistic experiment. The present paper takes this line of analysis further by combining corpus-based work with two sources of experimental data. A mixed-effects logistic regression model is fitted to the corpus data of the Estonian adessive case and the adposition peal 'on' in present-day written Estonian. In order to evaluate the goodness of the corpus-based model, its performance is compared to the behaviour of native speakers in a forced choice task and a rating task.

“…Third, although regression models have produced classification results that have received support from behavioural studies (for an overview of this relatively recent trend in linguistics see Klavan & Divjak, 2016), the algorithms these models rely on are not based on learning mechanisms but maximize likelihood using optimization techniques. Whether humans do or do not exhibit (near-)optimal behaviour remains a matter of debate (see Kahneman & Tversky, 1984).…”

Section: Introduction (mentioning)

confidence: 99%

Towards cognitively plausible data science in language research

Milin, Divjak, Dimitrijević, et al. (2016). Cognitive Linguistics. Self-citation.

Over the past 10 years, Cognitive Linguistics has taken a Quantitative Turn. Yet, concerns have been raised that this preoccupation with quantification and modelling may not bring us any closer to understanding how language works. We show that this objection is unfounded, especially if we rely on modelling techniques based on biologically and psychologically plausible learning algorithms. These make it possible to take a quantitative approach, while generating and testing specific hypotheses that will advance our understanding of how knowledge of language emerges from exposure to usage.
