Studies of Flow and Heat Transfer Associated with a Rotating Disc

The present paper discusses the benefits and challenges of token-based typology, which takes into account the frequencies of words and constructions in language use. This approach makes it possible to introduce new criteria for language classification, which would be difficult or impossible to achieve with the traditional, type-based approach. This point is illustrated by several quantitative studies of word order variation, which can be measured as entropy at different levels of granularity. I argue that this variation can be explained by general functional mechanisms and pressures, which manifest themselves in language use, such as optimization of processing (including avoidance of ambiguity) and grammaticalization of predictable units occurring in chunks. The case studies are based on multilingual corpora, which have been parsed using the Universal Dependencies annotation scheme.
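As a minimal sketch of how word order variation can be measured as entropy, the snippet below computes the Shannon entropy of the observed orders of a head and its dependent. The function name order_entropy, the labels "VO" and "OV", and the token counts are hypothetical illustrations, not data or code from the paper.

```python
import math
from collections import Counter

def order_entropy(orders):
    """Shannon entropy (in bits) of a sequence of observed word-order labels."""
    counts = Counter(orders)
    total = sum(counts.values())
    # Sum p * log2(1/p) over the observed orders; unobserved orders contribute 0.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical token counts of verb-object pairs in two corpora (assumed values).
rigid = ["VO"] * 95 + ["OV"] * 5      # one order strongly preferred
flexible = ["VO"] * 50 + ["OV"] * 50  # both orders equally frequent

print(round(order_entropy(rigid), 3))
print(round(order_entropy(flexible), 3))
```

With two possible orders, the entropy ranges from 0 bits (one order is always used) to 1 bit (both orders are equally frequent), so the flexible sample scores higher than the rigid one. In a corpus study the labels would presumably come from the dependency annotation rather than from a hand-built list.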

2. A radically data-driven Construction Grammar: Experiments with Dutch causative constructions

Levshina, Heylen

2014

Conditional Inference Trees and Random Forests

Levshina

2020

This chapter discusses popular non-parametric methods in corpus linguistics: conditional inference trees and conditional random forests. These methods, which allow the researcher to model and interpret the relationships between a numeric or categorical response variable and various predictors, are particularly attractive in 'tricky' situations where the use of parametric methods (in particular, regression models) can be problematic, for example, with 'small n, large p' data, complex interactions, non-linearity and correlated predictors. For illustration, the chapter discusses a case study of T and V politeness forms in Russian based on a corpus of film subtitles.

A Multivariate Study of T/V Forms in European Languages Based on a Parallel Corpus of Film Subtitles

Levshina

2017

RiL

The present study investigates the cross-linguistic differences in the use of so-called T/V forms (e.g. French tu and vous, German du and Sie, Russian ty and vy) in ten European languages from different language families and genera. The constraints on their use represent an elusive object of investigation because they depend on a large number of subtle contextual features and social distinctions, which should be cross-linguistically matched. Film subtitles in different languages offer a convenient solution because the situations of communication between film characters can serve as comparative concepts. I selected more than two hundred contexts that contain the pronouns you and yourself in the original English versions and coded them for fifteen contextual variables that describe the Speaker and the Hearer, their relationships and different situational properties. The creators of subtitles in the other languages have to choose between T and V when translating from English, where the T/V distinction is not expressed grammatically. On the basis of these situations translated into ten languages, I perform multivariate analyses using the method of conditional inference trees in order to identify the most relevant contextual variables that constrain the T/V variation in each language.

Natalia Levshina

How to do Linguistics with R

Token-based typology and word order entropy: A study based on Universal Dependencies
