A. Seza Doğruöz scite author profile

Computational Sociolinguistics: A Survey

Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of 'Computational Sociolinguistics' that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.

Innovative constructions in Dutch Turkish: An assessment of ongoing contact-induced change

Doğruöz, Backus (2009). Bilingualism

Turkish as spoken in the Netherlands (NL-Turkish) sounds “different” (unconventional) to Turkish speakers in Turkey (TR-Turkish). We claim that this is due to structural contact-induced change that is, however, located within specific lexically complex units copied from Dutch. This article investigates structural change in NL-Turkish through analyses of spoken corpora collected in the bilingual Turkish community in the Netherlands and in a monolingual community in Turkey. The analyses reveal that at the current stage of contact, NL-Turkish is not copying Dutch syntax as such, but rather translates lexically complex individual units into Turkish. Perceived semantic equivalence between Dutch units and their Turkish equivalents plays a crucial role in this translation process. Counter to expectations, the TR-Turkish data also contained unconventional units, though they differed in type, and were much less frequent than those in NL-Turkish. We conclude that synchronic variation in individual NL-Turkish units can contain the seeds of future syntactic change, which will only be visible after an increase in the type and token frequency of the changing units.

Predicting Code-switching in Multilingual Communication for Immigrant Communities

Papalexakis, Nguyen, Doğruöz (2014)

Immigrant communities host multilingual speakers who switch across languages and cultures in their daily communication practices. Although there are in-depth linguistic descriptions of code-switching across different multilingual communication settings, there is a need for automatic prediction of code-switching in large datasets. We use emoticons and multi-word expressions as novel features to predict code-switching in a large online discussion forum for the Turkish-Dutch immigrant community in the Netherlands. Our results indicate that multi-word expressions are powerful features to predict code-switching.

Postverbal elements in immigrant Turkish: Evidence of change?

Doğruöz, Backus (2007). International Journal of Bilingualism

Contact between languages usually leads to linguistic changes. Both social and structural factors are claimed to influence this process. This study analyzes word order in Turkish as spoken in the Netherlands (NL-Turkish). Turkish is an OV language but also allows other word order patterns (including VO) in certain pragmatic contexts. Dutch, on the other hand, is VO in main clauses. Due to contact, Turkish may be expected to increase its use of VO. From a comparison with Turkish as spoken in Turkey (TR-Turkish), it appeared that there is no increase of VO in NL-Turkish. However, we did find some deviations in the information structure characteristics of VO structures and sometimes these seem to be due to Dutch influence. On the other hand, TR-Turkish data also contained certain types of VO structures that further caution against hasty contact conclusions. We conclude that contact situations need to be intense for sweeping syntactic change to occur, and that such change starts with changes in individual semilexical constructions.

Salient stages in contact-induced grammatical change: Evidence from synchronic vs. diachronic contact situations

2011

Resources for Turkish natural language processing: A critical survey

Çöltekin, Doğruöz, Çetinoğlu (2022). Lang Resources & Evaluation

This paper presents a comprehensive survey of corpora and lexical resources available for Turkish. We review a broad range of resources, focusing on the ones that are publicly available. In addition to providing information about the available linguistic resources, we present a set of recommendations, and identify gaps in the data available for conducting research and building applications in Turkish Linguistics and Natural Language Processing.

Modeling the Use of Graffiti Style Features to Signal Social Relations within a Multi-Domain Learning Paradigm

Piergallini, Doğruöz, Gadde, et al. (2014)

In this paper, we present a series of experiments in which we analyze the usage of graffiti style features for signaling personal gang identification in a large, online street gangs forum, with an accuracy as high as 83% at the gang alliance level and 72% for the specific gang. We then build on that result in predicting how members of different gangs signal the relationship between their gangs within threads where they are interacting with one another, with a predictive accuracy as high as 66% at this thread composition prediction task. Our work demonstrates how graffiti style features signal social identity both in terms of personal group affiliation and between group alliances and oppositions. When we predict thread composition by modeling identity and relationship simultaneously using a multi-domain learning framework paired with a rich feature representation, we achieve significantly higher predictive accuracy than state-of-the-art baselines using one or the other in isolation.

Spread of on-going changes in an immigrant language

Doğruöz, Gries (2014)
