Universal Dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are centrally used to explain how predicate–argument structures are encoded morphosyntactically in different languages, while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.
Syllabification does not seem to improve word-level RNN language modeling quality when compared to character-based segmentation. However, our best syllable-aware language model, which achieves performance comparable to the competitive character-aware model, has 18%–33% fewer parameters and is trained 1.2–2.2 times faster.
In this paper, we argue against the primary categories of non-finite verb used in the Turkology literature: “participle” (причастие ‹pričastije›) and “converb” (деепричастие ‹dejepričastije›). We argue that both of these terms conflate several discrete phenomena, and that they furthermore are not coherent as umbrella terms for these phenomena. Based on detailed study of the non-finite verb morphology and syntax of a wide range of Turkic languages (presented here are Turkish, Kazakh, Kyrgyz, Tatar, Tuvan, and Sakha), we instead propose delineation of these categories according to their morphological and syntactic properties. Specifically, we propose that more accurate categories are verbal noun, verbal adjective, verbal adverb, and infinitive. This approach has far-reaching implications for the study of syntactic phenomena in Turkic languages, ranging from relative clauses to clause chaining.