Contrary to common assumptions, syntactic relations, especially those of subject and object, are not universal, but are only one of several possibilities of organising relational clause structure. The three main dimensions of relational structuring are those of semantic roles, information flow, and deictic anchoring. There are three major language types depending on the extent to which these dimensions are grammaticalised: "pivotless" languages, with little or no grammaticalisation of any of these dimensions; "pure" languages, strongly grammaticalising only one of them, especially that of roles; "mixed" languages, strongly grammaticalising more than one. Genuinely syntactic relations, further differentiated in terms of their alignments (such as accusative, ergative, active, tripartite), then result from the cumulative encoding of role and flow distinctions in the mixed type.
About 130 languages are currently spoken in the USSR. These languages differ considerably in their numbers of speakers, social status, scope and viability. Our primary interest in this paper will be with those languages that are in extreme danger of extinction in the near future.
We report a study of referential choice in discourse production, understood as the choice between various types of referential devices, such as pronouns and full noun phrases. Our goal is to predict referential choice, and to explore to what extent such prediction is possible. Our approach to referential choice includes a cognitively informed theoretical component, corpus analysis, machine learning methods and experimentation with human participants. Machine learning algorithms make use of 25 factors, including the referent’s properties (such as animacy and protagonism), the distance between a referential expression and its antecedent, the antecedent’s syntactic role, and so on. Having found the predictions of our algorithm to coincide with the original referential choices in the corpus almost 90% of the time, we hypothesized that fully accurate prediction is not possible because, in many situations, more than one referential option is available. This hypothesis was supported by an experimental study, in which participants answered questions about either the original text in the corpus, or about a text modified in accordance with the algorithm’s prediction. Proportions of correct answers to these questions, as well as participants’ ratings of the questions’ difficulty, suggested that divergences between the algorithm’s prediction and the original referential device in the corpus occur overwhelmingly in situations where the referential choice is not categorical.