Grammatical agreement means that features associated with one linguistic unit (for example number or gender) become associated with another unit and then possibly overtly expressed, typically with morphological markers. It is one of the key mechanisms used in many languages to show that certain linguistic units within an utterance grammatically depend on each other. Agreement systems are puzzling because they can be highly complex in terms of what features they use and how they are expressed. Moreover, agreement systems have undergone considerable change in the historical evolution of languages. This article presents language game models with populations of agents in order to find out for what reasons and by what cultural processes and cognitive strategies agreement systems arise. It demonstrates that agreement systems are motivated by the need to minimize combinatorial search and semantic ambiguity, and it shows, for the first time, that once a population of agents adopts a strategy to invent, acquire and coordinate meaningful markers through social learning, linguistic self-organization leads to the spontaneous emergence and cultural transmission of an agreement system. The article also demonstrates how attested grammaticalization phenomena, such as phonetic reduction and conventionalized use of agreement markers, happen as a side effect of additional economizing principles, in particular minimization of articulatory effort and reduction of the marker inventory. More generally, the article illustrates a novel approach for studying how key features of human languages might emerge.
Social media house a trove of relevant information for the study of online opinion dynamics. However, harvesting and analyzing the sheer overload of data that is produced by these media poses immense challenges to journalists, researchers, activists, policy makers, and concerned citizens. To mitigate this situation, this article discusses the creation of (social) media observatories: platforms that enable users to capture the complexities of social behavior, in particular the alignment and misalignment of opinions, through computational analyses of digital media data. The article positions the concept of “observatories” for social media monitoring among ongoing methodological developments in the computational social sciences and humanities and proceeds to discuss the technological innovations and design choices behind social media observatories currently under development for the study of opinions related to cultural and societal issues in European spaces. Notable attention is devoted to the construction of Penelope: an open, web-services-based infrastructure that allows different user groups to consult and contribute digital tools and observatories that suit their analytical needs. The potential and the limitations of this approach are discussed on the basis of a climate change opinion observatory that implements text analysis tools to study opinion dynamics concerning themes such as global warming. Throughout, the article explicitly acknowledges and addresses potential risks of the machine-guided and human-incentivized study of opinion dynamics. Concluding remarks are devoted to a synthesis of the ethical and epistemological implications of the exercise of positioning observatories in contemporary information spaces and to an examination of future pathways for the development of social media observatories.
This paper introduces a novel methodology for extracting semantic frames from text corpora. Building on recent advances in computational construction grammar, the method captures expert knowledge of how semantic frames can be expressed in the form of conventionalised form-meaning pairings, called constructions. By combining these constructions in a semantic parsing process, the frame-semantic structure of a sentence is retrieved through the intermediary of its morpho-syntactic structure. The main advantage of this approach is that state-of-the-art results are achieved, without the need for annotated training data. We demonstrate the method in a case study where causation frames are extracted from English newspaper articles, and compare it to a commonly used approach based on Conditional Random Fields (CRFs). The computational construction grammar approach yields a word-level F1 score of 78.5%, outperforming the CRF approach by 4.5 percentage points.
In order to be able to answer a natural language question, a computational system needs three main capabilities. First, the system needs to be able to analyze the question into a structured query, revealing its component parts and how these are combined. Second, it needs to have access to relevant knowledge sources, such as databases, texts or images. Third, it needs to be able to execute the query on these knowledge sources. This paper focuses on the first capability, presenting a novel approach to semantically parsing questions expressed in natural language. The method makes use of a computational construction grammar model for mapping questions onto their executable semantic representations. We demonstrate and evaluate the methodology on the CLEVR visual question answering benchmark task. Our system achieves 100% accuracy, effectively solving the language understanding part of the benchmark task. Additionally, we demonstrate how this solution can be embedded in a full visual question answering system, in which a question is answered by executing its semantic representation on an image. The main advantages of the approach include (i) its transparent and interpretable properties, (ii) its extensibility, and (iii) the fact that the method does not rely on any annotated training data.
Computational construction grammar aims to provide concrete processing models that operationalise construction grammar accounts of the different aspects of language. This paper discusses the computational mechanisms that allow construction grammar models to exhibit, to a certain extent, the creativity and inventiveness that are observed in human language use. It addresses two main types of language-related creativity. The first type concerns the ‘free combination of constructions,’ which gives rise to the open-endedness of language. The second type concerns the ‘appropriate violation of usual constraints’ that permits language users to go beyond what is possible when adhering to the usual constraints of the language, and be truly creative by relaxing these constraints and by introducing novel constructions. All mechanisms and examples discussed in this paper are fully operationalised and implemented in Fluid Construction Grammar.
This paper discusses lexicon word learning in high-dimensional meaning spaces from the viewpoint of referential uncertainty. We investigate various state-of-the-art Machine Learning algorithms and discuss the impact of scaling, representation and meaning space structure. We demonstrate that current Machine Learning techniques successfully deal with high-dimensional meaning spaces. In particular, we show that exponentially increasing dimensions linearly impact learner performance and that referential uncertainty from word sensitivity has no impact.
Grammatical agreement means that two linguistic units share certain syntactic or semantic features such as gender, number or person. Agreement has a variety of grammatical functions. One of them, called internal agreement, is to signal which words are grouped together as part of the same phrase. This chapter explores how a population might self-organize such an agreement system. We argue that this happens when speakers attempt to reduce processing effort and avoid ambiguities.