Boxer is an open-domain software component for semantic analysis of text, based on Combinatory Categorial Grammar (CCG) and Discourse Representation Theory (DRT). Used together with the C&C tools, Boxer reaches more than 95% coverage on newswire texts. The semantic representations produced by Boxer, known as Discourse Representation Structures (DRSs), incorporate a neo-Davidsonian representations for events, using the VerbNet inventory of thematic roles. The resulting DRSs can be translated to ordinary first-order logic formulas and be processing by standard theorem provers for first-order logic. Boxer's performance on the shared task for comparing semantic represtations was promising. It was able to produce complete DRSs for all seven texts. Manually inspecting the output revealed that: (a) the computed predicate argument structure was generally of high quality, in particular dealing with hard constructions involving control or coordination; (b) discourse structure triggered by conditionals, negation or discourse adverbs was overall correctly computed; (c) some measure and time expressions are correctly analysed, others aren't; (d) several shallow analyses are given for lexical phrases that require deep analysis; (e) bridging references and pronouns are not resolved in most cases. Boxer is distributed with the C&C tools and freely available for research purposes.
We use logical inference techniques for recognising textual entailment. As the performance of theorem proving turns out to be highly dependent on not readily available background knowledge, we incorporate model building, a technique borrowed from automated reasoning, and show that it is a useful robust method to approximate entailment. Finally, we use machine learning to combine these deep semantic analysis techniques with simple shallow word overlap; the resulting hybrid model achieves high accuracy on the RTE testset, given the state of the art. Our results also show that the different techniques that we employ perform very differently on some of the subsets of the RTE corpus and as a result, it is useful to use the nature of the dataset as a feature.
This paper shows how to construct semantic representations from the derivations produced by a wide-coverage CCG parser. Unlike the dependency structures returned by the parser itself, these can be used directly for semantic interpretation. We demonstrate that well-formed semantic representations can be produced for over 97% of the sentences in unseen WSJ text. We believe this is a major step towards widecoverage semantic interpretation, one of the key objectives of the field of NLP.
What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? In this talk I will argue that (i) a bootstrapping approach comprising state-of-the-art NLP tools for semantic parsing, in combination with (ii) a wiki-like interface for collaborative annotation of experts, and (iii) a game with a purpose for crowdsourcing, are the starting ingredients for fulfilling this enterprise. The result, known as the Groningen Meaning Bank, is a semantic resource that anyone can edit and that integrates various semantic phenomena, including predicate-argument structure, scope, tense, thematic roles, animacy, pronouns, and rhetorical relations. A single semantic formalism, Discourse Representation Theory, embraces all these phenonema by taking meaning representations of texts rather than sentences as the units of annotation.
Shared Task 1 of SemEval-2014 comprised two subtasks on the same dataset of sentence pairs: recognizing textual entailment and determining textual similarity. We used an existing system based on formal semantics and logical inference to participate in the first subtask, reaching an accuracy of 82%, ranking in the top 5 of more than twenty participating systems. For determining semantic similarity we took a supervised approach using a variety of features, the majority of which was produced by our system for recognizing textual entailment. In this subtask our system achieved a mean squared error of 0.322, the best of all participating systems.
Neural methods have had several recent successes in semantic parsing, though they have yet to face the challenge of producing meaning representations based on formal semantics. We present a sequenceto-sequence neural semantic parser that is able to produce Discourse Representation Structures (DRSs) for English sentences with high accuracy, outperforming traditional DRS parsers. To facilitate the learning of the output, we represent DRSs as a sequence of flat clauses and introduce a method to verify that produced DRSs are well-formed and interpretable. We compare models using characters and words as input and see (somewhat surprisingly) that the former performs better than the latter. We show that eliminating variable names from the output using De Bruijn-indices increases parser performance. Adding silver training data boosts performance even further.
While knowledge about symptom presentation of adults with mild ASD, including comorbid psychopathology, is limited, referral of adults with suspected mild PDD is increasing. We report on pilot research investigating whether patients diagnosed with mild ASD (n = 15) and patients who were not diagnosed with ASD (n = 21) differed in terms of (a) AQ scores and (b) Axis I and II disorders, assessed by the SCAN and the IPDE. Additionally, AQ scores were compared with those from non-ASD patients referred to a general outpatient clinic (n = 369). The results showed very few differences between ASD patients and non-ASD patients. Self-report may not differentiate mild ASD patients from non-ASD patients and Axis I and II disorders seem equally prevalent among these two groups.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.