This paper describes a method to convert existing treebanks with syntactic information into banks of meaning representations. The central component is a system of evaluation for a small formal language with respect to an information state. Inputs to the evaluation system are formal language expressions obtained from the conversion of parsed representations conforming to (Penn Treebank Project) guidelines. Outputs from the evaluation system are Davidsonian (higher-order) predicate logic meaning representations. Using a system of evaluation as the basis for generating meaning representations makes it possible to accept input with minimal conversion from existing treebanks and from the tools used to construct treebanks. Results from building corresponding banks of meaning representations from available treebanks are discussed.
Over the last few decades, corpora with comprehensive syntactic annotation, known as treebanks or parsed corpora, have been created in various formats for major languages of the world (e.g., Sampson (1995), Bies et al. (1995), Chen et al. (1999), TIGER (2003), NPCMJ (2016), etc.). As modes of accessing annotation have become more linguistically sophisticated, these corpus resources have become more relevant for linguistics in general by providing sources of insight into factors that only become visible through analysis generalized over structures: phenomena in co-occurrence, frequency, constituency, embeddability, scope, agreement, dependency, etc. These insights are spurring new research and refinements in both corpus techniques and theoretical understanding. While much research has concentrated on challenges inherent in the creation as well as correction of annotated corpora (e.g., Dickinson and Meurers (2003), Hovy and Lavid (2010), Kulick et al. (2013), etc.), with the availability of digitized data on a large scale and the production of parsed corpora as available resources, new challenges have opened up for making use of corpus-building technologies and the resulting data in subsequent research. Examples include linking corpora to external resources like lexical databases, abstracting the contents sufficiently to be of use to non-experts, exploration of crosslinguistic patterns, etc. This special issue consists of five articles focused on applying parsed corpora research in three areas: (I) enrichment and