We propose a general approach for performing event coreference and for constructing complex event representations, such as those required for information extraction tasks. Our approach is based on a representation which allows a tight coupling between world or conceptual modelling and discourse modelling. The representation and the coreference mechanism are fully implemented within the LaSIE information extraction system, where the mechanism is used for both object (noun phrase) and event coreference resolution. Indirect evaluation of the approach shows a small but significant benefit for information extraction tasks.
We describe the use of coreference chains for the production of text summaries, using a variety of criteria to select a 'best' chain to represent the main topic of a text. The approach has been implemented within an existing MUC coreference system, which constructs a full discourse model of texts, including information about changes of focus, which can be used in the selection of chains. Some preliminary experiments on the automatic evaluation of summaries are also described, using existing tools to attempt to replicate some of the recent SUMMAC manual evaluations.
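As a rough illustration of selecting a 'best' chain, the sketch below scores each coreference chain and takes the sentences it touches as a summary. The abstract does not list the selection criteria, so the three used here (chain length, spread across the text, and whether the chain starts in the first sentence) are assumptions, as are the names `chain_score`, `best_chain`, and `summary_sentences`. Focus information, which the described system also uses, is not modelled.

```python
# Hypothetical sketch: each chain is a list of 0-based sentence indices
# in which a mention of the chain's entity occurs.

def chain_score(chain, n_sentences):
    """Score a chain with assumed criteria: length, spread, early start."""
    length = len(chain)
    # Fraction of the text spanned from first to last mention.
    spread = (max(chain) - min(chain) + 1) / n_sentences
    # Bonus if the entity is introduced in the first sentence.
    starts_early = 1.0 if min(chain) == 0 else 0.0
    return length + spread + starts_early


def best_chain(chains, n_sentences):
    """Return the name of the highest-scoring chain."""
    return max(chains, key=lambda name: chain_score(chains[name], n_sentences))


def summary_sentences(chain):
    """Sentences to extract: every sentence the chain mentions, in order."""
    return sorted(set(chain))


chains = {"company": [0, 1, 4, 7], "ceo": [2, 3]}
topic = best_chain(chains, n_sentences=8)
selected = summary_sentences(chains[topic])
```

Summing the criteria with equal weight is an arbitrary choice made for brevity; a real system would more plausibly weight or rank them.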
The paper presents a study on large-scale automatic extraction of acronyms and associated expansions from Web data and from the user interactions with this data through Web search engines. We investigate three information sources for extracting and ranking acronym-expansion pairs, as provided by a large-scale search engine: the crawled web documents, the search engine logs, and the search results. We evaluate and compare the acronym-expansion pairs generated from these sources on three dimensions: (1) the precision and recall of each source; (2) the overlap and inclusion among the acronym-expansion sets; and (3) the rank-order correlation of the ordered expansion sets. Our results show that all three data sources play an important role in building a comprehensive up-to-date collection of acronym-expansion pairs.
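The three evaluation dimensions can be sketched as follows. This is a minimal illustration, not the paper's procedure: overlap is assumed to be Jaccard similarity, inclusion is assumed to be the fraction of one set contained in another, and rank-order correlation is assumed to be Spearman's rho over the items two rankings share (without ties). All function names and the sample pairs are hypothetical.

```python
def precision_recall(extracted, gold):
    """Precision and recall of extracted pairs against a gold set."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def overlap(a, b):
    """Jaccard similarity of two pair sets (assumed overlap measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def inclusion(a, b):
    """Fraction of set a that also appears in set b."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


def spearman(rank_a, rank_b):
    """Spearman's rho over items common to two tie-free ranked lists."""
    in_b = set(rank_b)
    common_a = [x for x in rank_a if x in in_b]
    n = len(common_a)
    if n < 2:
        return None  # correlation is undefined for fewer than two items
    pos_a = {x: i for i, x in enumerate(common_a)}
    pos_b = {x: i for i, x in enumerate(x for x in rank_b if x in pos_a)}
    d2 = sum((pos_a[x] - pos_b[x]) ** 2 for x in common_a)
    return 1 - 6 * d2 / (n * (n * n - 1))


# Made-up pairs for illustration only.
gold = {("NLP", "natural language processing")}
logs = {("NLP", "natural language processing"),
        ("NLP", "neuro-linguistic programming")}
p, r = precision_recall(logs, gold)
```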
We propose an algorithm to resolve anaphors, tackling mainly the problem of intrasentential antecedents. We base our methodology on the fact that such antecedents are likely to occur in embedded sentences. Sidner's focusing mechanism is used as the basic algorithm in a more complete approach. The proposed algorithm has been tested and implemented as a part of a conceptual analyser, mainly to process pronouns. Details of an evaluation are given.
We present an approach to anaphora resolution based on a focusing algorithm, and implemented within an existing MUC (Message Understanding Conference) Information Extraction system, allowing quantitative evaluation against a substantial corpus of annotated real-world texts. Extensions to the basic focusing mechanism can be easily tested, resulting in refinements to the mechanism and resolution rules. Results show that the focusing algorithm is highly sensitive to the quality of syntactic-semantic analyses, when compared to a simpler heuristic-based approach. This work was carried out in the context of the EU AVENTINUS project (Thurmair, 1996), which aims to develop a multilingual IE system for drug enforcement, including a language-independent coreference mechanism.