We consider the problem of producing a multi-document summary given a collection of documents. Since most successful methods of multi-document summarization are still largely extractive, in this paper we explore just how well an extractive method can perform. We introduce an "oracle" score based on the probability distribution of unigrams in human summaries. We then demonstrate that, with the oracle score, we can generate extracts that on average score better than the human summaries when evaluated with ROUGE. In addition, we introduce an approximation to the oracle score that yields a system with the best known performance on the 2005 Document Understanding Conference (DUC) evaluation.
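The abstract defines the oracle score only as being "based on the probability distribution of unigrams in human summaries." A minimal sketch of one such score, assuming P(t) is the fraction of human summaries containing term t and a sentence scores the average of P(t) over its distinct terms; the function names and the greedy ranking helper are illustrative, not the paper's actual code:

```python
def oracle_score(sentence_tokens, human_summaries):
    """Average, over the sentence's distinct terms, of the fraction of
    human summaries that contain each term (one plausible reading of a
    unigram-based oracle score; the paper gives the exact definition)."""
    terms = set(sentence_tokens)
    if not terms:
        return 0.0
    summary_sets = [set(s) for s in human_summaries]
    def p(t):
        # P(t): fraction of human summaries containing term t.
        return sum(t in s for s in summary_sets) / len(summary_sets)
    return sum(p(t) for t in terms) / len(terms)

def rank_sentences(sentences, human_summaries):
    # Rank candidate sentences by oracle score; an extract would take
    # top scorers until a length budget is reached.
    return sorted(sentences,
                  key=lambda s: oracle_score(s, human_summaries),
                  reverse=True)
```

Since the score only requires membership counts over the human summaries, it can be computed exactly when those summaries are available (the "oracle" setting) and approximated otherwise.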
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty of evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system, the Query, Cluster, Summarize (QCS) system, which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components.
Automatic document summarization has become increasingly important due to the quantity of written material generated worldwide. Generating good-quality summaries enables users to cope with larger amounts of information. English-document summarization is a difficult task. Yet it is not sufficient. Environmental, economic, and other global issues make it imperative for English speakers to understand how other countries and cultures perceive and react to important events. CLASSY (Clustering, Linguistics, And Statistics for Summarization Yield) is an automatic, extract-generating summarization system that uses linguistic trimming and statistical methods to generate generic or topic(/query)-driven summaries for single documents or clusters of documents. CLASSY has performed well in the Document Understanding Conference (DUC) evaluations and the Multi-lingual (Arabic/English) Summarization Evaluations (MSE). We present a description of CLASSY, follow it with experiments and results from the MSE evaluations, and conclude with a discussion of ongoing work to improve the quality of the summaries, both English-only and multi-lingual, that CLASSY generates.
An update summary should provide a fluent summarization of new information on a time-evolving topic, assuming that the reader has already reviewed older documents or summaries. In 2007 and 2008, an annual summarization evaluation included an update summarization task. Several participating systems produced update summaries indistinguishable from human-generated summaries when measured using ROUGE. However, no machine system performed near human-level performance in manual evaluations such as pyramid and overall responsiveness scoring. We present a metric called Nouveau-ROUGE that improves correlation with manual evaluation metrics and can be used to predict both the pyramid score and overall responsiveness for update summaries. Nouveau-ROUGE can serve as a less expensive surrogate for manual evaluations when comparing existing systems and when developing new ones.
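The abstract does not give Nouveau-ROUGE's exact form. One plausible reading, sketched below under stated assumptions, is a linear combination of a summary's ROUGE-style overlap with the new (update) references and its overlap with previously seen material, with fitted weights; the weight values, the `unigram_recall` stand-in for ROUGE, and the function names here are all hypothetical, not the published formula:

```python
def unigram_recall(summary_tokens, reference_tokens):
    # Toy stand-in for ROUGE-1 recall: fraction of distinct reference
    # unigrams that appear in the summary.
    ref = set(reference_tokens)
    if not ref:
        return 0.0
    return len(ref & set(summary_tokens)) / len(ref)

def nouveau_rouge(summary, new_ref, old_ref, w=(0.0, 1.0, -0.5)):
    # Hypothetical linear combination: reward overlap with the update
    # references, penalize overlap with previously read material.
    # In practice such weights would be fit by regressing against
    # manual scores (e.g. pyramid, responsiveness).
    w0, w1, w2 = w
    return (w0
            + w1 * unigram_recall(summary, new_ref)
            + w2 * unigram_recall(summary, old_ref))
```

The negative weight on old-content overlap is what would distinguish such a metric from plain ROUGE: a summary that merely repeats earlier information scores high on standard ROUGE but low on novelty.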