Abstract: In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments desi…
“…It relies only on word probability to calculate importance [212]. For each sentence S_j in the input it assigns a weight equal to the average probability p(w_i) of the content words in the sentence, estimated from the input for summarization:…”
Section: Methods Based On Word Frequency
“…Vanderwende et al. [19] proposed the SumBasic system which uses only the word probability approach to determine sentence importance. For each sentence, S_j, in the input, it assigns a weight equal to the average probability of the words in the sentence:…”
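The sentence weighting described in the excerpts above can be sketched as follows. This is a minimal illustration, not the SumBasic implementation: it estimates p(w) as a word's count divided by the total number of content-word tokens in the input, then scores each sentence by the mean p(w) of its content words. The tokenizer, the tiny stopword list, and all function names are assumptions made for the example, and any later steps of the full system (sentence selection, probability updates) are not shown.

```python
from collections import Counter
import re

# Hypothetical stopword list, for illustration only; a real system
# would use a fuller list to decide what counts as a content word.
STOPWORDS = frozenset({"a", "an", "the", "of", "in", "on", "and", "is", "are", "to"})


def content_words(sentence):
    """Lowercase the sentence, split on non-alphanumerics, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in tokens if w not in STOPWORDS]


def word_probabilities(sentences):
    """Estimate p(w) from the input: count of w over all content-word tokens."""
    counts = Counter(w for s in sentences for w in content_words(s))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in counts.items()}


def sentence_weight(sentence, probs):
    """Average probability of the content words in one sentence."""
    words = content_words(sentence)
    if not words:
        return 0.0  # a sentence with no content words gets no weight
    return sum(probs.get(w, 0.0) for w in words) / len(words)


def score_sentences(sentences):
    """Return (sentence, weight) pairs for every input sentence."""
    probs = word_probabilities(sentences)
    return [(s, sentence_weight(s, probs)) for s in sentences]


if __name__ == "__main__":
    docs = ["The cat sat.", "The cat ran.", "Dogs bark."]
    for sentence, weight in score_sentences(docs):
        print(f"{weight:.3f}  {sentence}")
```

Averaging rather than summing keeps long sentences from scoring higher merely because they contain more words.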
Abstract: In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. Text summarization is the task of shortening a text document into a condensed version keeping all the important information and content of the original document. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.
“…The advantage of this approach is that it provides the necessary flexibility to accommodate complex interactions between relevance and redundancy that cannot be captured in a single compression. Downstream processes that have access to more information are capable of making better decisions on the choice of a final compression; this approach is also espoused by Vanderwende et al. (2006).…”
We present two approaches to email thread summarization: Collective Message Summarization (CMS) applies a multi-document summarization approach, while Individual Message Summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron collection, a very challenging corpus because of the highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre.