Abstract: This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated from how extensively each sentence covers the main content of the text, and summaries are created by extracting the highest-scored sentences from the original document. The model is formalized as a multiobjective integer programming problem. An advantage of this model is that it can cover the main content of the source(s) and provide less redundancy in t…
“…Various techniques like graph-based methods [6,15,16], artificial neural networks [22] and deep learning based approaches [18,20,29] have been developed for text summarization. Integer linear programming (ILP) has also shown promising results in extractive document summarization [1,9]. Duan et al [5] proposed a joint-ILP framework that produces summaries from temporally separate text documents.…”
Automatically generating a summary for asynchronous data can help users to keep up with the rapid growth of multi-modal information on the Internet. However, the current multi-modal systems usually generate summaries composed of text and images. In this paper, we propose a novel research problem of text-image-video summary generation (TIVS). We first develop a multi-modal dataset containing text documents, images and videos. We then propose a novel joint integer linear programming multi-modal summarization (JILP-MMS) framework. We report the performance of our model on the developed dataset.
“…In this paper, we defined the objective functions as a weighted linear combination and a weighted harmonic mean of the coverage and redundancy objectives. We note that these combinations, with tuning of the weighting parameters, make it possible to obtain the best summary. In Alguliev et al (2010), content coverage of each sentence is defined by its similarity to the center of the sentence collection, and in Alguliev et al (2011) it is defined by the similarity of the sentence to the whole document collection. It is known (Aliguliyev, 2010) that the similarity measure plays an important role in text summarization.…”
Section: Literature Review (mentioning)
confidence: 99%
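The two ways of combining objectives named in the statement above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the single weight `w` in [0, 1], and the choice to express both objectives as scores to be maximized (with low redundancy written as a `diversity` score) are hypothetical and are not taken from the cited papers.

```python
def linear_combination(coverage: float, diversity: float, w: float) -> float:
    """Weighted linear combination of two objective scores (assumed form)."""
    return w * coverage + (1.0 - w) * diversity


def weighted_harmonic_mean(coverage: float, diversity: float, w: float) -> float:
    """Weighted harmonic mean of two objective scores (assumed form).

    Returns 0.0 when either score is non-positive, since the harmonic
    mean is undefined there.
    """
    if coverage <= 0.0 or diversity <= 0.0:
        return 0.0
    return 1.0 / (w / coverage + (1.0 - w) / diversity)


# Hypothetical scores for one candidate summary.
lin = linear_combination(0.8, 0.4, w=0.5)
har = weighted_harmonic_mean(0.8, 0.4, w=0.5)
```

The harmonic mean penalizes a candidate whose two scores are far apart more strongly than the linear combination does, which is one plausible reason to consider both forms.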
“…Here a weighting parameter specifies the relative contributions of the cosine and the NGD‐based measures to the final information richness of sentences. In this study, the content coverage of each sentence is defined by the sum of its similarity to the other sentences in the collection. In Alguliev et al (2010), the integer linear programming problem is solved using the GNU Linear Programming Kit, a free optimization package (http://www.gnu.org/software/glpk/). In Alguliev et al (2011), the optimization problem is solved using the PSO‐LDW (PSO with linearly decreasing inertia weight) algorithm.…”
Section: Literature Review (mentioning)
confidence: 99%
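The coverage definition quoted above (the sum of a sentence's similarity to the other sentences in the collection) can be illustrated with a minimal sketch. It assumes plain cosine similarity over raw term counts with whitespace tokenization; the NGD-based measure and the weighting parameter mentioned in the statement are not modeled, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coverage_scores(sentences: list[str]) -> list[float]:
    """Score each sentence by the sum of its similarity to all other sentences."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    return [
        sum(cosine(v, u) for j, u in enumerate(vectors) if j != i)
        for i, v in enumerate(vectors)
    ]


# Toy collection: the first two sentences share vocabulary, the third does not.
scores = coverage_scores([
    "summaries select sentences",
    "sentences form summaries",
    "weather was pleasant",
])
```

In this toy collection the third sentence shares no terms with the others, so its coverage score is zero, while the first two score equally.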
In this paper, we have presented an optimization approach to document summarization. The potential of optimization-based document summarization models has not been well explored to date. This is partly due to the difficulty of formulating the criteria used for objective assessment. We modeled document summarization as linear and nonlinear optimization problems. These models generally attempt to balance coverage and diversity in the summary simultaneously. To solve the optimization problem, we developed a novel particle swarm optimization (PSO) algorithm. Experiments showed that our linear and nonlinear models produce very competitive results, which significantly outperform the NIST baselines in both years. More importantly, although the linear and nonlinear models are comparable to the top three systems S24, S15, and S12 in DUC2006, they are even superior to the best participating system in DUC2005.
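For readers unfamiliar with particle swarm optimization, the following is a minimal sketch of a textbook continuous PSO with a linearly decreasing inertia weight. It is not the novel algorithm described in the abstract: the sphere function is a stand-in objective, and the parameter values (`w_max=0.9`, `w_min=0.4`, `c1=c2=2.0`, the swarm size, and the bounds) are common defaults assumed here for illustration. A summarization model with binary sentence-selection variables would additionally need a discretization step that is not shown.

```python
import random


def inertia(t: int, max_iter: int, w_max: float = 0.9, w_min: float = 0.4) -> float:
    """Inertia weight that decreases linearly from w_max to w_min."""
    return w_max - (w_max - w_min) * t / max(1, max_iter - 1)


def pso_ldw(objective, dim, n_particles=20, max_iter=100,
            bounds=(-5.0, 5.0), c1=2.0, c2=2.0, seed=0):
    """Minimize `objective` over a box using PSO with decreasing inertia."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far

    for t in range(max_iter):
        w = inertia(t, max_iter)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Stand-in objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_ldw(sphere, dim=2)
```

A large inertia weight early on favors exploration of the search space, and the smaller weight in later iterations favors refinement around the best positions found.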