Text summarization is a process of extracting salient information from a source text and presenting that information to the user in a condensed form while preserving its main content. In the text summarization, most of the difficult problems are providing wide topic coverage and diversity in a summary. Research based on clustering, optimization, and evolutionary algorithm for text summarization has recently shown good results, making this a promising area. In this paper, for a text summarization, a two‐stage sentences selection model based on clustering and optimization techniques, called COSUM, is proposed. At the first stage, to discover all topics in a text, the sentences set is clustered by using k‐means method. At the second stage, for selection of salient sentences from clusters, an optimization model is proposed. This model optimizes an objective function that expressed as a harmonic mean of the objective functions enforcing the coverage and diversity of the selected sentences in the summary. To provide readability of a summary, this model also controls length of sentences selected in the candidate summary. For solving the optimization problem, an adaptive differential evolution algorithm with novel mutation strategy is developed. The method COSUM was compared with the 14 state‐of‐the‐art methods: DPSO‐EDASum; LexRank; CollabSum; UnifiedRank; 0–1 non‐linear; query, cluster, summarize; support vector machine; fuzzy evolutionary optimization model; conditional random fields; MA‐SingleDocSum; NetSum; manifold ranking; ESDS‐GHS‐GLO; and differential evolution, using ROUGE tool kit on the DUC2001 and DUC2002 data sets. Experimental results demonstrated that COSUM outperforms the state‐of‐the‐art methods in terms of ROUGE‐1 and ROUGE‐2 measures.
BackgroundSummarization is a process to select important information from a source text. Summarizing strategies are the core cognitive processes in summarization activity. Since summarization can be important as a tool to improve comprehension, it has attracted interest of teachers for teaching summary writing through direct instruction. To do this, they need to review and assess the students' summaries and these tasks are very time-consuming. Thus, a computer-assisted assessment can be used to help teachers to conduct this task more effectively.Design/ResultsThis paper aims to propose an algorithm based on the combination of semantic relations between words and their syntactic composition to identify summarizing strategies employed by students in summary writing. An innovative aspect of our algorithm lies in its ability to identify summarizing strategies at the syntactic and semantic levels. The efficiency of the algorithm is measured in terms of Precision, Recall and F-measure. We then implemented the algorithm for the automated summarization assessment system that can be used to identify the summarizing strategies used by students in summary writing.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.