The Semantic Link Network is a general semantic model for modeling the structure and the evolution of complex systems. Various semantic links play different roles in rendering the semantics of a complex system. One of the basic semantic links represents the cause-effect relation, which plays an important role in representation and understanding. This paper verifies the role of the Semantic Link Network in representing the core of text by investigating the contribution of the cause-effect link to representing the core of scientific papers. The research proceeds in the following steps: (1) Two propositions on the contribution of the cause-effect link to rendering the core of a paper are proposed and verified through a statistical survey, which shows that the sentences carrying cause-effect links cover about 65% of the key words within each paper on average. (2) An algorithm based on syntactic patterns is designed for automatically extracting cause-effect links from scientific papers; it recalls about 70% of manually annotated cause-effect links on average, indicating that the result adapts to the scale of data sets. (3) The effects of four schemes of incorporating the cause-effect link into existing instances of the Semantic Link Network on the summarization of scientific papers are investigated. The experiments show that the quality of the summaries is significantly improved, which verifies the role of semantic links. The significance of this research lies in two aspects: (1) it verifies that the Semantic Link Network connects the important concepts to render the core of text; and (2) it provides evidence for realizing content services such as summarization, recommendation and question answering based on the Semantic Link Network, and it can inspire relevant research on content computing.
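Steps (1) and (2) above can be illustrated with a minimal sketch. The abstract does not list the syntactic patterns the paper uses, so the cue phrases below (`because`, `leads to`, `therefore`, and similar), the function names `extract_cause_effect` and `keyword_coverage`, and the substring-based coverage measure are all assumptions made for illustration, not the authors' algorithm.

```python
import re

# Hypothetical cue patterns (assumption): the paper's real syntactic
# patterns are not given in the abstract.
PATTERNS = [
    # "<effect> because <cause>"
    re.compile(r"^(?P<effect>.+?),?\s+because\s+(?P<cause>.+?)\.?$", re.IGNORECASE),
    # "<cause> leads to / results in / causes <effect>"
    re.compile(r"^(?P<cause>.+?)\s+(?:leads to|results in|causes)\s+(?P<effect>.+?)\.?$", re.IGNORECASE),
    # "<cause>, therefore / thus / hence <effect>"
    re.compile(r"^(?P<cause>.+?),?\s+(?:therefore|thus|hence),?\s+(?P<effect>.+?)\.?$", re.IGNORECASE),
]


def extract_cause_effect(sentence):
    """Return a (cause, effect) pair if a cue pattern matches, else None."""
    text = sentence.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("cause").strip(), match.group("effect").strip()
    return None


def keyword_coverage(sentences, keywords):
    """Fraction of keywords that occur in sentences carrying a cause-effect link."""
    if not keywords:
        return 0.0
    linked_text = " ".join(s.lower() for s in sentences if extract_cause_effect(s))
    hits = sum(1 for k in keywords if k.lower() in linked_text)
    return hits / len(keywords)
```

Applied to a paper's sentences and its key words, `keyword_coverage` gives a rough analogue of the coverage statistic reported in step (1); a realistic extractor would work on parsed syntax rather than raw regular expressions.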
Most existing methods for extractive text summarization aim to extract important sentences with statistical or linguistic techniques and concatenate these sentences into a summary. However, the extracted sentences are usually incoherent. The problem becomes worse when the source text and the summary are long and based on logical reasoning. The motivation of this paper is to answer the following two related questions: What is the best language unit for constructing a summary that is coherent and understandable? How should the extractive summarization process be built on that language unit? Extracting larger language units such as a group of sentences or a paragraph is a natural way to improve the readability of a summary, as it is rational to assume that the original sentences within a larger language unit are coherent. This paper proposes a framework for group-based text summarization that clusters semantically related sentences into groups based on the Semantic Link Network (SLN) and then ranks the groups and concatenates the top-ranked ones into a summary. A two-layer SLN model is used to generate and rank groups with semantic links including the is-part-of link, sequential link, similar-to link, and cause-effect link. The experimental results show that summaries composed of groups or paragraphs tend to contain more key words or phrases than summaries composed of sentences, and that summaries composed of groups contain more key words or phrases than those composed of paragraphs, especially when the average length of the source texts is from 7,000 to 17,000 words, which is the usual length of scientific papers. Further, we compare seven clustering algorithms for generating groups and propose five strategies for generating groups with the four types of semantic links.
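The group-then-rank pipeline described above can be sketched as follows. This is a simplified stand-in under stated assumptions, not the paper's two-layer SLN model: adjacent sentences are merged when their word overlap (Jaccard similarity) passes a hypothetical `threshold`, which loosely imitates the sequential and similar-to links, and groups are ranked by summed word frequency. The names `group_sentences` and `summarize` are illustrative.

```python
import re
from collections import Counter


def tokens(sentence):
    """Lowercased set of alphabetic word tokens in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_sentences(sentences, threshold=0.2):
    """Merge each sentence into the previous group if it overlaps enough
    with that group's last sentence (assumed proxy for semantic links)."""
    groups = []
    for sentence in sentences:
        if groups and jaccard(tokens(groups[-1][-1]), tokens(sentence)) >= threshold:
            groups[-1].append(sentence)
        else:
            groups.append([sentence])
    return groups


def summarize(sentences, max_groups=1, threshold=0.2):
    """Rank groups by summed word frequency and join the top ones in source order."""
    groups = group_sentences(sentences, threshold)
    # Number of sentences each word appears in, across the whole text.
    freq = Counter(w for s in sentences for w in tokens(s))

    def score(group):
        words = set().union(*(tokens(s) for s in group))
        return sum(freq[w] for w in words)

    top = sorted(range(len(groups)), key=lambda i: score(groups[i]), reverse=True)[:max_groups]
    return " ".join(s for i in sorted(top) for s in groups[i])
```

Because whole groups are emitted in their original order, the sentences inside each selected unit stay contiguous, which is the coherence argument the abstract makes for larger language units. The frequency score favors larger groups; a length-normalized score would be a reasonable alternative.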