René Arnulfo García-Hernández scite author profile

Abstract. Automatic text summarization helps the user quickly understand large volumes of information. We present a language- and domain-independent statistical method for single-document extractive summarization, i.e., one that produces a text summary by extracting some sentences from the given text. We show experimentally that words that are parts of bigrams that repeat more than once in the text are good terms to describe the text's contents, and so are the so-called maximal frequent sentences. We also show that using the frequency of a term as its weight gives good results (counting only the occurrences of the term in repeating bigrams).
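The term selection and weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the function names, and the sentence scoring rule (sum of term weights, top k sentences kept in document order) are assumptions made only for illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase word tokens are a good enough notion of "word".
    return re.findall(r"\w+", sentence.lower())


def repeating_bigram_term_weights(sentences):
    """Weight each word by its number of occurrences inside repeating bigrams."""
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        bigram_counts.update(zip(tokens, tokens[1:]))

    weights = Counter()
    for (first, second), count in bigram_counts.items():
        if count > 1:  # keep only bigrams that repeat more than once
            weights[first] += count
            weights[second] += count
    return weights


def summarize(sentences, k=1):
    """Hypothetical scoring: sum of term weights, keep the k best sentences."""
    weights = repeating_bigram_term_weights(sentences)
    scores = [sum(weights[t] for t in tokenize(s)) for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return [sentences[i] for i in sorted(ranked[:k])]
```

Words that never appear in a repeating bigram get weight zero, so sentences made only of such words are never preferred.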

A New Algorithm for Fast Discovery of Maximal Sequential Patterns in a Document Collection

García-Hernández

Martínez-Trinidad

Carrasco-Ochoa

2006

Word Sequence Models for Single Text Summarization

García-Hernández

Ledeneva

2009

Text Summarization by Sentence Extraction Using Unsupervised Learning

García-Hernández

Montiel

Ledeneva

et al. 2008

Abstract. The main problem in generating an extractive automatic text summary is detecting the most relevant information in the source document. Although some approaches claim to be domain and language independent, they rely on highly dependent knowledge such as key-phrases or gold-standard samples for machine-learning approaches. In this work, we propose a language- and domain-independent automatic text summarization approach by sentence extraction using an unsupervised learning algorithm. Our hypothesis is that an unsupervised algorithm can help cluster similar ideas (sentences). Then, to compose the summary, the most representative sentence is selected from each cluster. Several experiments on the standard DUC-2002 collection show that the proposed method obtains more favorable results than other approaches.
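The cluster-then-select idea can be sketched as follows. The abstract does not name the clustering algorithm or the sentence representation, so this sketch assumes a small k-means-style loop over bag-of-words vectors with cosine similarity and deterministic seeding; all function names are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(sentence):
    # Assumption: a plain bag-of-words vector represents a sentence.
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def cluster_summary(sentences, k, iterations=10):
    """Cluster sentences, then keep the sentence closest to each centroid."""
    vecs = [vectorize(s) for s in sentences]
    # Assumption: seed centroids with evenly spaced sentences (deterministic).
    centroids = [vecs[i * len(vecs) // k] for i in range(k)]
    assign = []
    for _ in range(iterations):
        # Assign every sentence to its most similar centroid.
        assign = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vecs]
        # Recompute each centroid from its current members.
        for c in range(k):
            members = [v for v, a in zip(vecs, assign) if a == c]
            if members:
                centroids[c] = centroid(members)
    picks = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            picks.append(max(members, key=lambda i: cosine(vecs[i], centroids[c])))
    return [sentences[i] for i in sorted(picks)]
```

The summary length equals the number of non-empty clusters, so k acts as the summary-size parameter in this sketch.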

Sentence features relevance for extractive text summarization using genetic algorithms

Vázquez

García-Hernández

Ledeneva

2018

IFS

A Fast Algorithm to Find All the Maximal Frequent Sequences in a Text

García-Hernández

Martínez-Trinidad

Carrasco-Ochoa

2004

Abstract. One of the sequential pattern mining problems is to find the maximal frequent sequences in a database given a support threshold β. In this paper, we propose a new algorithm to find all the maximal frequent sequences in a text instead of a database. Compared with typical sequential pattern mining algorithms, our algorithm avoids the joining, pruning, and text scanning steps. Experiments show that it is possible to obtain all the maximal frequent sequences in a few seconds for medium-sized texts.
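To make the definition concrete, here is a naive brute-force sketch that enumerates maximal frequent sequences in a word list. It is not the algorithm proposed in the paper (which the abstract says avoids the joining, pruning, and scanning steps); it assumes that a sequence means a run of consecutive words, that overlapping occurrences are counted, and that `beta` is the minimum number of occurrences.

```python
from collections import Counter


def maximal_frequent_sequences(words, beta):
    """Return contiguous word sequences occurring >= beta times that are not
    contained in any longer frequent sequence (naive level-wise search)."""
    n = len(words)
    frequent = set()
    length = 1
    while length <= n:
        counts = Counter(tuple(words[i:i + length]) for i in range(n - length + 1))
        level = {gram for gram, c in counts.items() if c >= beta}
        if not level:
            break  # no frequent sequence of this length, so none longer either
        frequent |= level
        length += 1

    def contains(big, small):
        # True when `small` occurs as a contiguous run inside `big`.
        return any(big[i:i + len(small)] == small
                   for i in range(len(big) - len(small) + 1))

    return sorted(
        s for s in frequent
        if not any(len(t) > len(s) and contains(t, s) for t in frequent)
    )
```

With `beta=1` every sequence is frequent, so the only maximal one is the whole text; raising `beta` yields shorter, more repetitive patterns.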

Graph Ranking on Maximal Frequent Sequences for Single Extractive Text Summarization

Ledeneva

García-Hernández

Gelbukh

2014

Abstract. We suggest a new method for the task of extractive text summarization using graph-based ranking algorithms. The main idea of this paper is to rank Maximal Frequent Sequences (MFS) in order to identify the most important information in a text. MFS are treated as nodes of a graph in the term selection step and are then ranked in the term weighting step using a graph-based algorithm. We show that the proposed method produces results superior to state-of-the-art methods; in addition, the best sentences were found with this method. We prove that MFS are better than other terms. Moreover, we show that the longer the MFS, the better the results. If stop-words are excluded, we lose the sense of the MFS, and the results are worse. Another important aspect of this method is that it requires neither deep linguistic knowledge nor domain- or language-specific annotated corpora, which makes it highly portable to other domains, genres, and languages.
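A graph-based ranking step of this kind can be sketched with a PageRank-style power iteration. The abstract does not say how edges between MFS nodes are defined or which ranking algorithm is used, so this sketch assumes an undirected edge between two terms that occur in the same sentence and a damping factor of 0.85; the function names are hypothetical.

```python
from itertools import combinations


def cooccurrence_edges(sentence_terms):
    """Assumption: link two terms when they appear in the same sentence."""
    edges = set()
    for terms in sentence_terms:
        edges.update(combinations(sorted(set(terms)), 2))
    return edges


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """PageRank-style power iteration on an undirected graph."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor shares its rank equally among its own neighbors.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank
```

The resulting scores could then serve as term weights when scoring sentences, with terms that connect many other terms ranked highest.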

Extractive Automatic Text Summarization Based on Lexical-Semantic Keywords

et al. 2020

The automatic text summarization (ATS) task consists of automatically synthesizing a document to provide a condensed version of it. Creating a summary requires not only selecting the main topics of the sentences but also identifying the key relationships between these topics. Related works rank text units (mainly sentences) to select those that could form the summary. However, the resulting summaries may not include all the topics covered in the source text because important information may have been discarded. In addition, the semantic structure of documents has barely been explored in this field. Thus, this study proposes a new method for the ATS task that takes advantage of semantic information to improve keyword detection. The proposed method increases coverage by clustering the sentences to identify the main topics in the source document, and increases precision by detecting the keywords in the clusters. The experimental results indicate that the proposed method outperformed previous methods on a standard collection.
