Ontology-based Extractive Text Summarization: The Contribution of Instances

Flores, Murillo Lagranha; Santos, Elder Rizzon; Silveira, Rosemary Silva da

doi:10.13053/cys-23-3-3270

Cited by 1 publication

(1 citation statement)

References 10 publications


“…The main idea behind TF-IDF [36] is to find words with unique traits, and it can be used to make microtext lines easier to read. MMR considers the similarity between the extracted text and the entire document and between the extracted sentences and the summaries [37,38]. After calculating the similarity of each sentence to the entire text and between two sentences, the algorithm formula is iterated to rank the sentence scores of the microblog texts.…”

Section: Key Sentence Extraction (mentioning)

confidence: 99%
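
The MMR step described in the citation statement above (score each sentence by its similarity to the whole text, penalize similarity to sentences already picked, and iterate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the citing paper's implementation: the function names `bag_of_words`, `cosine`, and `mmr_rank`, the weight `lam`, and the use of raw word counts in place of TF-IDF or BERT vectors are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; an assumed stand-in for TF-IDF or embedding vectors."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rank(sentences, k=2, lam=0.7):
    """Greedily pick k sentences, trading relevance against redundancy.

    Each candidate is scored as
        lam * sim(sentence, document) - (1 - lam) * max sim(sentence, selected).
    """
    doc_vec = bag_of_words(" ".join(sentences))
    vecs = [bag_of_words(s) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], doc_vec)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]


if __name__ == "__main__":
    posts = [
        "heavy rain floods the city",
        "heavy rain floods the city",
        "rescue teams reach the metro station",
    ]
    for sentence in mmr_rank(posts, k=2, lam=0.5):
        print(sentence)
```

With `lam` close to 1 the ranking reduces to pure relevance, so near-duplicate posts can both be selected; lowering `lam` makes the redundancy penalty dominate and favors sentences that cover different content.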

Applicability Analysis and Ensemble Application of BERT with TF-IDF, TextRank, MMR, and LDA for Topic Classification Based on Flood-Related VGI

Yao et al. 2023, IJGI


Volunteered geographic information (VGI) plays an increasingly crucial role in flash floods. However, topic classification and spatiotemporal analysis are complicated by the various expressions and lengths of social media textual data. This paper conducted applicability analysis on bidirectional encoder representation from transformers (BERT) and four traditional methods, TextRank, term frequency–inverse document frequency (TF-IDF), maximal marginal relevance (MMR), and linear discriminant analysis (LDA), and the results show that for user type, BERT performs best on the Government Affairs Microblog, whereas LDA-BERT performs best on the We Media Microblog. As for text length, TF-IDF-BERT works better for texts with a length of <70 and length >140 words, and LDA-BERT performs best with a text length of 70–140 words. For the spatiotemporal evolution pattern, the study suggests that in a Henan rainstorm, the textual topics follow the general pattern of “situation-tips-rescue”. Moreover, this paper detected the hotspot of “Metro Line 5” related to a Henan rainstorm and discovered that the topical focus of the Henan rainstorm spatially shifts from Zhengzhou, first to Xinxiang, and then to Hebi, showing a remarkable tendency from south to north, which was the same as the report issued by the authorities. We integrated multi-methods to improve the overall topic classification accuracy of Sina microblogs, facilitating the spatiotemporal analysis of flooding.
