2019
DOI: 10.1109/access.2019.2893980

Hot Topic Detection Based on a Refined TF-IDF Algorithm

Abstract: In this paper, we propose a refined term frequency-inverse document frequency (TF-IDF) algorithm called TA TF-IDF to find hot terms, based on time distribution information and user attention. We also put forward a method to generate new terms and combined terms, which are split by the Chinese word segmentation algorithm. Then, we extract hot news according to the hot terms, grouping them into K-means clusters so as to realize the detection of hot topics in news. The experimental results indicated that our met…
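For orientation, the following is a minimal sketch of the standard TF-IDF weighting that the paper takes as its starting point. It is not TA TF-IDF: the abstract does not give the time-distribution or user-attention terms, so they are left out here. The function name `tf_idf` and the toy tokenized documents are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Plain TF-IDF only (hypothetical helper): tf = count / document length,
    idf = log(N / document frequency). No time or attention factors.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, already-segmented documents (assumed data for illustration).
docs = [
    ["flood", "rescue", "city"],
    ["flood", "warning", "city"],
    ["election", "city"],
]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "city" in this toy input, receives weight zero, while terms concentrated in few documents score higher. A hot-term method along the lines the abstract describes would presumably adjust these base weights before clustering the news items.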

Citations: cited by 77 publications (40 citation statements)
References: 16 publications (22 reference statements)
“…Finally, the news is clustered based on the representation and events are obtained. The method proposed in this paper solves the problems of a too large data dimension and low computational efficiency of traditional TF-IDF [43], as well as the problems of manual data annotation, which are an inability to identify new events that occurs when using the LDA method [33].…”
Section: Discussion (mentioning)
confidence: 99%
“…All items are classified into keywords that mean the attributes of items. The term frequency inverse document frequency (TF-IDF) is a popular keyword extraction method that computes a weight based on the frequency of appearance of all item attributes [23]. The TF-IDF calculates the weight matrix for the attributes of items.…”
Section: B Recommendation System (mentioning)
confidence: 99%
“…A method was proposed to produce brand nomenclatures and assorted nomenclatures, which were divided by the Chinese word segmentation algorithm. Based on the improved tf-idf algorithm, hot event problems could be found effectively [4]. The popular method of the characteristics of the processing is based on semantics.…”
Section: Feature Extraction (mentioning)
confidence: 99%