A Novel Similarity Calculation Method Based on Chinese Sentence Keyword Weight

Yu, Yang-Xin; Wang, Liuyang

doi:10.4304/jsw.9.5.1151-1156

Cited by 2 publications

(1 citation statement)

References 15 publications

“…Before clustering, it needs to use the text similarity calculation method to establish a similarity matrix, and then use the appropriate clustering algorithm to cluster the clusters. Therefore, a good similarity calculation method can greatly improve the efficiency of clustering [8].…”

Section: Similarity Calculation In Text Clustering (mentioning)

confidence: 99%

Application and analysis of text similarity in text clustering in the Chinese context

Fan

2023

ACE

With the development of the Internet, information is shared more widely, and the amount of information each user is exposed to keeps increasing. How to find the information people want among so much information is an important question. The vast majority of these resources are textual. The most intuitive manifestation of the problem is in search engines: when a user enters a piece of text to find relevant websites, a poor algorithm yields very unsatisfactory results. Therefore, this paper studies the application of text similarity in text clustering in the Chinese context. First, the basic concept of text similarity is introduced, and text clustering is explained from three aspects: definition, application, and general processing flow. Second, drawing on existing data, some mainstream clustering algorithms are summarized. Then, building on the above, the similarity calculation methods used in text clustering are analyzed. Finally, these methods are compared and analyzed according to experimental results in a Python environment.
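The two-step pipeline described in the citation statement and the abstract (build a similarity matrix first, then cluster on it) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the method of either paper: character bigrams stand in for a Chinese word segmenter, plain term-frequency cosine similarity stands in for the keyword-weighted measure, and a simple single-link threshold grouping stands in for whichever clustering algorithm is chosen. The function names and the threshold of 0.3 are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_bigrams(text):
    """Count overlapping character bigrams (assumed stand-in for word segmentation)."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return Counter(chars)
    return Counter(a + b for a, b in zip(chars, chars[1:]))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def similarity_matrix(texts):
    """Step 1: pairwise similarity matrix over all texts."""
    vecs = [char_bigrams(t) for t in texts]
    n = len(vecs)
    return [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]


def threshold_clusters(sim, threshold=0.3):
    """Step 2: single-link grouping; texts linked by similarity >= threshold share a cluster."""
    n = len(sim)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical example sentences, not taken from either paper.
texts = ["我喜欢看电影", "我喜欢看电视", "今天天气很好", "今天天气不错"]
sim = similarity_matrix(texts)
clusters = threshold_clusters(sim, threshold=0.3)
print(clusters)  # [[0, 1], [2, 3]]
```

Because the clustering step only ever reads the matrix, swapping in a better similarity measure changes the grouping without touching the clustering code, which is the dependency the citation statement points to.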
