Text Clustering Algorithm Based on Random Cluster Core

Huang, Longjun; Cheng, Meng-Zhen; Xiao, Yao

doi:10.1051/itmconf/20160705001

Cited by 2 publications

(2 citation statements)

References 6 publications

(3 reference statements)

“…The main reason is that the computational overhead of clustering algorithms tends to be large. When the amount of data rises to a certain extent, most clustering algorithms cannot be used, so the time complexity of most clustering algorithms needs to be considered [132]. K-means, which belongs to the partitioning clustering algorithm, is a commonly used text clustering algorithm whose disadvantage is that it cannot effectively determine the number of clusters and select the initial clustering point, and has poor performance on high dimensional data, etc.…”

Section: Text Clustering

mentioning

confidence: 99%

A Review of Text Corpus-Based Tourism Big Data Mining

Zhang

et al. 2019

Applied Sciences

With the massive growth of the Internet, text data has become one of the main formats of tourism big data. Because text is an effective means of expressing tourists' opinions, mining such data has great potential to inspire innovations for tourism practitioners. In the past decade, a variety of text mining techniques have been proposed and applied to tourism analysis to develop tourism value analysis models, build tourism recommendation systems, create tourist profiles, and make policies for supervising tourism markets. The successes of these techniques have been further boosted by the progress of natural language processing (NLP), machine learning, and deep learning. With the understanding of the complexity due to this diverse set of techniques and tourism text data sources, this work attempts to provide a detailed and up-[...], destination image analysis, market demand, etc. Our work also provides guidelines for constructing new tourism big data applications and outlines promising research areas in this field for the coming years.

Appl. Sci. 2019, 9, 3300

[...] the travel. In order to utilize this user-generated content properly and further to meet the needs of tourists and promote the tourism industry, we need to analyze and exploit tourists' needs and opinions, and then identify the problems of tourism services or destinations, which has become a new path for tourism development. Besides, as tourism needs become increasingly personalized, visitors begin to pursue self-likeness, self-worth, and diversified travel experiences, and they are no longer willing to endure delays or waits. Recognizing and responding quickly to visitors' behaviors and needs, and identifying potential customers, have become essential factors for the success of tourism stakeholders.

By exploiting the subjective information contained in tourism text data, we can help tourism stakeholders provide better services for tourists. A large number of text mining techniques have been proposed and applied to tourism text data analysis for creating tourist profiles [8][9][10][11][12][13][14][15] and making effective market supervision [16][17][18][19][20][21][22][23][24][25]. These approaches exploit a variety of text representation strategies [26][27][28][29][30][31][32] and use different NLP techniques for topic extraction [33], text classification [34], sentiment analysis [35], and text clustering [36]. Moreover, while aiming to make computers understand human language, NLP has become the essential tool for text data analysis and is growing rapidly thanks to applications of deep learning in word embedding, syntax analysis, machine translation, and text understanding. Machine learning-based NLP techniques have been widely used in tourism text analysis, with superior results [19,25]. In addition, due to its high capability for extracting selective and invariant features from texts, and its independence from prior knowledge and linguistic resources, deep learning has been reported to achieve higher performance than other approaches on many NL...

“…In this paper, we adapt the delayed combination approach and analyze its effectiveness by comparing it to the existing, commonly used, early combination approach, inspired by Huang et al, 2016. We adapt character-level CNN or LSTM-based word encoding and recent contextualized word embedding and designed CNN-based sentence encoding using a named entity dictionary as supplementary feature encodings, in addition to the common pre-trained word embedding. We pass the pre-trained word embedding and the contextualized word embedding through the separate bidirectional LSTM blocks, respectively, and then we combine the outputs with the CNN or LSTM-based word encoding and the CNN-based sentence encoding.…”

mentioning

confidence: 99%

Natural Language Processing: Emerging Neural Approaches and Applications

2022
