Transferring topical knowledge from auxiliary long texts for short text clustering

Jin, Ou; Liu, Nathan N.; Zhao, Kai; Yu, Yong; Yang, Qiang

doi:10.1145/2063576.2063689

Cited by 175 publications

(113 citation statements)

References 16 publications

(23 reference statements)

Mentioning: 110

“…First, the LDA model is based on the assumption that each tourist review is assumed to be a "bag-of-words" document. LDA may not obtain a desirable performance when applied to short documents [60]. Second, selecting a globally optimal topic number to train the LDA model is difficult.…”

Section: Discussion (mentioning)

confidence: 99%

Investigating Online Destination Images Using a Topic-Based Sentiment Analysis Approach

Ren; Hong. 2017. Sustainability.

With the development of Web 2.0, many studies have tried to analyze tourist behavior utilizing user-generated contents. The primary purpose of this study is to propose a topic-based sentiment analysis approach, including a polarity classification and an emotion classification. We use the Latent Dirichlet Allocation model to extract topics from online travel review data and analyze the sentiments and emotions for each topic with our proposed approach. The top frequent words are extracted for each topic from online reviews on Ctrip.com. By comparing the relative importance of each topic, we conclude that many tourists prefer to provide "suggestion" reviews. In particular, we propose a new approach to classify the emotions of online reviews at the topic level utilizing an emotion lexicon, focusing on specific emotions to analyze customer complaints. The results reveal that attraction "management" obtains most complaints. These findings may provide useful insights for the development of attractions and the measurement of online destination image. Our proposed method can be used to analyze reviews from many online platforms and domains.


“…Although rarely, there also some people did the work that transfer methods or skills form the long text to the short. For example, Jin et al [11] proposed a DLDA model. It extracted two sets of topics from the source and target domains and used a binary switch variable to control the forming process of the documents.…”

Section: Transfer Learning From the Long Text Domain To The Short (mentioning)

confidence: 99%

FSFP: Transfer Learning From Long Texts to the Short

Wei; Zhang; Chu et al. 2014. Appl. Math. Inf. Sci.

Abstract: Transfer learning is a method that studies how to identify the useful knowledge and skills in the previous tasks, and uses them to the new tasks or domains. At present, the research on transfer learning mostly focuses on the field of long texts. However, the source data should be given for the transportation from long texts to the short ones, and the priori probability distribution of the data should be given at the same time. In order to solve the problems, the algorithm which is called FSFP (Free Source selection Free Priori probability distribution) is proposed. It can transfer knowledge from the long texts to the short ones. Latent semantic analysis is used to extract the key words as seed characteristic sets, which are semantically related to the long texts from the target domain. And then the graph structure of online information is built. With the help of the improved Laplacian Eigenmaps, the feature representations of high-dimensional data are mapped to a low-dimensional space. Lastly, the target data are classified in the constraint of minimizing the mutual information between the instance and the feature representation. The experimental results on large data sets show the effectiveness of the new algorithm.


“…Some researchers aim to aggregate a subset of short texts to form a long texts. And then topic models are applied over these long texts [7]. Other ingenious methods, such as MA-LDA [8] and MB-LDA [9], take some features into consideration and incorporate these features into LDA.…”

Section: Introduction (mentioning)

confidence: 99%

Vector Representation of Words for Detecting Topic Trends over Short Texts

He; Du; Zhang. 2018. Proceedings of the 2018 International Conference on Mathematics, Modelling, Simulation and Algorithms (MMSA 2018).

Abstract: It is a critical task to infer discriminative and coherent topics from short texts. Furthermore, people not only want to know what kinds of topics can be extract from these short texts, but also desire to obtain the temporal dynamic evolution of these topics. In this paper, we present a novel model for short texts, referred as topic trend detection (TTD) model. Based on an optimized topic model we proposed, TTD model derives more typical terms and itemsets to represent topics of short texts and improves the coherence of topic representations. Ultimately, we extend the topic itemsets obtained from the optimized topic model by vector space representations of words to detect topic trends. Through extensive experiments on several real-world short text collections in Sina Microblog, the results show our method achieves comparable topic representations than state-of-the-art models, measured by topic coherence, and then show its application in identifying topic trends in Sina Microblog.

