Proceedings of the 20th ACM International Conference on Information and Knowledge Management 2011
DOI: 10.1145/2063576.2063689

Transferring topical knowledge from auxiliary long texts for short text clustering

Abstract: With the rapid growth of social Web applications such as Twitter and online advertisements, the task of understanding short texts is becoming more and more important. Most traditional text mining techniques are designed to handle long text documents. For short text messages, many of the existing techniques are not effective due to the sparseness of text representations. To understand short messages, we observe that it is often possible to find topically related long texts, which can be utilized as the auxiliar…

Citations: cited by 173 publications (112 citation statements)
References: 16 publications (23 reference statements)
“…First, the LDA model is based on the assumption that each tourist review is assumed to be a "bag-of-words" document. LDA may not obtain a desirable performance when applied to short documents [60]. Second, selecting a globally optimal topic number to train the LDA model is difficult.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Although rarely, there also some people did the work that transfer methods or skills form the long text to the short. For example, Jin et al [11] proposed a DLDA model. It extracted two sets of topics from the source and target domains and used a binary switch variable to control the forming process of the documents.…”
Section: Transfer Learning From the Long Text Domain To The Short
Citation type: mentioning
confidence: 99%
“…Some researchers aim to aggregate a subset of short texts to form a long texts. And then topic models are applied over these long texts [7]. Other ingenious methods, such as MA-LDA [8] and MB-LDA [9], take some features into consideration and incorporate these features into LDA.…”
Section: Introduction
Citation type: mentioning
confidence: 99%