2011
DOI: 10.1007/s10994-011-5272-5

Statistical topic models for multi-label document classification

Abstract: Machine learning approaches to multi-label document classification have to date largely relied on discriminative modeling techniques such as support vector machines. A drawback of these approaches is that performance rapidly drops off as the total number of labels and the number of labels per document increase. This problem is amplified when the label frequencies exhibit the type of highly skewed distributions that are often observed in real-world datasets. In this paper we investigate a class of generative st…

Cited by 248 publications (183 citation statements)
References: 38 publications
“…One category of using topic model idea for multi-label learning is to directly replace topics in Latent Dirichlet Allocation (LDA) (Blei et al 2003) by labels, such as Labeled LDA (Ramage et al 2009) and Flat-LDA (Rubin et al 2012). Prior-LDA (Rubin et al 2012) is further proposed to account for the label frequency differences within a corpus through introducing a label sampling step by multinomial distribution.…”
Section: Generative Models For Multi-label Learning (mentioning)
confidence: 99%
“…Prior-LDA (Rubin et al 2012) is further proposed to account for the label frequency differences within a corpus through introducing a label sampling step by multinomial distribution. However, the dependency between the labels is not considered, which is resolved by the Dependency-LDA (Rubin et al 2012) later. Parametric Mixture Models (Ueda and Saito 2002) are also proposed to capture the pairwise label correlation.…”
Section: Generative Models For Multi-label Learning (mentioning)
confidence: 99%
“…Nonetheless, a topic model is not only a clustering algorithm [20]. Specially, the characteristics of LDA have been discussed for multi-label problem in [21]: (1) it transforms the word-level statistics of each document to its label-level distribution; (2) it models all labels at the same time rather than treating each label independently. In multi-label classification problem, LDA can also incorporate a label set into its learning procedure and become a supervised model.…”
Section: Introduction (mentioning)
confidence: 99%