2019
DOI: 10.1214/18-ejs1516

Convergence rates of latent topic models under relaxed identifiability conditions

Yining Wang

Abstract: In this paper we study the frequentist convergence rate for Latent Dirichlet Allocation (Blei et al., 2003) topic models. We show that the maximum likelihood estimator converges to one of the finitely many equivalent parameters in Wasserstein's distance metric at a rate of n^{-1/4}, without assuming separability or non-degeneracy of the underlying topics and/or the existence of more than three words per document, thus generalizing the previous works of Anandkumar et al. (2012, 2014) from an…
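For orientation, results of this kind are usually stated in terms of a Wasserstein distance between the estimated and true mixing measures over the topic simplex, in the spirit of Nguyen (2015). The LaTeX sketch below records that standard definition and the general form of the rate quoted in the abstract; the symbols (mixing measures G and G_0, topics beta_k, weights p_k, estimator G_n-hat) are our own illustrative notation, not the paper's exact statement.

% A hedged sketch in our own notation (not the paper's exact formulation).
% Mixing measures placing weight p_k on topic \beta_k in the vocabulary simplex:
\[
  G = \sum_{k=1}^{K} p_k \, \delta_{\beta_k}, \qquad
  G_0 = \sum_{k=1}^{K} p_k^0 \, \delta_{\beta_k^0}.
\]
% First-order Wasserstein distance between two such discrete measures, where
% \mathcal{Q}(p, p^0) denotes the set of couplings of the two weight vectors:
\[
  W_1(G, G_0) = \inf_{q \in \mathcal{Q}(p, p^0)} \; \sum_{k, k'} q_{k k'} \, \bigl\| \beta_k - \beta_{k'}^0 \bigr\|_1 .
\]
% The abstract's rate then has the form (up to the finitely many equivalent parameters):
\[
  W_1(\hat{G}_n, G_0) = O_P \bigl( n^{-1/4} \bigr).
\]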

Cited by 7 publications (8 citation statements); references 24 publications.

Citation statements (ordered by relevance):
“…For LDA and closely related topic models, there is a rich literature investigating identifiability under different assumptions (Anandkumar et al., 2012; Arora et al., 2012; Nguyen, 2015; Wang, 2019). Typically, when there is only one characteristic (p = 1), R ≥ 2 is necessary for identifiability; see Example 2 in Wang (2019). However, there has been limited consideration of identifiability of mixed membership models with multiple characteristics and one replication, i.e., p > 1 and R = 1.…”
Section: Strict Identifiability Conditions (mentioning, confidence: 99%)
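The necessity of R ≥ 2 when p = 1 can be seen from a one-line calculation; the sketch below uses generic notation (topic matrix B, per-document mixing weights θ drawn from a prior P, a single observed word x) that we introduce for illustration and that need not match the cited papers' statements.

% With a single replication (R = 1), the marginal law of the one observed word x is
\[
  \Pr(x = v) = \int \sum_{k=1}^{K} \theta_k B_{k v} \, \mathrm{d}P(\theta)
             = \sum_{k=1}^{K} \mathbb{E}_P[\theta_k] \, B_{k v},
\]
% so the data determine only the averaged row \sum_k E_P[\theta_k] B_{k \cdot} of the
% topic matrix; distinct pairs (P, B) sharing that average are indistinguishable,
% which is why at least two words per document (R >= 2) are needed.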
“…Therefore, identifiability can be guaranteed under very mild conditions; for example, one such condition is simply that C is of full rank. Under such Bayesian settings, posterior concentration rates have been established in Nguyen (2015) and Tang et al. (2014), and convergence rates for the maximum likelihood estimator (MLE) have been established in Anandkumar et al. (2012, 2014) and Wang (2019).…”
Section: The Bayesian Approach (mentioning, confidence: 99%)
“…(A1) is commonly imposed for technical reasons in other related work, such as Nguyen (2015) and Wang (2019), to avoid singularity issues. The geometric interpretation of the assumption in (A2) on W_c is that Conv(U_0) should contain a ball of a constant radius, which is again imposed to avoid singularity issues when a large proportion of the mixing weight vectors are too concentrated.…”
Section: Consistency and Error Analysis Under Fixed Mixing Weights (mentioning, confidence: 99%)
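To make the ball condition concrete, it can be written as follows, with the radius measured within the affine hull of the simplex; here Δ^{K-1} denotes the topic-weight simplex, Conv(U_0) the convex hull of the mixing weight vectors, and r the constant radius. This is our paraphrase of the quoted assumption, not the citing paper's exact statement.

% Hedged formalization of the quoted (A2)-style condition: the convex hull of the
% mixing weight vectors must contain a relative ball of constant radius r.
\[
  \exists\, x_0 \in \Delta^{K-1}, \; \exists\, r > 0 \ \text{constant}: \quad
  \bigl\{ x \in \operatorname{aff}(\Delta^{K-1}) : \lVert x - x_0 \rVert_2 \le r \bigr\}
  \subseteq \operatorname{Conv}(U_0),
\]
% ruling out the degenerate case where most weight vectors concentrate near a
% lower-dimensional face of the simplex.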