2015
DOI: 10.1007/s10618-015-0442-x
C-BiLDA: extracting cross-lingual topics from non-parallel texts by distinguishing shared from unshared content

Abstract: We study the problem of extracting cross-lingual topics from non-parallel multilingual text datasets with partially overlapping thematic content (e.g., aligned Wikipedia articles in two different languages). To this end, we develop a new bilingual probabilistic topic model called comparable bilingual latent Dirichlet allocation (C-BiLDA), which is able to deal with such comparable data, and, unlike the standard bilingual LDA model (BiLDA), does not assume the availability of document pairs with identical topic…
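To make the contrast concrete, here is a minimal generative-story sketch in Python of standard BiLDA, the assumption that C-BiLDA relaxes. All names and hyperparameter values are illustrative assumptions, not the authors' implementation: BiLDA draws a single topic distribution theta per aligned document pair, so both documents are forced to have identical topic proportions, whereas C-BiLDA lets a pair share only part of its thematic content.

import numpy as np

rng = np.random.default_rng(0)

K = 20                   # number of shared cross-lingual topics (illustrative)
V_e, V_f = 5000, 6000    # vocabulary sizes of the two languages (assumed)
alpha, beta = 0.1, 0.01  # symmetric Dirichlet hyperparameters (assumed)

# Language-specific topic-word distributions defined over one shared topic space.
phi_e = rng.dirichlet([beta] * V_e, size=K)   # shape (K, V_e)
phi_f = rng.dirichlet([beta] * V_f, size=K)   # shape (K, V_f)

def generate_bilda_pair(n_e, n_f):
    """Standard BiLDA: a single theta per aligned document pair, i.e. both
    documents in the pair get identical topic distributions."""
    theta = rng.dirichlet([alpha] * K)
    doc_e = [rng.choice(V_e, p=phi_e[z]) for z in rng.choice(K, size=n_e, p=theta)]
    doc_f = [rng.choice(V_f, p=phi_f[z]) for z in rng.choice(K, size=n_f, p=theta)]
    return doc_e, doc_f

pair = generate_bilda_pair(n_e=120, n_f=150)

C-BiLDA replaces this single shared theta with a mechanism that distinguishes shared from unshared (document-specific) content; the exact generative process is specified in the paper.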

Cited by 13 publications (10 citation statements). References 40 publications (55 reference statements).
“…In consequence, this also influences topic distributions of related words not occurring in the dictionary. Another group of models utilizes alignments at the document level (Mimno, Wallach, Naradowsky, Smith, & McCallum, 2009;Platt, Toutanova, & Yih, 2010;Vulić, De Smet, & Moens, 2011;Fukumasu, Eguchi, & Xing, 2012;Heyman, Vulić, & Moens, 2016) to induce shared topical spaces. The very same level of supervision (i.e., document alignments) is used by several cross-lingual word embedding models, surveyed in Section 8.…”
Section: A Brief History of Cross-lingual Word Representations (mentioning; confidence: 99%)
“…that are annotated on the Reuters documents, for example: when an English document and a Spanish document are both annotated with the same global label they are considered to have comparable content and are added as a document pair to the comparable corpus. We analysed the resulting dataset with multilingual probabilistic topic models: Bilingual Latent Dirichlet Allocation (BiLDA) [53] and Comparable Bilingual Latent Dirichlet Allocation (C-BiLDA) [54]. We found that, although the C-BiLDA model could uncover some interesting cross-lingual topics (clusters of related words), the dataset was not well-suited for inducing translations as the domain was too broad and the comparability across languages too low.…”
Section: Terminology Extraction from Comparable Text (mentioning; confidence: 99%)
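A minimal sketch of the label-based pairing heuristic described in this excerpt, with hypothetical field names (not the cited authors' code): English and Spanish Reuters documents are paired whenever they carry the same global topic label, and each pair is added to the comparable corpus.

from collections import defaultdict

def build_comparable_corpus(english_docs, spanish_docs):
    """Pair documents across languages that share a global label.
    Each document is assumed to be a dict: {"label": str, "text": str}."""
    spanish_by_label = defaultdict(list)
    for doc in spanish_docs:
        spanish_by_label[doc["label"]].append(doc)
    pairs = []
    for en_doc in english_docs:
        for es_doc in spanish_by_label.get(en_doc["label"], []):
            pairs.append((en_doc["text"], es_doc["text"]))
    return pairs

The excerpt's observation that the domain was too broad and cross-lingual comparability too low corresponds to pairs whose texts overlap only in very coarse themes.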
“…All three models learn bilingual word representations from subject-aligned document pairs only. Multilingual topic modeling has been shown to be a robust framework for learning bilingual representations from such non-parallel data: BiLDA has been successfully applied to BLI [56], and C-BiLDA is a more recent extension to BiLDA that learns higher-quality representations when the aligned document pairs exhibit a lower degree of parallelism [54]. BWESG is a simple but effective extension to continuous skip-gram.…”
Section: Comparison of Weakly-Supervised Word-Level BLI Models (mentioning; confidence: 99%)
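As a generic illustration of how a topic model such as BiLDA or C-BiLDA can be used for bilingual lexicon induction (not necessarily the exact scoring used in [54] or [56]): represent each word by its distribution over the shared topics and rank candidate translations by the similarity of these vectors.

import numpy as np

def word_topic_vectors(phi):
    """Turn a K x V topic-word matrix into per-word vectors P(topic | word)
    by normalizing each column over the shared topic space."""
    col_sums = np.maximum(phi.sum(axis=0, keepdims=True), 1e-12)
    return (phi / col_sums).T            # shape (V, K)

def rank_translations(word_idx, phi_src, phi_tgt, top_n=5):
    """Rank target-language words by cosine similarity of their topic vectors."""
    src = word_topic_vectors(phi_src)[word_idx]
    tgt = word_topic_vectors(phi_tgt)
    scores = tgt @ src / (np.linalg.norm(tgt, axis=1) * np.linalg.norm(src) + 1e-12)
    return np.argsort(-scores)[:top_n]

Because both languages' topic-word matrices are defined over the same shared topic space, the resulting word vectors are directly comparable across languages.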
“…Most multilingual topic models are generative admixture models in which the conditional probabilities can be factorized into different levels, thus the KL-divergence term in Theorem 3 can be decomposed and analyzed in the same way as in this section for models that have transfer at other levels, such as , Heyman et al. (2016), and Hu et al. (2014). For example, if a model has word-level transfer, i.e., the model assumes that word translations share the same distributions, we have a KL-divergence term as,…”
Section: Multilevel Transfer (mentioning; confidence: 99%)
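The quoted sentence is cut off before the equation it introduces. Purely as an illustration of the kind of term meant (not the cited paper's exact expression), a word-level transfer assumption that a word and its translation share a topic distribution would contribute a term of the form

\[
  \sum_{(w, w') \in \mathcal{T}} D_{\mathrm{KL}}\bigl( p(z \mid w) \,\big\|\, p(z \mid w') \bigr),
\]

where \(\mathcal{T}\) is the set of translation pairs and \(p(z \mid \cdot)\) denotes a word's distribution over topics.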