Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-srw.17
Data Augmentation with Unsupervised Machine Translation Improves the Structural Similarity of Cross-lingual Word Embeddings

Abstract: Unsupervised cross-lingual word embedding (CLWE) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora. This method relies on the assumption that the two embedding spaces are structurally similar, which does not necessarily hold true in general. In this paper, we argue that using a pseudo-parallel corpus generated by an unsupervised machine translation model facilitates the structural similarity of the two embedding spaces a…
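The linear mapping step the abstract describes can be illustrated with a toy orthogonal-Procrustes fit between two embedding spaces. This is a minimal sketch, not the paper's implementation; the data here are synthetic, and structural similarity holds by construction (the second space is an exact rotation of the first):

```python
import numpy as np

# Toy CLWE mapping: recover an orthogonal map W between two embedding
# spaces from anchor pairs, via orthogonal Procrustes.
rng = np.random.default_rng(0)
d, n = 5, 100
W_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # ground-truth rotation

X = rng.normal(size=(n, d))   # "source" embeddings (rows = words)
Y = X @ W_true.T              # "target" space: structurally similar by construction

# Orthogonal Procrustes solution: minimize ||X W - Y||_F over orthogonal W.
# W = U V^T, where U S V^T is the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

assert np.allclose(X @ W, Y, atol=1e-6)  # mapping recovers the rotation
```

When the two spaces are not exact rotations of one another (the realistic, noisy case the paper targets), the same SVD step still gives the best orthogonal fit, but its quality degrades with the structural dissimilarity of the spaces.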

Cited by 2 publications (3 citation statements); references 23 publications (15 reference statements).
“…Paraphrasing
- Thesauruses: Zhang et al. [5], Wei et al. [6], Coulombe et al. [7]
- Semantic Embeddings: Wang et al. [8]
- MLMs: Jiao et al. [9]
- Rules: Coulombe et al. [7], Regina et al. [10], Louvan et al. [11]
- Machine Translation
  - Back-translation: Xie et al. [12], Zhang et al. [13]
  - Unidirectional Translation: Nishikawa et al. [14], Bornea et al. [15]
- Model Generation: Hou et al. [16], Li et al. [17], Liu et al. [18]
Noising
- Swapping: Wei et al. [6], Luque et al. [19], Yan et al. [20]
- Deletion: Wei et al. [6], Peng et al. [21], Yu et al. [22]
- Insertion: Wei et al. [6], Peng et al. [21], Yan et al. [20]
- Substitution: Coulombe et al. [7], Xie et al. [23], Louvan et al. [11]
- Mixup: Guo et al. [24], Cheng et al. [25]
Sampling
- Rules: Min et al. [26], Liu et al. [27]
- Seq2Seq Models: Kang et al. [28], Zhang et al. [13], Raille et al. [29]
- Language Models…”
Section: DA for NLP
confidence: 99%
“…In the task of unsupervised cross-lingual word embeddings (CLWEs), Nishikawa et al. [14] build a pseudo-parallel corpus with an unsupervised machine translation model. The authors first train unsupervised machine translation (UMT) models on the source/target training corpora and then translate those corpora with the UMT models.…”
Section: Machine Translation
confidence: 99%
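The pipeline described in that citation statement can be sketched as follows. This is a hedged illustration only: a toy word-substitution table stands in for the trained UMT model, and `translate` and `build_pseudo_parallel` are hypothetical names, not functions from the paper's code.

```python
# Sketch: build a pseudo-parallel corpus by translating a monolingual
# corpus with a (stand-in) unsupervised MT model.
TOY_LEXICON = {"the": "le", "cat": "chat", "sleeps": "dort"}

def translate(sentence: str) -> str:
    # Stand-in for a trained UMT model's translation step; a real
    # pipeline would call the UMT model here instead.
    return " ".join(TOY_LEXICON.get(w, w) for w in sentence.split())

def build_pseudo_parallel(monolingual_corpus: list[str]) -> list[tuple[str, str]]:
    # Pair each source sentence with its pseudo-translation; the pairs
    # then serve as augmented training data for the embedding spaces.
    return [(s, translate(s)) for s in monolingual_corpus]

pairs = build_pseudo_parallel(["the cat sleeps"])
print(pairs)  # [('the cat sleeps', 'le chat dort')]
```

Training both embedding spaces on text that shares these pseudo-translations is what, per the paper's argument, pushes the two spaces toward the structural similarity that the linear-mapping assumption requires.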