Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.250

Do Multilingual Neural Machine Translation Models Contain Language Pair Specific Attention Heads?

Abstract: Recent studies on the analysis of multilingual representations focus on identifying whether there is an emergence of language-independent representations, or whether a multilingual model partitions its weights among different languages. While most such work has been conducted in a "black-box" manner, this paper aims to analyze individual components of a multilingual neural machine translation (NMT) model. In particular, we look at the encoder self-attention and encoder-decoder attention heads (in a many-to-one N…
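The abstract describes probing individual encoder self-attention and encoder-decoder attention heads of a many-to-one NMT model. As a minimal, hedged sketch of what such head-level inspection can look like (not the authors' code), the snippet below pulls per-head attention weights from a pretrained Marian many-to-one checkpoint; the checkpoint choice and the per-head entropy statistic are illustrative assumptions.

```python
# A minimal sketch, not the paper's method: extract per-head attention
# weights from a pretrained many-to-one NMT model. The checkpoint
# Helsinki-NLP/opus-mt-mul-en (many source languages -> English) is an
# assumed stand-in for the many-to-one setting the abstract describes.
import torch
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-mul-en"
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name, output_attentions=True)
model.eval()

batch = tokenizer(["To jest test."], return_tensors="pt")  # Polish source
start = torch.tensor([[model.config.decoder_start_token_id]])
with torch.no_grad():
    out = model(**batch, decoder_input_ids=start)

# out.encoder_attentions: one (batch, heads, src_len, src_len) tensor per layer
# out.cross_attentions:   one (batch, heads, tgt_len, src_len) tensor per layer
for layer, enc_att in enumerate(out.encoder_attentions):
    # Mean attention entropy per head: one crude per-head statistic that
    # could be compared across source languages or language pairs.
    probs = enc_att.clamp_min(1e-9)
    entropy = -(probs * probs.log()).sum(-1).mean(dim=(0, 2))  # (heads,)
    print(f"layer {layer}: head entropies {entropy.tolist()}")
```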

Cited by 2 publications (2 citation statements). References 36 publications (43 reference statements).
“…Additionally, Shao and Nivre (2016) demonstrated the effectiveness of convolutional neural networks (CNNs) in English-to-Chinese transliteration, surpassing traditional PBSMT methods in terms of accuracy. Moreover, Jankowski et al. (2021) presented a paper focusing on multilingual NMT models tailored for Slavic languages, including Polish and Slovak. The paper introduced soft decoding, a technique that allows the NMT model to generate multiple translations simultaneously.…”
Section: Literature Review
Citation type: mentioning
Confidence: 99%
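The statement above credits Jankowski et al. (2021) with "soft decoding" for producing multiple translations at once, but gives no mechanism. As a rough stand-in only, the sketch below uses ordinary beam-search n-best lists, one standard way to obtain several candidate translations from a single input; the Polish-to-English checkpoint and decoding parameters are assumptions, not details from the cited paper.

```python
# A hedged illustration only: the citation statement does not describe the
# soft-decoding mechanism, so this sketch substitutes beam-search n-best
# lists to show one standard way an NMT model can emit several candidate
# translations for the same input. The checkpoint choice is an assumption.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-pl-en"  # Polish -> English, per the Slavic focus
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

batch = tokenizer(["To jest przykładowe zdanie."], return_tensors="pt")
outputs = model.generate(
    **batch,
    num_beams=5,              # explore 5 hypotheses in parallel
    num_return_sequences=3,   # return the 3 best finished translations
    max_new_tokens=40,
)
for i, seq in enumerate(outputs):
    print(i, tokenizer.decode(seq, skip_special_tokens=True))
```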
“…13 CBOW results are shown in Figure 9 in the Appendix; there are also many identifiable clusters like body parts and clothing, but many others are less clear than clusters from the LSTM. 14 This approach is similar to the category distinction test for masked language models in Kim and Smolensky (2021). 15 The Captioning LSTM always needs an image input, so we used the mean image frame of the training set in this evaluation.…”
Citation type: mentioning
Confidence: 99%