Proceedings of the Third Conference on Machine Translation: Research Papers 2018
DOI: 10.18653/v1/w18-6326
Input Combination Strategies for Multi-Source Transformer Decoder

Abstract: In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. …
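To make the four strategies concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of a multi-source encoder-decoder attention block built from stock PyTorch modules; the class name MultiSourceAttention, its parameters, and the residual placement are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch, not the authors' code: the four input combination strategies
# for encoder-decoder attention, built from stock PyTorch attention modules.
# Shapes: decoder queries (B, T, D), each encoded source (B, S_i, D).
import torch
import torch.nn as nn


class MultiSourceAttention(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_sources=2, strategy="serial"):
        super().__init__()
        self.strategy = strategy
        self.attns = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(n_sources)]
        )
        # Hierarchical only: a second attention that weighs the per-source contexts.
        self.hier_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, queries, sources):
        if self.strategy == "serial":
            # Attend to each source in turn; each context feeds the next attention.
            x = queries
            for attn, src in zip(self.attns, sources):
                ctx, _ = attn(x, src, src)
                x = x + ctx                      # residual after every sub-layer
            return x
        if self.strategy == "parallel":
            # Attend to all sources with the same query and sum the contexts.
            ctxs = [attn(queries, src, src)[0]
                    for attn, src in zip(self.attns, sources)]
            return queries + torch.stack(ctxs).sum(dim=0)
        if self.strategy == "flat":
            # Concatenate all encoder states and attend over them jointly.
            flat = torch.cat(sources, dim=1)
            ctx, _ = self.attns[0](queries, flat, flat)
            return queries + ctx
        if self.strategy == "hierarchical":
            # Per-source contexts first, then a second attention over those contexts.
            ctxs = [attn(queries, src, src)[0]
                    for attn, src in zip(self.attns, sources)]
            stacked = torch.stack(ctxs, dim=2)   # (B, T, n_sources, D)
            B, T, n_src, D = stacked.shape
            ctx, _ = self.hier_attn(queries.reshape(B * T, 1, D),
                                    stacked.reshape(B * T, n_src, D),
                                    stacked.reshape(B * T, n_src, D))
            return queries + ctx.reshape(B, T, D)
        raise ValueError(f"unknown strategy: {self.strategy}")
```

The flat variant needs only a single attention over the concatenated encoder states; serial and parallel keep one attention per source; hierarchical spends an extra attention layer to learn, per target position, how much each source should contribute.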

Cited by 53 publications (48 citation statements); references 12 publications.
“…Follow-up studies extend the decoder-based visual attention approach in different ways: Calixto et al (2017) reimplement the gating mechanism (Xu et al 2015) to rescale the magnitude of the visual information before the fusion, while Libovický and Helcl (2017) introduce the hierarchical attention which replaces the concatenative fusion with a new attention layer that dynamically weighs the modality-specific context vectors. Finally, Arslan et al (2018) and Libovický et al (2018) introduce the same idea into the Transformer-based (Vaswani et al 2017) architectures. Besides revisiting the hierarchical attention, Libovický et al (2018) also introduce parallel and serial variants.…”
Section: Visual Attention (mentioning)
confidence: 99%
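The quoted passage contrasts two fusion styles: a learned gate that rescales the visual context before fusion (Xu et al. 2015; Calixto et al. 2017) and a hierarchical attention over modality-specific context vectors (the hierarchical branch of the sketch above). Below is a minimal, hypothetical sketch of the gating idea, assuming the visual context has already been projected to the model dimension; VisualGate and its residual fusion are illustrative assumptions, not the cited papers' code.

```python
import torch
import torch.nn as nn


class VisualGate(nn.Module):
    """A scalar gate computed from the decoder state rescales the visual context."""

    def __init__(self, d_model=512):
        super().__init__()
        self.gate = nn.Linear(d_model, 1)

    def forward(self, decoder_states, visual_context):
        # beta in (0, 1), one value per target position, shrinks or passes the modality.
        beta = torch.sigmoid(self.gate(decoder_states))   # (B, T, 1)
        return decoder_states + beta * visual_context     # fusion with rescaled visual part
```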
“…We also experimented with a hierarchical attention mechanism along the lines of Libovický and Helcl (2017) and Libovický et al (2018), but as this did not outperform the simpler combination mechanism in (5) in internal testing, our submitted systems utilized the latter.…”
Section: Multi-encoder Transformer (mentioning)
confidence: 99%
“…3. Training a multi-encoder (Libovický and Helcl, 2017; Libovický et al., 2018) Transformer system (Vaswani et al., 2017) from…”
Section: Introduction (mentioning)
confidence: 99%
“…To prevent errors in the Apertium translation from being propagated to the output, the decoder should focus mostly on the SL input. However, according to the analysis of attention carried out by Libovický et al. (2018), in the serial multi-source architecture of Marian the output seems to be built with information from all inputs. We plan to explore more multi-source architectures in the future.…”
Section: Hybridization With Rule-based Machine Translation (mentioning)
confidence: 99%
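For concreteness, here is a hypothetical continuation of the MultiSourceAttention sketch above for the hybrid setup quoted here, with the encoded source-language sentence and the encoded Apertium translation as the two inputs to a serial combination; all tensors are random placeholders, and whether the decoder then relies on one source or both is exactly what the quoted attention analysis examines.

```python
# Continues the MultiSourceAttention sketch above (hypothetical names and shapes).
import torch

sl_states = torch.randn(1, 20, 512)   # encoded source-language sentence
mt_states = torch.randn(1, 18, 512)   # encoded Apertium translation
dec_states = torch.randn(1, 9, 512)   # decoder states entering cross-attention

serial = MultiSourceAttention(d_model=512, n_heads=8, n_sources=2, strategy="serial")
fused = serial(dec_states, [sl_states, mt_states])   # (1, 9, 512)
```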