Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2022
DOI: 10.18653/v1/2022.acl-short.53

As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning

Abstract: Omission and addition of content is a typical issue in neural machine translation. We propose a method for detecting such phenomena with off-the-shelf translation models. Using contrastive conditioning, we compare the likelihood of a full sequence under a translation model to the likelihood of its parts, given the corresponding source or target sequence.
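The comparison described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The function names, the single-token deletion granularity, and the toy overlap-based scorer are all assumptions made for illustration. In practice the `score` argument would be replaced by log-likelihoods from an off-the-shelf translation model, and the toy example uses a single language so that it runs without one.

```python
from typing import Callable, List

# Hypothetical scorer signature: (conditioning_text, scored_text) -> log-likelihood-like score.
# A real setup would obtain this from a translation model; here it is an assumption.
Scorer = Callable[[str, str], float]


def toy_score(conditioning_text: str, scored_text: str) -> float:
    """Toy stand-in for a model score: penalize words present on only one side."""
    cond = set(conditioning_text.split())
    scored = set(scored_text.split())
    return -float(len(cond - scored) + len(scored - cond))


def flag_unsupported_tokens(
    conditioning_tokens: List[str], scored_text: str, score: Scorer
) -> List[str]:
    """Flag tokens whose deletion from the conditioning sequence raises the score.

    If the scored sequence is more likely given a partial conditioning sequence
    than given the full one, the deleted token is probably not reflected in it.
    """
    full_score = score(" ".join(conditioning_tokens), scored_text)
    flagged = []
    for i, token in enumerate(conditioning_tokens):
        partial = conditioning_tokens[:i] + conditioning_tokens[i + 1:]
        if not partial:
            continue  # never condition on an empty sequence
        if score(" ".join(partial), scored_text) > full_score:
            flagged.append(token)
    return flagged


if __name__ == "__main__":
    source = "the cat sat on the mat today".split()
    translation = "the cat sat on the mat"
    # "today" has no counterpart in the translation, so it is flagged as omitted.
    print(flag_unsupported_tokens(source, translation, toy_score))
```

Under the same assumptions, swapping the roles of the two sequences (deleting tokens from the translation and scoring the source with a model for the reverse direction) would flag candidate additions instead of omissions.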

Cited by 5 publications (2 citation statements)
References 83 publications

“…These, however, were shown not to be indicative of hallucinations (Guerreiro et al., 2023), which highlights the importance of our human-annotated data. For omissions, previous work mostly focused on empty translations (Stahlberg and Byrne, 2019; Vijayakumar et al., 2016) with some work using artificially created undertranslations (Vamvas and Sennrich, 2022). As we saw in Section 6, the latter is unlikely to be helpful when evaluating detection methods.…”
Section: Additional Related Work (mentioning)
confidence: 99%
“…In contrast to sentence-level detection, detecting pathologies at the word level received much less attention. In terms of both available data and detectors, previous attempts were rather limited (Zhou et al., 2021; Vamvas and Sennrich, 2022). Here, we want to facilitate future research in this direction.…”
Section: Word-level Detection (mentioning)
confidence: 99%