We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.
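The coverage-maximising match combination described in the abstract can be pictured as a greedy selection over candidate translation-memory matches. The sketch below is an illustrative assumption, not the authors' implementation: each candidate is assumed to carry the set of source-token positions it covers (e.g. derived from a token-level fuzzy-match alignment).

```python
# Greedy sketch of combining fuzzy matches to maximise source-word coverage.
# Assumed input: (match_id, covered_positions) pairs, where covered_positions
# is the set of source-token indices the match accounts for.

def combine_matches(candidates, max_matches=3):
    """Repeatedly pick the match that adds the most not-yet-covered tokens."""
    covered, selected = set(), []
    pool = list(candidates)
    for _ in range(max_matches):
        best = max(pool, key=lambda c: len(c[1] - covered), default=None)
        if best is None or not (best[1] - covered):
            break  # no remaining match adds new coverage
        selected.append(best[0])
        covered |= best[1]
        pool.remove(best)
    return selected, covered

# Example: "m1" and "m3" together cover all six source tokens, so "m2" is skipped.
sel, cov = combine_matches([("m1", {0, 1, 2}), ("m2", {2, 3}), ("m3", {3, 4, 5})])
```

In the NFR setup, the selected matches would then be concatenated with the source sentence as additional NMT input.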
We propose three linguistically motivated metrics to quantify syntactic equivalence between a source sentence and its translation. First, Syntactically Aware Cross (SACr) measures the degree of word group reordering by aligning syntactically motivated groups of words. Secondly, a more intuitive approach compares the linguistic labels of word-aligned source and target tokens. Finally, on a deeper linguistic level, Aligned Syntactic Tree Edit Distance (ASTrED) compares the dependency structures of the two sentences. To make source and target dependency labels comparable, we rely on Universal Dependencies (UD). We analyse our metrics by comparing them with translation process data in mixed models. Although our examples and analysis focus on English as the source language and Dutch as the target language, the proposed metrics can be applied to any language for which UD models exist. An open-source implementation is made available.
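As a minimal illustration of the label-comparison metric (a sketch under assumed inputs, not the released implementation), one can compute the share of word-aligned token pairs whose UD dependency labels agree:

```python
# Share of aligned source-target token pairs with matching UD dependency labels.
# Inputs are assumed to come from a word aligner and UD parsers for both sides.

def label_agreement(alignments, src_labels, tgt_labels):
    """alignments: (src_index, tgt_index) pairs; labels: per-token UD relations."""
    if not alignments:
        return 0.0
    same = sum(src_labels[s] == tgt_labels[t] for s, t in alignments)
    return same / len(alignments)

# Two of the three aligned pairs keep their dependency label ("obj" -> "obl" differs).
score = label_agreement([(0, 1), (1, 0), (2, 2)],
                        ["nsubj", "root", "obj"],
                        ["root", "nsubj", "obl"])
```

Using the cross-linguistically consistent UD label set is what makes this direct source-target comparison meaningful.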
This article analyses the extent to which four well-known general cognitive constraints – syntactic priming, cognitive routinisation, markedness of coding and structural integration – impact the linguistic output of translation students and professional translators similarly. It takes subject placement variation in Dutch as a test case to gauge the effect of the four constraints and relies on a controlled corpus of student and professional French-to-Dutch L1 news translations, from which all declarative main clauses with either a preverbal or a postverbal subject were extracted. All corpus instances were annotated for four random variables, the fixed variable expertise and ten other fixed variables, which were considered good proxies for the cognitive constraints. A mixed-effects regression analysis reveals that by and large the cognitive constraints have an identical effect on student and professional translators’ output, with priming and structural integration having the strongest impact on subject placement. However, students diverge from professionals when translating French clauses with a left-dislocated adjunct into Dutch, which is interpreted as an indication of a difference in automatisation when dealing with specific French-Dutch cross-linguistic differences.
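The kind of mixed-effects analysis described above can be sketched with statsmodels. The data below are synthetic and purely illustrative (the study's corpus is not reproduced here), a linear mixed model with a random intercept stands in for the full specification, and all variable names are assumptions; the study's actual model includes four random variables and eleven fixed variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic illustrative data: binary subject placement per clause, with
# translator as a grouping factor and two illustrative fixed effects.
rng = np.random.default_rng(0)
n = 200
translator = rng.integers(0, 10, n)          # grouping factor (random intercept)
expertise = (translator < 5).astype(int)     # 1 = professional, 0 = student
primed = rng.integers(0, 2, n)               # e.g. postverbal subject in the prime
postverbal = (0.2 * expertise + 0.4 * primed
              + rng.normal(0, 0.3, n) > 0.4).astype(int)

data = pd.DataFrame({"postverbal": postverbal, "expertise": expertise,
                     "primed": primed, "translator": translator})

# Mixed-effects regression with a random intercept per translator.
model = smf.mixedlm("postverbal ~ expertise + primed", data,
                    groups=data["translator"])
result = model.fit()
print(result.fe_params)
```

The fitted fixed-effect coefficients then indicate how strongly each constraint proxy shifts the odds of a postverbal subject, while the per-translator random intercepts absorb individual baseline preferences.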