We describe an open-source workbench that offers advanced computer-aided translation (CAT) functionality: post-editing of machine translation (MT) output, interactive translation prediction (ITP), visualization of word alignments, extensive logging with replay mode, and integration with eye trackers and e-pens.
CASMACAT is a modular, web-based translation workbench that offers advanced functionality for computer-aided translation and the scientific study of human translation: automatic interaction with machine translation (MT) engines and translation memories (TM) to obtain raw translations or close TM matches for conventional post-editing; interactive translation prediction based on an MT engine's search graph; detailed recording and replay of edit actions and the translator's gaze (the latter via eye tracking); and support for an e-pen as an alternative input device. The system is open-source software and interfaces with multiple MT systems.
We conducted a field trial in computer-assisted professional translation to compare interactive translation prediction (ITP) against conventional post-editing (PE) of machine translation (MT) output. In contrast to the conventional PE set-up, where an MT system first produces a static translation hypothesis that is then edited by a professional translator (hence "post-editing"), ITP constantly updates the translation hypothesis in real time in response to user edits. Our study involved nine professional translators and four reviewers working with the web-based CasMaCat workbench. Various new interactive features aimed at assisting the post-editor were also tested in this trial. Our results show that even with little training, ITP can be as productive as conventional PE in terms of the total time required to produce the final translation. Moreover, in the ITP setting translators require fewer keystrokes to arrive at the final version of their translation.
Multilingual speakers are able to switch from one language to the other ("code-switch") between or within sentences. Because the underlying cognitive mechanisms are not well understood, in this study we use computational cognitive modeling to shed light on the process of code-switching. We employed the Bilingual Dual-path model, a Recurrent Neural Network of bilingual sentence production (Tsoukala et al., 2017), and simulated sentence production in simultaneous Spanish-English bilinguals. Our first goal was to investigate whether the model would code-switch without being exposed to code-switched training input. The model indeed produced code-switches even without any exposure to such input, and the patterns of code-switches are in line with earlier linguistic work (Poplack, 1980). The second goal of this study was to investigate an auxiliary phrase asymmetry that exists in Spanish-English code-switched production. Using this cognitive model, we examined a possible cause for this asymmetry. To our knowledge, this is the first computational cognitive model that aims to simulate code-switched sentence production.
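A toy sketch can illustrate one way a production model may code-switch without ever seeing code-switched training data: candidate words from both languages compete in a single shared output lexicon, so an other-language word is produced whenever its activation wins. The lexicon and activation values below are invented for illustration; the actual Dual-path model derives word activations from a recurrent network.

```python
# Illustrative sketch only: a shared bilingual output lexicon in which
# words from both languages compete. The words and activation values
# are assumptions made up for this example, not model outputs.

# Shared output lexicon: (word, language) -> activation
activations = {
    ("dog", "en"): 0.45,
    ("perro", "es"): 0.40,
    ("casa", "es"): 0.10,
    ("house", "en"): 0.05,
}

def produce_word(acts):
    """Select the most active word, regardless of its language."""
    return max(acts.items(), key=lambda kv: kv[1])[0]

word, lang = produce_word(activations)
# If the surrounding sentence context is Spanish, producing the English
# word "dog" here constitutes a code-switch.
```

Because selection ignores language membership, no code-switched input is needed for a switch to occur; a sufficiently active other-language word suffices.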
We propose a number of refinements to the canonical approach to interactive translation prediction. By using more permissive matching criteria, placing emphasis on matching the last word of the user prefix, and handling predictions for partially typed words, we observe gains in both word prediction accuracy (+5.4%) and letter prediction accuracy (+9.3%).
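The kind of permissive prefix matching described above can be sketched in a few lines. Everything here is an assumption for illustration: the flat hypothesis list stands in for search-graph paths, and the case-insensitive rule that lets the last typed token be a partial word is a simplified stand-in for the paper's matching criteria.

```python
# Sketch of permissive prefix matching for interactive translation
# prediction. Hypotheses, tokenisation, and matching rules are
# simplified assumptions, not the paper's actual implementation.

def predict_completion(hypotheses, user_prefix):
    """Return the continuation of the best-matching hypothesis.

    Matching is permissive: the final prefix token may be a partially
    typed word, in which case it is completed rather than required to
    match an entire token exactly.
    """
    prefix_tokens = user_prefix.split()
    for hyp in hypotheses:
        hyp_tokens = hyp.split()
        if len(hyp_tokens) < len(prefix_tokens):
            continue
        head, last = prefix_tokens[:-1], prefix_tokens[-1]
        # All but the last prefix token must match (case-insensitive).
        if [t.lower() for t in hyp_tokens[:len(head)]] != [t.lower() for t in head]:
            continue
        cand = hyp_tokens[len(head)]
        # Permissive: the last typed token may be a partial word.
        if cand.lower().startswith(last.lower()):
            remainder = cand[len(last):]
            continuation = hyp_tokens[len(prefix_tokens):]
            return (remainder + " " + " ".join(continuation)).strip()
    return None

hyps = ["the quick brown fox jumps", "the quiet town sleeps"]
print(predict_completion(hyps, "the qui"))  # -> "ck brown fox jumps"
```

Completing the partially typed word ("qui" → "quick") rather than waiting for a full-token match is what lets such a system score on letter prediction accuracy, not just word prediction accuracy.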
Code-switching is the alternation from one language to the other during bilingual speech. We present a novel method of researching this phenomenon using computational cognitive modeling. We trained a neural network of bilingual sentence production to simulate early balanced Spanish-English bilinguals, late speakers of English who have Spanish as a dominant native language, and late speakers of Spanish who have English as a dominant native language. The model produced code-switches even though it was not exposed to code-switched input. The simulations predicted how code-switching patterns differ between early balanced and late non-balanced bilinguals; the balanced bilingual simulation code-switches considerably more frequently, which is in line with what has been observed in human speech production. Additionally, we compared the patterns produced by the simulations with two corpora of spontaneous bilingual speech and identified noticeable commonalities and differences. To our knowledge, this is the first computational cognitive model simulating the code-switched production of non-balanced bilinguals and comparing the simulated production of balanced and non-balanced bilinguals with that of human bilinguals.
To test whether error-driven implicit learning can explain cross-language structural priming, we implemented three different models of bilingual sentence production: Spanish-English, verb-final Dutch-English, and verb-medial Dutch-English. With these models, we conducted simulation experiments that all revealed clear and strong cross-language priming effects. One of these experiments included structures with different word order between the two languages. This enabled us to distinguish between the error-driven learning account of structural priming and an alternative hybrid account which predicts that identical word order is required for cross-language priming. Cross-language priming did occur in our model between structures with different word order. This is in line with results from behavioural experiments. The results of the three experiments reveal varying degrees of evidence for stronger within-language priming than cross-language priming. This is consistent with results from behavioural studies. Overall, our findings support the viability of error-driven implicit learning as an account of cross-language structural priming.
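The core mechanism of error-driven implicit learning as a priming account can be sketched with a deliberately minimal model: processing a "prime" sentence triggers a small weight update driven by prediction error, which raises the model's subsequent preference for that structure. The one-weight logistic "model" below is an invented simplification for illustration, not the Dual-path-style networks used in the experiments.

```python
# Toy illustration of error-driven implicit learning as a priming
# mechanism. A single weight w encodes the preference for one of two
# structures: P(active structure) = sigmoid(w). Processing a prime
# applies one delta-rule update on the prediction error, so the primed
# structure becomes more likely on the next production.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def process_prime(w, structure, lr=0.5):
    """One error-driven update: nudge w toward the observed structure."""
    target = 1.0 if structure == "active" else 0.0
    pred = sigmoid(w)
    return w + lr * (target - pred)  # delta rule on prediction error

w = 0.0                      # initially indifferent: P(active) = 0.5
before = sigmoid(w)
w = process_prime(w, "active")
after = sigmoid(w)
assert after > before        # priming: the primed structure is now more likely
```

In the full models, the same error-driven updates operate over shared syntactic representations, which is why a prime in one language can also shift structure choice in the other language, and why identical word order is not required.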