The review notes that demand for practical corpus knowledge and skills has grown recently, and that learning to use query tools requires greater awareness of how lexis and grammar are interconnected. It then evaluates the new textbook, workbook and glossary by James Thomas, which focus on language teaching and on working with Sketch Engine, a set of software tools offering a wide range of services for analysing concordances, frequency statistics, co-occurrence patterns, contrasts, etc. The reviewed books are aimed at teachers and students of English, linguists and translators. The review describes the books as a witty, highly readable narrative written by an enthusiast with vast experience in ELT, whose line of argumentation is compelling and engaging.
We use a range of morpho-syntactic features inspired by research in register studies (e.g. Biber, 1995; Neumann, 2013) and translation studies (e.g. Ilisei et al., 2010; Zanettin, 2013; Kunilovskaya and Kutuzov, 2018) to reveal the association between translationese and human translation quality. Translationese is understood as any statistical deviation of translations from non-translations (Baker, 1993) and is assumed to affect the fluency of translations, rendering them foreign-sounding and clumsy in wording and structure. This connection is often posited or implied in studies of translationese or translational varieties (De Sutter et al., 2017), but is rarely tested directly. Our 45 features include frequencies of selected morphological forms and categories, some types of syntactic structures and relations, as well as several overall text measures extracted from Universal Dependencies annotation. The research corpora include English-to-Russian professional and student translations of informational or argumentative newspaper texts and a comparable corpus of non-translated Russian. Our results indicate a lack of direct association between translationese and quality in our data: while our features distinguish translations from non-translations with near-perfect accuracy, the performance of the same algorithm on the quality classes barely exceeds chance level.
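The comparison of classifier accuracy with chance level described in this abstract can be illustrated with a minimal sketch. It is not the authors' setup: the abstract does not name the algorithm, so a nearest-centroid classifier with leave-one-out evaluation is swapped in here, and the two-dimensional feature vectors and the labels "ref" and "tr" are toy values invented purely for illustration.

```python
def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier
    (a stand-in for whatever algorithm a real study would use)."""
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every text except the held-out one.
        centroids = {}
        for label in set(y):
            rows = [x for j, (x, l) in enumerate(zip(X, y)) if j != i and l == label]
            if rows:
                centroids[label] = centroid(rows)
        predicted = min(centroids, key=lambda l: sq_dist(X[i], centroids[l]))
        correct += predicted == y[i]
    return correct / len(X)


def majority_baseline(y):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(y.count(label) for label in set(y)) / len(y)


# Toy feature vectors (hypothetical values, not data from the study).
X = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
     [0.90, 0.80], [0.80, 0.90], [0.85, 0.85]]
y = ["ref", "ref", "ref", "tr", "tr", "tr"]

print("accuracy:", loo_accuracy(X, y))
print("chance level:", majority_baseline(y))
```

An accuracy far above the majority baseline means the features separate the classes; an accuracy close to the baseline, as reported here for the quality classes, means they do not.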
In this paper, we present a distributional word embedding model trained on one of the largest available Russian corpora: Araneum Russicum Maximum (over 10 billion words crawled from the web). We compare this model to a model trained on the Russian National Corpus (RNC). The two corpora differ considerably in size and compilation procedures. We test these differences by evaluating the trained models against the Russian part of the Multilingual SimLex999 semantic similarity dataset. We detect and describe numerous issues in this dataset and publish a new corrected version. Aside from the already known fact that the RNC is generally a better training corpus than web corpora, we enumerate and explain fine differences in how the models handle the semantic similarity task, which parts of the evaluation set are difficult for particular models, and why. Additionally, the learning curves for both models are described, showing that the RNC is generally more robust as training material for this task.
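Evaluating an embedding model against a similarity dataset such as SimLex999 is usually done by correlating the model's cosine similarities with the human scores. The following is a minimal sketch under stated assumptions: Spearman rank correlation is assumed as the metric (the abstract does not name one), out-of-vocabulary pairs are simply skipped, and the two-dimensional vectors, word pairs and scores are toy values invented for illustration, not entries from the dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(embeddings, gold_pairs):
    """Return (Spearman rho, number of pairs covered by the model)."""
    model_scores, human_scores = [], []
    for w1, w2, score in gold_pairs:
        if w1 in embeddings and w2 in embeddings:  # skip out-of-vocabulary pairs
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(score)
    return spearman(model_scores, human_scores), len(model_scores)


# Toy embeddings and gold scores (hypothetical, for illustration only).
embeddings = {
    "кот": [1.0, 0.1], "кошка": [0.9, 0.2],
    "машина": [0.1, 1.0], "автомобиль": [0.3, 0.9],
}
gold_pairs = [
    ("кот", "кошка", 9.0),
    ("машина", "автомобиль", 9.5),
    ("кот", "машина", 1.0),
    ("кошка", "поезд", 2.0),  # "поезд" has no vector, so this pair is skipped
]

rho, covered = evaluate(embeddings, gold_pairs)
print(f"Spearman rho = {rho:.3f} on {covered} of {len(gold_pairs)} pairs")
```

Reporting the number of covered pairs alongside the correlation matters when comparing models trained on corpora of very different sizes, since a smaller corpus may lack vectors for some test words.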
One of the relatively recent trends in learner corpus research is building and exploiting learner translator corpora. Within corpus-based translation studies (CTS), translations are approached as a special variety of the target language. They are usually represented by texts produced by professional translators and are studied as manifestations of the current translational norm. Learner translations can be seen as a more specific variant of this variety, one that is likely to deviate from the accepted translational norm. As of now, the typical linguistic features of learner translations, as opposed to professional ones, have been described only tentatively. We hypothesize that these texts should demonstrate heavier translationese features due to the lack of professional translational skills, comparatively poor source language processing competence and target language production skills. The aim of this research is to compare learner and professional Russian translations of English mass-media texts with a reference Russian corpus of non-translations to reveal lexical differences between the three. We found that learner translations consistently showed more distance from non-translations than their professional counterparts, while both learner and professional translations had discursive features which made them linguistically different from naturally occurring language. These findings might help define (non)professionalism in translation and shed light on the correlation between the linguistic features of a given text and translation quality, as well as contribute to pedagogical approaches to translator education.