Trilingual aligned corpus – current state and new applications
This article describes the current state of a trilingual parallel corpus consisting of texts in two Slavic languages (Bulgarian and Polish) and one Baltic language (Lithuanian). The corpus contains original literary texts (fiction, novels, and short stories) in one of the three languages with translations into the other two, as well as texts in other languages translated into Bulgarian, Polish, and Lithuanian. Some of the texts are aligned at the sentence level. The authors propose a semantic annotation of the verbs appearing in these aligned texts that will facilitate contrastive studies of natural languages. The theoretical background for the proposed semantic annotation is also briefly discussed.
In view of the ambiguity of the term "semantics", the author shows the differences between traditional lexical semantics and contemporary semantics in the light of various semantic schools. She examines semantics differently in connection with contrastive studies, where the description must necessarily proceed from the meaning towards the linguistic form, whereas in traditional contrastive studies the description proceeded from the form towards the meaning. This requirement of theoretical contrastive studies necessitates the construction of a semantic interlanguage, rather than only singling out universal semantic categories expressed by various linguistic means. Such studies can be strongly supported by parallel corpora. However, to make them useful for linguists in manual and machine translation, as well as in the development of dictionaries, including online ones, we need not only formal, often automatic, annotation of texts, but also semantic annotation, which is unfortunately manual. In the article we focus on semantic annotation concerning time, aspect and quantification of names and predicates in the whole semantic structure of the sentence, using the "Polish-Bulgarian-Russian parallel corpus" as an example.
Presentation of the verbs in Bulgarian-Polish electronic dictionary
This paper briefly discusses the presentation of verbs in the first electronic Bulgarian-Polish dictionary, which is currently being developed in a bilateral collaboration between IMI-BAS and ISS-PAS. Special attention is given to the digital entry classifiers that describe Bulgarian and Polish verbs. Problems related to the correspondence between natural language phenomena and their presentation are discussed. Some examples illustrate the different types of dictionary entries for verbs.