2015
DOI: 10.11649/cs.2015.022

Extraction and Presentation of Bilingual Correspondences from Slovak-Bulgarian Parallel Corpus

Abstract: This paper describes the results of the automatic extraction and presentation of bilingual correspondences from a Slovak-Bulgarian parallel corpus. The equivalent phrases are extracted from a corpus automatically aligned at the sentence and word level, then filtered, indexed and presented in a dictionary-like interface. The bilingual dictionary database contains 80 thousand phrase pairs consisting of approximately 350 thousand…
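The abstract names four steps: extraction from a word-aligned corpus, filtering, indexing, and presentation. The sketch below illustrates the first three on toy data. It is not the authors' implementation: it assumes the standard alignment-consistency criterion from phrase-based machine translation and a simple frequency filter, and the names `extract_phrase_pairs`, `build_index`, `max_len` and `min_count` are hypothetical.

```python
from collections import Counter, defaultdict


def extract_phrase_pairs(src, tgt, alignment, max_len=3):
    """Return phrase pairs consistent with a word alignment.

    src, tgt: lists of tokens; alignment: set of (src_index, tgt_index) links.
    A pair is kept only if no word inside the target span is linked to a
    source word outside the source span (assumed consistency criterion).
    Only the minimal target span is produced; unaligned boundary words
    are not added.
    """
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 >= max_len:
                continue
            # Reject spans whose target words link outside the source span.
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in alignment):
                continue
            pairs.append((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


def build_index(corpus, min_count=2, max_len=3):
    """Count phrase pairs over the corpus, filter rare ones, index by source phrase."""
    counts = Counter()
    for src, tgt, alignment in corpus:
        counts.update(extract_phrase_pairs(src, tgt, alignment, max_len))
    index = defaultdict(list)
    for (s, t), n in counts.items():
        if n >= min_count:  # assumed frequency filter
            index[s].append((t, n))
    for s in index:
        index[s].sort(key=lambda item: -item[1])  # most frequent equivalent first
    return dict(index)


# Toy corpus: the same aligned sentence pair seen twice.
toy_corpus = [
    (["dobrý", "deň"], ["добър", "ден"], {(0, 0), (1, 1)}),
    (["dobrý", "deň"], ["добър", "ден"], {(0, 0), (1, 1)}),
]
index = build_index(toy_corpus)
print(index["dobrý deň"])
```

Looking up a source phrase in `index` returns its candidate equivalents ordered by frequency, which is the kind of lookup a dictionary-like interface would need.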

Cited by 2 publications (3 citation statements)
References: 8 publications
“…In several cases, the bilingual lists of MWUs were compiled to improve statistical machine translation of an existing machine translation system (Arcan et al., 2017; Bouamor et al., 2012; Irvine and Callison-Burch, 2016; Naguib, 2016; Oliver, 2017; Semmar, 2018; Tsvetkov and Wintner, 2010), for the development of an existing language resource in a target language on the basis of a corresponding resource in a source language – examples include the development of the Slovenian WordNet (Vintar and Fišer, 2008) based on the English WordNet and the development of the bilingual terminology based on the aligned corpora for the library and information science domain (Krstev et al., 2018) – or for the presentation of bilingual correspondences between two languages – for example, correspondences between Slovak-Bulgarian parallel corpus (Garabík and Dimitrova, 2015).…”
Section: Related Work (mentioning)
confidence: 99%
“…Some of these approaches rely on the existence of a seed lexicon (Semmar, 2018; Tsvetkov and Wintner, 2010; Xu et al., 2015) or existing translation memories and phrase tables (Oliver, 2017), while in some cases the existence of additional resources, in addition to the input corpus, is not required (Arcan et al., 2017; Bouamor et al., 2012; Garabík and Dimitrova, 2015; Naguib, 2016). Some approaches require parallel sentence-aligned data (Arcan et al., 2017; Bouamor et al., 2012; Garabík and Dimitrova, 2015; Semmar, 2018; Zhang and Wu, 2012), while others perform the extraction on comparable corpora (Hazem and Morin, 2016; Pinnis et al., 2012; Xu et al., 2015). The technique employed in Naguib (2016), used groups of aligned sentences (verses).…”
Section: Related Work (mentioning)
confidence: 99%