2020
DOI: 10.7220/20.500.12259/240310

English-French-Lithuanian parallel corpus of EU financial documents

Cited by 1 publication
(2 citation statements)
References 0 publications
“…Lee (2011) used an interesting approach, for Korean and English, to improve financial phrase translation, but the corpora are comparable without being really parallel. There are some parallel corpora about finance, with a limited size, such as Smirnova and Rackevičienė (2020), who made a dataset of European documents in English translated to French and Lithuanian related to finance, but the size is relatively small, consisting of 154 documents from 2010 to 2014. Bick and Barreiro (2015) made a Portuguese-English parallel dataset of about 40,000 sentences in the Legal-Financial domain, coming from a company translation memory.…”
Section: Finance (mentioning)
confidence: 99%
“…Concerning Chinese-English, Chang (2004) from Peking University made one of the first large scale Chinese-English parallel corpora from HTML files with alignments at the paragraph and sentence levels, leading to a size of 10 million Chinese characters about different genres (news, technical articles, subtitles). Concerning the domain of finance, there are some small corpora for different pairs of languages, but not Chinese-English (Arcan, Thomas, de Brandt, & Buitelaar, 2013; Bick & Barreiro, 2015; Smirnova & Rackevičienė, 2020; Tiedemann, 2012; Volk, Amrhein, Aepli, Müller, & Ströbel, 2016). The largest one is the SEDAR dataset, containing 8.6 million French-English sentence pairs in the finance domain from PDF files of the regulations of the province of Quebec (Ghaddar & Langlais, 2020).…”
Section: Introduction (mentioning)
confidence: 99%