2015
DOI: 10.1007/s10590-015-9176-1

Survey of data-selection methods in statistical machine translation

Cited by 43 publications
(15 citation statements)
References 30 publications
“…However, since the size of the data sets that participants must produce in this task is smaller than the number of parallel sentences that are mutual translations, this task is also related to the data selection: selection of a subset of data that maximizes translation quality, avoiding redundancy and matching a given domain (Eetemadi et al, 2015). Instead of the widespread language-model based data selection methods (Axelrod et al, 2011), we replaced words with placeholders in order to not take into account the domain of the text.…”
Section: Related Work (mentioning)
confidence: 99%
“…Among different data selection techniques (Eetemadi et al, 2015), in this work, we focus on three particular methods: Cross Entropy Difference (Section 2.1), TF-IDF Data Selection (Section 2.2), and Feature Decay Algorithms (Section 2.3).…”
Section: Data Selection Methods (mentioning)
confidence: 99%
“…Eetemadi et al [25] offered a complex survey of data selection methods in machine translation. They also describe works focusing on cross-entropy which has become the most commonly used approach in data selection.…”
Section: Related Work (mentioning)
confidence: 99%
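
Several of the statements above name cross-entropy difference as the most common data-selection criterion without showing what it computes. The following is a minimal sketch of the idea, not the method of any cited paper: each candidate sentence is scored by its per-token cross-entropy under an in-domain language model minus its cross-entropy under a general-domain model, and lower scores are treated as more in-domain. The add-one-smoothed unigram models, whitespace tokenization, and the function names (`train_unigram`, `cross_entropy`, `rank_by_ced`) are all assumptions made for illustration; published systems typically use higher-order n-gram or neural language models.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count whitespace-separated tokens; returns (counts, total_tokens)."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return counts, sum(counts.values())


def cross_entropy(sentence, model, vocab_size):
    """Per-token cross-entropy in bits under an add-one-smoothed unigram model."""
    counts, total = model
    toks = sentence.split()
    if not toks:
        return 0.0
    log_prob = sum(
        math.log2((counts[t] + 1) / (total + vocab_size)) for t in toks
    )
    return -log_prob / len(toks)


def rank_by_ced(candidates, in_domain, general):
    """Sort candidates by H_in(s) - H_gen(s); lowest (most in-domain) first."""
    vocab = {t for s in in_domain + general + candidates for t in s.split()}
    vocab_size = len(vocab) + 1  # one extra slot reserved for unseen tokens
    m_in = train_unigram(in_domain)
    m_gen = train_unigram(general)
    scored = [
        (cross_entropy(s, m_in, vocab_size) - cross_entropy(s, m_gen, vocab_size), s)
        for s in candidates
    ]
    return sorted(scored)


if __name__ == "__main__":
    # Toy corpora, invented purely for illustration.
    in_domain = ["the patient received the drug", "the drug reduced fever"]
    general = ["the market fell today", "stocks fell on the news"]
    candidates = ["the market rose today", "the drug helped the patient"]
    for score, sent in rank_by_ced(candidates, in_domain, general):
        print(f"{score:+.3f}  {sent}")
```

In practice a selection step would keep only the top-ranked fraction of the candidate pool, with the cut-off tuned on held-out translation quality.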