2016
DOI: 10.1007/s10579-016-9363-6

Crawl and crowd to bring machine translation to under-resourced languages


Citations: cited by 7 publications (2 citation statements)
References: 22 publications (21 reference statements)
“…In common with other applications of machine learning, human-created data is required for training in vast quantities. Webcrawling for parallel text is a cost-effective option for datagathering and may be useful when high-quality data is scarce (Toral et al 2017). However, neural networks ideally require large amounts of high-quality training data and this is particularly true of NMT since the popularization of the system architecture known as the transformer model, which has exhibited efficiency gains (Vaswani et al 2017).…”
Section: Copyright and Machine Learning
Citation type: mentioning
confidence: 99%
“…i This is, of course, untrue. Notwithstanding the possibility of webcrawling for parallel texts as a method of gathering MT training data (Toral et al 2017), the most valuable and highest-quality data for training MT systems, as with other applications of machine learning based on human data, is aligned human translations, in which source and target texts are computationally linked, usually at the sentence level.…”
Section: Introduction
Citation type: mentioning
confidence: 99%