2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010
DOI: 10.1109/icassp.2010.5494998

Spoken language translation from parallel speech audio: Simultaneous interpretation as SLT training data

Abstract: In recent work, we proposed an alternative to parallel text as translation model (TM) training data: audio recordings of parallel speech (pSp), as it occurs in any communication scenario where interpreters are involved. Although interpretation compares poorly to translation, we reported surprisingly strong translation results for systems based on pSp trained TMs. This work extends the use of pSp as a data source for unsupervised training of all major models involved in statistical spoken language translation. …

Cited by 8 publications (2 citation statements)
References 10 publications
“…For instance, in Table 1 the interpretation sentence drops "at this point" and condenses "seriousness of this line of argument" to "agreement"; it delivers the source message as reliably as the offline translation. Prior work attempted to build interpretation corpora in a small scale (Tohyama and Inagaki, 2004; Shimizu et al, 2014; Bernardini et al, 2016), or constructed speech interpretation training corpora for MT tasks (Paulik and Waibel, 2010). But, very little attempt has been made on empirically quantifying the evaluation gap.…”
Section: Introduction (mentioning)
confidence: 99%
“…Shimizu et al (2013) shows that this approach improves the speed-accuracy tradeoff. However, existing parallel simultaneous interpretation corpora (Shimizu et al, 2014; Matsubara et al, 2002; Bendazzoli and Sandrelli, 2005) are often small, and collecting new data is expensive due to the inherent costs of recording and transcribing speeches (Paulik and Waibel, 2010). In addition, due to the intense time pressure during interpretation, human interpretation has the disadvantage of simpler, less precise diction (Camayd-Freixas, 2011; Al-Khanji et al, 2000) compared to human translations done at the translator's leisure, allowing for more introspection and precise word choice.…”
Section: Introduction (mentioning)
confidence: 99%