Interspeech 2021
DOI: 10.21437/interspeech.2021-2232

Lost in Interpreting: Speech Translation from Source or Interpreter?

Abstract: Interpreters facilitate multi-lingual meetings but the affordable set of languages is often smaller than what is needed. Automatic simultaneous speech translation can extend the set of provided languages. We investigate if such an automatic system should rather follow the original speaker, or an interpreter to achieve better translation quality at the cost of increased delay. To answer the question, we release Europarl Simultaneous Interpreting Corpus (ESIC), 10 hours of recordings and transcripts of European P…

Cited by 4 publications (3 citation statements)
References 18 publications (1 reference statement)
“…We test our systems both on IWSLT test data (derived from TED talks) and on the ESIC test set 4 (Macháček et al, 2021). From IWSLT, we use tst2018 for De↔En, and tst2015/tst2016 combined for Cs↔En.…”
Section: Data
mentioning
confidence: 99%
“…Evaluation Data For latency and quality analysis, we utilize the dev set of the manually transcribed ESIC corpus (Macháček et al, 2021) for English, German, and Czech ASR containing 179 documents. This corpus contains 5 hours of original English speeches from the European Parliament, including simultaneous interpreting into German and Czech.…”
Section: Benchmarking Settings
mentioning
confidence: 99%
“…We call our implementation Whisper-Streaming, although it is applicable to any model with API similar to Whisper. According to our evaluation, it achieves 3.3 seconds latency on average for English ASR on the European Parliament speech test set ESIC (Macháček et al, 2021), when running on NVIDIA A40 GPU, a fast hardware processing unit. We test it also on German and Czech ASR and present the results and suggestions for the optimal parameters.…”
Section: Introduction
mentioning
confidence: 99%