2019
DOI: 10.1007/978-3-030-31372-2_13

A Speech Test Set of Practice Business Presentations with Additional Relevant Texts

Abstract: We present a test corpus of audio recordings and transcriptions of presentations of students' enterprises together with their slides and web-pages. The corpus is intended for evaluation of automatic speech recognition (ASR) systems, especially in conditions where the prior availability of in-domain vocabulary and named entities is beneficial. The corpus consists of 39 presentations in English, each up to 90 seconds long. The speakers are high school students from European countries with English as their secon…

Cited by 3 publications (3 citation statements)
References 13 publications
“…Antrecorp 33 (Macháček et al, 2019), a test set of up to 90-second mock business presentations given by high school students in very noisy conditions. None of the speakers is a native speaker of English (see the paper for the composition of nationalities) and their English contains many lexical, grammatical and pronunciation errors as well as disfluencies due to the spontaneous nature of the speech.…”
Section: Acknowledgements (mentioning)
confidence: 99%
“…The non-native test set was already used in the IWSLT Non-Native Translation Task in 2020 and it is described in Ansari et al (2020), Appendix A.6. Specifically, we used the Antrecorp (Macháček et al, 2019; mock business presentations by high-school students) and the auditing presentations (SAO) parts.…”
Section: Time Shift For Better Simultaneity (mentioning)
confidence: 99%
“…The speakers in TED talks are native. "Non-Native" subset consists of mock business presentations of European high school students (Macháček et al, 2019), and of presentations by representatives of European supreme audit institutions. This subset is described in Findings on page 39 (numbered page 136).…”
Section: Evaluation Data (mentioning)
confidence: 99%