2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2015
DOI: 10.1109/asru.2015.7404805

The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments

Abstract: This paper introduces the contents and the possible usage of the DIRHA-ENGLISH multi-microphone corpus, recently realized under the EC DIRHA project. The reference scenario is a domestic environment equipped with a large number of microphones and microphone arrays distributed in space. The corpus is composed of both real and simulated material, and it includes 12 US and 12 UK English native speakers. Each speaker uttered different sets of phonetically-rich sentences, newspaper articles, conversational speech, k…

Cited by 47 publications (46 citation statements)
References 27 publications

“…Research in this field has made great progress thanks to real speech corpora collected for various application scenarios such as voice command for cars (Hansen et al, 2001), smart homes (Ravanelli et al, 2015), or tablets (Barker et al, 2015), and automatic transcription of lectures (Lamel et al, 1994), meetings (Renals et al, 2008), conversations (Harper, 2015), dialogues (Stupakov et al, 2011), game sessions (Fox et al, 2013), or broadcast media (Bell et al, 2015). In most corpora, the training speakers differ from the test speakers.…”
Section: Introduction (mentioning)
confidence: 99%
“…For the phrich dataset, in order to better focus on the behaviour of the proposed features in encoding acoustic information, we adopt a pure phone-loop as in [32]. Although this decision yields a loss in overall recognition performance, we avoid certain non-linear behaviours due to the language modelling.…”
Section: A Setup and Datasets (mentioning)
confidence: 99%
“…These utterances were reverberated with IRs measured in the living-room environment. As test material we used two different sets, extracted from the DIRHA-English corpus [32], [33]. The first test set corresponds to the WSJ0-5k sub-set, and each of its two subsets i.e., simulated and real, is composed of 409 sentences, uttered by 6 speakers.…”
Section: A Setup and Datasets (mentioning)
confidence: 99%
“…The employed DSR corpus [18] includes a large set of one-minute sequences simulating real-life scenarios of speech-based domestic control. The sequences were generated by mixing real and simulated far-field speech with typical domestic background noise.…”
Section: A DIRHA-English Corpus (mentioning)
confidence: 99%
“…Several scientific projects [18], [7] and challenges [8], [10] have been launched during the last decade targeting intelligent interfaces for indoors smart environments. Distant speech recognition (DSR) via distributed microphones is examined in most of them.…”
Section: Introduction (mentioning)
confidence: 99%