2018
DOI: 10.48550/arxiv.1804.05053
Preprint

Voices Obscured in Complex Environmental Settings (VOICES) corpus

Cited by 26 publications (24 citation statements)
References 3 publications
“…Thus we experimented with VOiCES challenge dataset. VOiCES corpus [15] is released as part of "The voices from a distance challenge 2019" [16] of Interspeech 2019. For the ASR fixed conditions track, the training set consists of 80-hours subset of LibriSpeech corpus…”
Section: Voices Corpus ASR (mentioning)
confidence: 99%
“…In these experiments, we show that the proposed approach improves over the state-of-the-art ASR systems based on log-mel features as well as other past approaches proposed for speech dereverberation and denoising based on deep learning. In addition, we also extend the approach to large vocabulary speech recognition on VOiCES dataset [15,16].…”
Section: Introduction (mentioning)
confidence: 99%
“…The training set of the VOiCES corpus [23,40] consists of 80-hour subset of the clean LibriSpeech corpus [44]. The training set has close talking microphone recordings from 427 speakers recorded in clean environments.…”
Section: Data (mentioning)
confidence: 99%
“…We also explore regularization of the model based on boundary equilibrium generative adversarial networks (BEGAN) [21]. In various E2E ASR experiments performed on the REVERB challenge dataset [22] as well as the VOiCES dataset [23], we show that the proposed approach improves over the state-of-art E2E ASR systems based on log-mel features with generalized (GEV) beamforming and weighted prediction error (WPE) based enhancement.…”
Section: Introduction (mentioning)
confidence: 95%
“…Recently, far-field speaker recognition attracts more and more attention from the research community. The Voices Obscured in Complex Environmental Settings (VOiCES) Challenge launched in 2019 aims to benchmark state-of-the-art speech processing methods in far-field and noisy conditions [24]. The wake-up word dataset Hi Mia has also been released to facilitate the studies in far-field speaker recognition [25].…”
Section: Introduction (mentioning)
confidence: 99%