2013 IEEE Workshop on Automatic Speech Recognition and Understanding
DOI: 10.1109/asru.2013.6707748

Elastic spectral distortion for low resource speech recognition with deep neural networks

Cited by 96 publications
(58 citation statements)
References 11 publications
“…Suspicion about simulated data is common in the speech processing community, due for instance to the misleadingly high performance of direction-of-arrival based adaptive beamformers on simulated data compared to real data (Kumatani et al, 2012). Fortunately, this case against simulation does not arise for all techniques: most modern enhancement and ASR techniques can benefit from data augmentation and simulation (Kanda et al, 2013; Brutti and Matassoni, 2016). Few existing datasets involve both real and simulated data.…”
Section: Introduction (mentioning)
confidence: 99%
“…We conjecture that the spectrogram of a noise segment may be a better domain to apply perturbation. A recent study has found that three perturbations on speech samples in the spectrogram domain improve ASR performance (Kanda et al, 2013). These perturbations were used to expand the speech samples so that more speech patterns are observed by a classifier.…”
Section: Noise Perturbation (mentioning)
confidence: 99%
“…We use the method described in (Kanda et al, 2013) to randomly perturb noise samples. Frequency perturbation takes three steps.…”
Section: Noise Perturbation (mentioning)
confidence: 99%
“…The work was later followed up by [19]-[21] on large vocabulary continuous speech recognition (LVCSR). Similarly, elastic spectral distortion was investigated in [22] where sparse data was augmented by vocal tract length (VTL) distortion, speech rate distortion and frequency-axis random distortion for DNN-HMM training.…”
Section: Introduction (mentioning)
confidence: 99%
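
The three distortions named in the last citation statement (VTL distortion, speech rate distortion, and frequency-axis random distortion) can be illustrated roughly as operations on a spectrogram. The following is a minimal sketch, not the procedure from the cited paper: it assumes a spectrogram stored as a NumPy array of shape (frames, bins), a purely linear VTL warp, and linear interpolation throughout. The function names, the `alpha`, `rate`, and `max_shift` parameters, and the choice of five control points are all hypothetical.

```python
import numpy as np

def warp_frequency(spec, alpha):
    """VTL-style distortion (assumed linear warp): rescale the frequency axis by alpha."""
    n_bins = spec.shape[1]
    bins = np.arange(n_bins)
    # Output bin k reads the input at position k / alpha, clipped to the valid range.
    src = np.clip(bins / alpha, 0, n_bins - 1)
    return np.stack([np.interp(src, bins, frame) for frame in spec])

def change_rate(spec, rate):
    """Speech rate distortion: resample the time axis so the result has about 1/rate as many frames."""
    n_frames, n_bins = spec.shape
    n_out = max(1, int(round(n_frames / rate)))
    frames = np.arange(n_frames)
    src = np.linspace(0, n_frames - 1, n_out)
    # Interpolate each frequency bin independently along time.
    return np.stack([np.interp(src, frames, spec[:, k]) for k in range(n_bins)], axis=1)

def random_frequency_distortion(spec, max_shift, rng):
    """Frequency-axis random distortion: a smooth random displacement of the bins."""
    n_bins = spec.shape[1]
    bins = np.arange(n_bins)
    # Draw random shifts at a few control points (an assumption) and pin both ends to zero.
    ctrl = np.linspace(0, n_bins - 1, 5)
    shifts = rng.uniform(-max_shift, max_shift, size=ctrl.size)
    shifts[0] = shifts[-1] = 0.0
    src = np.clip(bins + np.interp(bins, ctrl, shifts), 0, n_bins - 1)
    return np.stack([np.interp(src, bins, frame) for frame in spec])

# Toy input: 4 frames, 5 frequency bins.
spec = np.arange(20, dtype=float).reshape(4, 5)
rng = np.random.default_rng(0)
augmented = random_frequency_distortion(change_rate(warp_frequency(spec, 1.1), 0.9), 1.0, rng)
```

Each function returns a new array and leaves the input untouched, so one utterance can be distorted several times with different parameters to produce multiple training copies.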