2015
DOI: 10.1109/taslp.2015.2430815

Use of Micro-Modulation Features in Large Vocabulary Continuous Speech Recognition Tasks

Abstract: Most of the state-of-the-art ASR systems take as input a single type of acoustic features, dominated by the traditional feature schemes, i.e., MFCCs or PLPs. However, these features cannot model rapid, intra-frame phenomena present in the actual speech signals. On the other hand, micro-modulation components, inspired by the AM-FM speech model, can capture these important characteristics of spoken speech, resulting in significant performance improvements, as previously shown in small-vocabulary ASR tasks. Yet, …

Cited by 10 publications (13 citation statements)
References 43 publications
“…It was demonstrated that the joined framework, regardless of the low precision of the various levelled TDNN, accomplishes a WRR reduction of 15% according to cutting-edge HMM framework. The author in [7] investigated the exhibition of SR system with the customary Cepstral features when utilizing the linear feature transforms. This combination of features is used to model the DNN-HMM system.…”
Section: Literature Survey (mentioning)
confidence: 99%
“…Here, we have used a linearly-scaled Gabor filterbank to obtain the subband filtered signals. The AM-FM modulation features corresponding to the i-th subband filtered signal are extracted from the instantaneous frequency f_i(t) and amplitude envelope a_i(t), where i = 1, 2, ..., L, and L is the number of subband filtered signals [31], i.e.,…”
Section: Proposed Feature Extraction (mentioning)
confidence: 99%
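
The statement above names two per-subband quantities, the amplitude envelope a_i(t) and the instantaneous frequency f_i(t). The following is a minimal sketch of how such quantities could be computed, not the cited authors' method: it assumes a Gaussian (Gabor-type) frequency response applied in the FFT domain and an analytic-signal demodulation, whereas the quoted excerpt does not say which demodulation algorithm was used. The function names `gabor_analytic_subbands` and `am_fm_features`, the default bandwidth, and the test tone are all hypothetical choices made for illustration.

```python
import numpy as np

def gabor_analytic_subbands(x, fs, n_bands=6, bandwidth_hz=None):
    """Split x into analytic subband signals with linearly spaced Gaussian filters.

    Assumption: the Gabor filter is applied as a Gaussian frequency response.
    Keeping only positive frequencies (with gain 2) yields the analytic signal.
    """
    n = len(x)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    spectrum = np.fft.fft(x)
    # Linearly spaced centre frequencies strictly inside (0, fs/2).
    centres = np.linspace(0.0, fs / 2.0, n_bands + 2)[1:-1]
    if bandwidth_hz is None:
        # Hypothetical default: half the centre spacing (requires n_bands >= 2).
        bandwidth_hz = (centres[1] - centres[0]) / 2.0
    subbands = []
    for fc in centres:
        gauss = np.exp(-0.5 * ((freqs - fc) / bandwidth_hz) ** 2)
        response = np.where(freqs > 0, 2.0 * gauss, 0.0)
        subbands.append(np.fft.ifft(spectrum * response))
    return centres, np.array(subbands)

def am_fm_features(z, fs):
    """Return amplitude envelopes a_i(t) and instantaneous frequencies f_i(t) in Hz."""
    a = np.abs(z)
    phase = np.unwrap(np.angle(z), axis=-1)
    # Phase derivative by first difference, so f has one sample fewer than a.
    f = np.diff(phase, axis=-1) * fs / (2.0 * np.pi)
    return a, f

# Usage on a synthetic tone (assumed test signal, not data from the paper).
fs = 8000
t = np.arange(fs) / fs
x = 0.5 * np.cos(2 * np.pi * 1000 * t)
centres, subbands = gabor_analytic_subbands(x, fs, n_bands=3)
a, f = am_fm_features(subbands, fs)
```

In a feature pipeline these per-sample tracks would typically be summarized per frame (for example by short-time averaging) before being passed to a recognizer; that step is left out here because the excerpt is truncated before it defines the features.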
“…More recently, some of these approaches were revised in the framework of Deep Neural Networks (DNNs) where non-linear modeling is feasible. Networks are trained to extract bottleneck features [5], and combine channels [12], achieving similar or better results compared to beamforming. However, training DNNs on multi-style and multi-channel data [20] is the … [footnote in the citing paper: This research work was supported by the EU under the project I-SUPPORT with grant H2020-643666.]…”
Section: Introduction (mentioning)
confidence: 99%
“…Their fusion exhibits robustness in noise and mismatch training/testing conditions (e.g., in Aurora-4 task), as indicated by the single-channel ASR results in recent works [5], [16]. However, only a few works [19], [15] examine their performance in reverberant environments.…”
Section: Introduction (mentioning)
confidence: 99%