2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2016.7472808
Filterbank learning using Convolutional Restricted Boltzmann Machine for speech recognition

Cited by 27 publications (33 citation statements). References 19 publications.
“…Hence, the models learnt on NLSC might not be optimal for individual non-native speaker groups. However, the model trained on the native English speakers' database represents an optimal auditory code [17], [27] that captures the common traits of non-native speakers. From Table 1, we also observe that the accuracy of the handcrafted MFCC+SDC features is the highest, i.e., they perform better than our proposed data-driven features (WSJ and AURORA), specifically with SDC.…”
Section: Results on the Development Set
confidence: 99%
“…A review of different methods for unsupervised filterbank learning is given in [22]. The ConvRBM filterbank was shown to perform better than MFCC and Mel filterbank features for the speech recognition task [22], [23].…”
Section: Introduction
confidence: 99%
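The passage above contrasts learned filterbank features with handcrafted MFCC/Mel filterbank features. A minimal sketch of how features might be extracted from speech with a bank of learned filters (e.g., from a ConvRBM) is below; the function name, frame sizes, and the rectify-pool-log pipeline are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def learned_filterbank_features(signal, filters, frame_len=400, hop=160):
    """Hypothetical sketch: extract frame-level features by convolving a
    speech signal with learned subband filters, half-wave rectifying,
    average-pooling per frame, and log-compressing. Frame length and hop
    (400/160 samples ~ 25 ms/10 ms at 16 kHz) are illustrative assumptions.
    """
    feats = []
    for filt in filters:                        # one learned subband filter each
        y = np.convolve(signal, filt, mode="same")
        y = np.maximum(y, 0.0)                  # half-wave rectification
        frames = [y[s:s + frame_len].mean()     # average pooling per frame
                  for s in range(0, len(y) - frame_len + 1, hop)]
        feats.append(np.log(np.asarray(frames) + 1e-8))  # log compression
    return np.stack(feats)                      # shape: (n_filters, n_frames)
```

In this sketch, the learned filters play the role that the fixed triangular Mel filters play in MFCC extraction.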
“…The filterbank learned using ConvRBM was used to extract features from genuine and spoofed speech signals. Compared to our earlier works [22, 23], here we use Adam optimization [24] in ConvRBM training. The experiments on the ASV 2015 database show that ConvRBM-based features perform better than MFCC features.…”
Section: Introduction
confidence: 99%
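The quoted passage notes the switch to Adam optimization for ConvRBM training. A minimal sketch of one standard Adam parameter update (Kingma & Ba) is below; the hyperparameter defaults are the commonly cited ones, and how the gradient is obtained from the ConvRBM is outside this sketch:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m) and
    squared gradient (v), bias-corrected by the step count t (t >= 1)."""
    m = b1 * m + (1 - b1) * grad          # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

The per-parameter step normalization is what typically makes Adam less sensitive to learning-rate tuning than plain stochastic gradient descent.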
“…In equation (13), K ∈ ℝ^(h×w×c×c) represents the convolution kernel, b_k is the bias added after the convolution operation, and f(·) denotes the nonlinear activation function, the rectified linear unit (ReLU) [11], which is given as follows…”
Section: Convolutional Neural Network
confidence: 99%
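The quoted passage describes a convolutional layer of the form f(conv(x, K) + b) with a ReLU nonlinearity. A minimal numpy sketch of that computation is below; the function names, the valid-convolution choice, and the channels-last layout are illustrative assumptions:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: f(x) = max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def conv_layer(x, K, b):
    """Minimal valid 2-D convolution layer: y = f(conv(x, K) + b).

    x : (H, W, C)         input feature map (channels-last, an assumption)
    K : (h, w, C, C_out)  convolution kernels
    b : (C_out,)          per-output-channel bias
    """
    h, w, c_in, c_out = K.shape
    H, W, _ = x.shape
    out = np.zeros((H - h + 1, W - w + 1, c_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + h, j:j + w, :]  # local receptive field
            # sum over the patch's height, width, and input channels
            out[i, j, :] = np.tensordot(patch, K, axes=([0, 1, 2], [0, 1, 2]))
    return relu(out + b)
```

Deep-learning frameworks implement the same operation with optimized kernels; the explicit loops here only make the receptive-field arithmetic visible.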