2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2016
DOI: 10.1109/iscslp.2016.7918400

Deep neural network for robust speech recognition with auxiliary features from laser-Doppler vibrometer sensor

Abstract: Recently, the signal captured from a laser Doppler vibrometer (LDV) sensor has been used to improve the noise robustness of automatic speech recognition (ASR) systems by enhancing the acoustic signal prior to feature extraction. This study proposes another approach in which auxiliary features extracted from the LDV signal are used alongside conventional acoustic features to further improve ASR performance, using a deep neural network (DNN) as the acoustic model. While this approach is promising, the best…
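As a rough illustration of what using auxiliary features alongside conventional acoustic features can look like at the input of a DNN acoustic model, the sketch below concatenates two frame-aligned feature matrices and then stacks neighboring frames. Everything specific here is an assumption and is not taken from the paper: the feature dimensions (40 and 13), the context width of 5 frames, the random placeholder data, and the helper names `splice_frames` and `build_dnn_input` are hypothetical. Context splicing is a common practice for DNN acoustic models, but the abstract does not say whether the authors use it.

```python
import numpy as np


def splice_frames(feats: np.ndarray, context: int) -> np.ndarray:
    """Stack each frame with `context` neighbors on each side.

    Edge frames are repeated so the output keeps the same number of rows.
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
    # Window i holds the frames shifted by (i - context) relative to the center.
    windows = [padded[i:i + num_frames] for i in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


def build_dnn_input(acoustic: np.ndarray, ldv_aux: np.ndarray,
                    context: int = 5) -> np.ndarray:
    """Concatenate frame-aligned acoustic and LDV auxiliary features.

    Hypothetical helper: assumes both streams share the same frame rate
    and have already been aligned frame by frame.
    """
    if acoustic.shape[0] != ldv_aux.shape[0]:
        raise ValueError("feature streams must have the same number of frames")
    combined = np.concatenate([acoustic, ldv_aux], axis=1)
    return splice_frames(combined, context)


# Placeholder data standing in for real features (assumed dimensions).
rng = np.random.default_rng(0)
acoustic = rng.standard_normal((100, 40))  # e.g. 40-dim acoustic features
ldv_aux = rng.standard_normal((100, 13))   # e.g. 13-dim LDV-derived features

dnn_input = build_dnn_input(acoustic, ldv_aux, context=5)
print(dnn_input.shape)  # (100, 583): (40 + 13) dims x 11 stacked frames
```

Concatenating before splicing keeps the two streams interleaved per frame, so the network sees both the acoustic and the LDV-derived view of every frame in the context window.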

Cited by 3 publications (1 citation statement)
References 13 publications

“…In addition, each object has its own unique frequency response, so when the irradiated object is unknown, it is impossible to determine the best parameters for the filter. Xie et al proposed a method that encodes the mapping between LDV-observed speech and clean speech as an auxiliary feature to improve the speech recognition accuracy in a noisy environment, but this method still does not solve the problem of degradation caused by the vibration characteristics of the irradiated object [15].…”
Section: Introduction (mentioning)
confidence: 99%