2016 IEEE Spoken Language Technology Workshop (SLT)
DOI: 10.1109/slt.2016.7846244
Performance monitoring for automatic speech recognition in noisy multi-channel environments

Cited by 11 publications (9 citation statements)
References 15 publications
“…It appears that methods borrowed from ASR become more and more useful in HSR research now that the overall performance gap between humans and machines gets smaller (or vanishes for single, well-studied databases [9]). In our experiments, the best correlations are obtained with the M-Measure, which was shown earlier to be clearly related to parameters that influence speech intelligibility in hearing aids, e.g., the optimal direction of a beamformer when spatial filtering is performed in multi-channel hearing aids [19]. This is one example of a strategy developed for ASR (specifically, stream weighting in multi-stream ASR) that has a meaningful application in human speech perception (specifically, hearing research), as advertised in [20].…”
Section: Discussion (supporting)
confidence: 60%
“…This would require running a DNN classifier on hearing aid hardware in real time. As estimated in [19], a forward run of a standard DNN as used in our experiments is not possible on current hearing aid hardware due to limitations in power consumption. However, when the model complexity is reduced by a factor of 10, such real-time processing becomes feasible.…”
Section: Discussion (mentioning)
confidence: 99%
“…Before calculating the MTD, the context-dependent triphones from the DNN are grouped into approximately 40 monophones. This allows the output to be visualized (Figure 1), is computationally cheaper, and produces results similar to using the triphone activations directly [15]. Note that a forward run of the model does not require a decoding step with the HMM or a word transcript, since it relies on the DNN output alone.…”
Section: Speech Quality Prediction System (mentioning)
confidence: 99%
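The two steps quoted above — collapsing triphone posteriors onto their center monophones, then computing a mean temporal distance (MTD/M-Measure) over the posteriorgram — can be sketched in a few lines. This is a minimal illustration, not the cited implementation: the function names, the symmetric-KL distance, and the lag range 1–10 frames are assumptions made here; the original M-Measure work averages divergences over longer, specific time spans.

```python
import numpy as np

def group_to_monophones(triphone_post, center_phone):
    """Sum triphone posteriors that share the same center monophone.

    triphone_post: (frames, n_triphones) array of DNN posteriors.
    center_phone:  sequence mapping each triphone index to a monophone index.
    """
    n_mono = max(center_phone) + 1
    mono = np.zeros((triphone_post.shape[0], n_mono))
    for tri, mono_idx in enumerate(center_phone):
        mono[:, mono_idx] += triphone_post[:, tri]
    return mono

def sym_kl(p, q, eps=1e-10):
    """Symmetric KL divergence between two posterior vectors."""
    p, q = p + eps, q + eps
    # sum((p - q) * log(p / q)) equals KL(p||q) + KL(q||p)
    return float(np.sum((p - q) * np.log(p / q)))

def m_measure(posteriors, lags=range(1, 11)):
    """Mean temporal distance: average divergence between posterior
    vectors that are dt frames apart, averaged over a set of lags."""
    per_lag = []
    for dt in lags:
        dists = [sym_kl(posteriors[t], posteriors[t + dt])
                 for t in range(len(posteriors) - dt)]
        per_lag.append(np.mean(dists))
    return float(np.mean(per_lag))
```

Intuitively, a flat or slowly drifting posteriorgram (the classifier is unsure, as in heavy noise) yields a low M value, while confident, rapidly switching phone posteriors on clean speech yield a high one — which is what makes the measure usable for performance monitoring without a transcript.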
“…This process of enhancement is of great importance for many applications, such as mobile phone communications, VoIP, teleconferencing systems, hearing aids, and automatic speech recognition (ASR) systems. For example, several authors have recently reported a decrease in ASR performance in the presence of noise [2][3][4], and there is concern about the performance of hearing aid devices as well [5,6].…”
Section: Introduction (mentioning)
confidence: 99%