2021
DOI: 10.1186/s13636-021-00198-4

A CNN-based approach to identification of degradations in speech signals

Abstract: The presence of degradations in speech signals, which causes acoustic mismatch between training and operating conditions, deteriorates the performance of many speech-based systems. A variety of enhancement techniques have been developed to compensate the acoustic mismatch in speech-based applications. To apply these signal enhancement techniques, however, it is necessary to know prior information about the presence and the type of degradations in speech signals. In this paper, we propose a new convolutional ne…

Citations: cited by 6 publications (4 citation statements)
References: 26 publications
“…Gabor transform or spectrogram is a vital tool in signal analysis [35]- [37]. It excels at identifying signal components, making it valuable for sound differentiation, like in Shazam's music classification [38].…”
Section: B Related Work In Frequency Domain Analysis (mentioning)
confidence: 99%
“…Other NR tools produce estimates of objective values including FR speech quality values [23], [30], [32], [38], [44], [51], [54], [56], [57], FR speech intelligibility values [30], [32], [38], [44], [52], [54], [56], [57], speech transmission index [22], codec bit-rate [46], and detection of specific impairments, artifacts, or noise types [34], [39], [41], [52]. Some of these tools perform a single task and others perform multiple tasks.…”
Section: A Existing Machine Learning Approaches (mentioning)
confidence: 99%
“…According to experimental results employing two different speech kinds, namely diseased voice and normal running speech. By highlighting the areas of the log-mel spectrogram that have a greater impact on the target degradation, they can visually see how the network decides to distinguish between different forms of deterioration in speech signals using the score weighted class activation mapping [27].…”
Section: Previous Studies (mentioning)
confidence: 99%