Interspeech 2020
DOI: 10.21437/interspeech.2020-1016

iMetricGAN: Intelligibility Enhancement for Speech-in-Noise Using Generative Adversarial Network-Based Metric Learning

Abstract: The intelligibility of natural speech is seriously degraded when exposed to adverse noisy environments. In this work, we propose a deep learning-based speech modification method to compensate for the intelligibility loss, with the constraint that the root mean square (RMS) level and duration of the speech signal are maintained before and after modifications. Specifically, we utilize an iMetricGAN approach to optimize the speech intelligibility metrics with generative adversarial networks (GANs). Experimental r…
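
The equal-RMS, equal-duration constraint mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `rms` and `match_rms` and the sample values are hypothetical, and the sketch only shows how a modified signal might be rescaled so that its RMS level matches the unmodified speech while the sample count stays unchanged.

```python
import math


def rms(x):
    """Root mean square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in x) / len(x))


def match_rms(modified, reference):
    """Rescale `modified` so that its RMS equals that of `reference`.

    Hypothetical helper: both signals must have the same number of
    samples, which stands in for the equal-duration constraint.
    """
    if len(modified) != len(reference):
        raise ValueError("duration changed: sample counts differ")
    level = rms(modified)
    if level == 0.0:
        return list(modified)  # silent signal: nothing to rescale
    gain = rms(reference) / level
    return [gain * s for s in modified]


# Illustrative values only, not data from the paper.
original = [0.1, -0.2, 0.3, -0.1]
boosted = [0.4, -0.1, 0.9, -0.6]  # assumed output of some speech modification
constrained = match_rms(boosted, original)
```

Under this constraint, any intelligibility gain has to come from redistributing energy across time and frequency rather than from simply raising the overall level.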

Cited by 15 publications (13 citation statements)
References 17 publications

“…Moreover, the signal example ŷ, which is pre-enhanced using other reference algorithms (e.g., SSDRC [7] and OptSII [5]), is also fed into D_int in the training. As demonstrated in our earlier study [28], learning such additional examples can stabilize the training process and improve performance. Given all the above notations, the loss function of D_int is represented as follows:…”
Section: B System Overview; citation type: mentioning
confidence: 66%
“…We comprehensively evaluate the system's performance under different conditions with unseen noises and reverberations. Our experiments show that the improved system significantly increases the intelligibility and quality of speech over our original system [28] with far less parameters. Moreover, it also outperforms the state-of-the-art SSDRC baseline in both objective and subjective evaluations.…”
Section: Introduction; citation type: mentioning
confidence: 80%