2022
DOI: 10.36227/techrxiv.21187846
Preprint

CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement

Abstract: Convolution-augmented transformers (Conformers) are recently proposed in various speech-domain applications, such as automatic speech recognition (ASR) and speech separation, as they can capture both local and global dependencies. In this paper, we propose a conformer-based metric generative adversarial network (CMGAN) for speech enhancement (SE) in the time-frequency (TF) domain. The generator encodes the magnitude and complex spectrogram information using two-stage conformer blocks to model both tim…

Citations: cited by 6 publications (2 citation statements)
References: 99 publications
“…In addition, they retain complex spectral structures in the final speech. Feedforward DNNs (FDNNs) [5]-[9], [13], [16], [17], CNNs [18], [19], RNNs [20], [21], Generative Adversarial Network (GANs) [22], [23], and Transformers [24]-[26] are successful DNN approaches for speech enhancement.…”
Section: Ref# | Citation type: mentioning
confidence: 99%
“…CMGAN: Conformer-Based Metric-GAN [23] is a framework that operates on both magnitude spectrogram and complex spectrogram components. In addition to a metric discriminator that corrects metric mismatch, it employs conformers that can capture local features along with long-term relationships in both the time and frequency dimensions.…”
Section: Review Of Performance | Citation type: mentioning
confidence: 99%
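
The last statement above describes CMGAN as operating on both the magnitude spectrogram and the complex spectrogram components. As a rough illustration of what those inputs look like, the sketch below computes a naive short-time Fourier transform and splits it into magnitude, real, and imaginary parts. It is not the authors' pipeline: the function names `stft` and `split_components`, the Hann window, and the frame length and hop size are all assumptions chosen for readability.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Naive STFT (illustrative only): Hann-windowed frames, one-sided DFT."""
    # Periodic Hann window; the window choice is an assumption of this sketch.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Keep only the non-negative frequency bins (0 .. frame_len / 2).
        bins = [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(bins)
    return frames


def split_components(spec):
    """Split a complex spectrogram into magnitude, real, and imaginary parts."""
    magnitude = [[abs(c) for c in frame] for frame in spec]
    real = [[c.real for c in frame] for frame in spec]
    imag = [[c.imag for c in frame] for frame in spec]
    return magnitude, real, imag


# Toy input: a sine completing two cycles per 8-sample frame.
signal = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = stft(signal)
magnitude, real, imag = split_components(spec)
```

Each of the three outputs is a time-by-frequency grid of the same shape, so a model can consume the magnitude alongside the real and imaginary parts. A practical system would use a library STFT with settings taken from the paper rather than this loop-based version.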