2022
DOI: 10.36227/techrxiv.21187846.v1
Preprint

CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement

Abstract: Convolution-augmented transformers (Conformers) are recently proposed in various speech-domain applications, such as automatic speech recognition (ASR) and speech separation, as they can capture both local and global dependencies. In this paper, we propose a conformer-based metric generative adversarial network (CMGAN) for speech enhancement (SE) in the time-frequency (TF) domain. The generator encodes the magnitude and complex spectrogram information using two-stage conformer blocks to model both tim…

Cited by 11 publications
(1 citation statement)
References 5 publications
“…Most of the existing research work is based on a single channel network to establish the transformation between models, so its adaptability is not strong [13]. For the acquisition of multimodal information, CMGAN has been proposed by researchers [14]. In other words, an adversarial training network (CMGAN) is used to establish the sharing characteristics of multi-modal information, and then establish the correlation relationship between various types of heterogeneous information.…”
Section: Generate Adversarial Network Across Modes
Citation type: mentioning
confidence: 99%