2020
DOI: 10.1109/taslp.2020.2986896

Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement

Cited by 48 publications (30 citation statements)
References 46 publications
“…We find improvements with the both proposed model and loss function in terms of lower word error rate (WER) in the LibriCSS dataset [11]. Note that although our focus in this paper is single-channel separation, our approach can be easily extended to multi-channel processing using masking based beamforming [1,12].…”
Section: Introduction (mentioning)
confidence: 90%
“…Later performance investigation was done using delay and sum beam-forming to decrease the word error rate in identification of speech signal. Also auto regression-based gaussian distribution and Laplacian distribution is used for enhancing speech signal are described in [7][8][9][10][11][12]. The proposed Laplacian prior estimators minimize unnecessary noise signals in desired speech signals.…”
Section: Introduction (mentioning)
confidence: 99%
“…The speaker attention module extracts the target speaker's voice, that is further encoded by the speaker representation module into a discriminative speaker embedding for effective speaker verification. There have been studies on joint optimization between speech enhancement and speaker verification [32]- [34]. Along a similar line of thought, we propose to jointly optimize a speaker attention module and a speaker representation module by simultaneously minimizing a signal reconstruction loss and a speaker identity loss.…”
Section: Introduction (mentioning)
confidence: 99%