2020
DOI: 10.1109/taslp.2020.3004760
Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification

Abstract: Domain mismatch is a common problem in speaker verification (SV) and often causes performance degradation. For the system relying on the Gaussian PLDA backend to suppress the channel variability, the performance would be further limited if there is no Gaussianity constraint on the learned embeddings. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) that incorporates an InfoVAE into domain adversarial training (DAT) to reduce domain mismatch and simultaneous…
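The abstract refers to an InfoVAE component, whose characteristic ingredient is a maximum mean discrepancy (MMD) penalty that pulls the aggregated posterior of the latent embeddings toward the prior. As a rough illustration of that ingredient only (not the paper's full InfoVDANN objective), the following numpy sketch computes a biased RBF-kernel MMD estimate between two sample sets; all names and the bandwidth choice are illustrative:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel values between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd(x, y, sigma=1.0):
    # Biased estimate of squared MMD between the sample sets x and y.
    kxx = rbf_kernel(x, x, sigma).mean()
    kyy = rbf_kernel(y, y, sigma).mean()
    kxy = rbf_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 2))              # stand-in for posterior samples
prior = rng.normal(size=(256, 2))          # standard-normal prior samples
shifted = rng.normal(loc=3.0, size=(256, 2))  # a mismatched distribution

# Matched distributions yield a much smaller MMD than mismatched ones,
# which is what makes MMD usable as a Gaussianity-encouraging regularizer.
print(mmd(z, prior) < mmd(z, shifted))
```

In InfoVAE-style objectives this MMD term is minimized alongside the reconstruction loss, which is one way to impose the Gaussianity constraint on embeddings that the abstract motivates.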

Cited by 28 publications (11 citation statements)
References 30 publications
“…To fully characterize dual translation between two domains, only considering domain mapping is not enough. Representing rich information within individual domains and between two domains is equally important [24]. It is crucial to represent domain mapping as well as domain knowledge and also integrate them in a unified transformer for seq2seq learning.…”
Section: Domain Mapping, Knowledge and Dual Integration
confidence: 99%
“…The new learning machines, e.g. Bayesian recurrent network [10,19], VAE [27], neural variational learning [14,15,31], neural discrete representation [24,34], stochastic layered model [18,21,33], stochastic temporal network [1], temporal difference VAE [23], Markov recurrent network [28], temporal-difference network, and neural ordinary differential equation [3] are introduced in various deep models which pave an avenue to knowledge representation and information discovery. Variational inference methods based on normalizing flows [26,30] and variational mixture of posteriors [32] are addressed.…”
Section: Bayesian Information Processing
confidence: 99%
“…It has been shown that the x-vectors are more speaker discriminative than the i-vectors. They are also more robust to noise, reverberation, and domain mismatch [6][7][8][9]. There are also embedding systems that use ResNets [10] or DenseNets [11] for frame-level processing.…”
Section: Introduction
confidence: 99%