The Speaker and Language Recognition Workshop (Odyssey 2016) 2016
DOI: 10.21437/odyssey.2016-53

From Features to Speaker Vectors by means of Restricted Boltzmann Machine Adaptation

Abstract: Restricted Boltzmann Machines (RBMs) have shown success in different stages of speaker recognition systems. In this paper, we propose a novel framework to produce a vector-based representation for each speaker, which will be referred to as RBMvector. This new approach maps the speaker spectral features to a single fixed-dimensional vector carrying speaker-specific information. In this work, a global model, referred to as Universal RBM (URBM), is trained taking advantage of RBM unsupervised learning capabilitie…

Cited by 11 publications (28 citation statements)
References 19 publications
“…Speaker embedding is often referred to a single low dimensional vector representation of the speaker characteristics from a speech signal extracted using a Neural Network (NN) model. For text-independent Speaker Recognition (SR), which is the focus of this work, these models can be trained either in a supervised (e.g., [1,2]) or in an unsupervised (e.g., [3,4]) fashion. Supervised speaker embeddings are produced by training a deep architecture using speaker-labeled background data.…”
Section: Introduction (mentioning)
confidence: 99%
“…They also make use of DBNs at the backend in the i-vector framework for speaker verification [10]. As a continuation to these works, a successful attempt was made in our previous work to apply RBMs as a front-end for learning a fixed dimensional speaker representation [11]. This vector representation of speaker was referred to as RBM vector.…”
Section: Introduction (mentioning)
confidence: 99%
“…This vector representation of speaker was referred to as RBM vector. In [11] it has been shown that the RBM vector preserves speaker specific information and has shown competitive results as compared to the conventional i-vector based speaker verification systems. This has lead us to apply the RBM vector for learning speaker representation in the task of speaker clustering.…”
Section: Introduction (mentioning)
confidence: 99%