ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054569
Structural Sparsification for Far-Field Speaker Recognition with Intel® GNA

Abstract: Recently, deep neural networks (DNNs) have been widely used in speaker recognition. Achieving fast response times together with high accuracy drives hardware resource requirements up rapidly. However, since speaker recognition applications are often deployed on mobile devices, it is necessary to maintain a low computational cost while keeping high accuracy under far-field conditions. In this paper, we apply structural sparsification to time-delay neural networks (TDNNs) to remove redundant structures …
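The abstract only outlines the method. As an illustration (not the paper's actual algorithm, whose details are behind the truncation), structural sparsification of a TDNN layer can be sketched as magnitude-based removal of whole output channels, so the pruned layer stays dense and hardware-friendly; the function name, shapes, and keep ratio below are assumptions for the sketch:

```python
import numpy as np

def prune_tdnn_layer(weights, keep_ratio=0.5):
    """Structurally prune a TDNN (1-D convolution) layer by removing
    whole output channels with the smallest L2 norms.

    weights: array of shape (out_channels, in_channels, kernel_width)
    Returns the pruned weight tensor and the indices of kept channels.
    """
    # One norm per output channel: flatten each channel's filter bank.
    norms = np.linalg.norm(weights.reshape(weights.shape[0], -1), axis=1)
    n_keep = max(1, int(round(keep_ratio * weights.shape[0])))
    # Keep the largest-norm channels; sort indices to preserve order.
    keep = np.sort(np.argsort(norms)[-n_keep:])
    return weights[keep], keep

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4, 5))           # toy layer: 8 output channels
w_pruned, kept = prune_tdnn_layer(w, keep_ratio=0.5)
print(w_pruned.shape)                    # (4, 4, 5)
```

Because entire channels are removed rather than individual weights, the result is a smaller dense tensor, which is what makes structured (as opposed to unstructured) sparsity attractive for accelerators such as Intel GNA.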

Cited by 1 publication (1 citation statement).
References 24 publications (26 reference statements).
“…Sharan et al [21] explore random projection for low-rank tensor factorization and describe its use on gene expression and EEG time series data. Zhang et al [22] apply structural sparsification on Time-Delay Neural Networks (TDNN) to remove redundant structures. Alternative approaches are subject to our further research, e.g., binary neural networks as successfully applied to natural language understanding [23].…”
Section: Introduction
confidence: 99%