2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019)
DOI: 10.1109/icassp.2019.8683713

The PyTorch-Kaldi Speech Recognition Toolkit

Abstract: The availability of open-source software is playing a remarkable role in the popularization of speech recognition and deep learning. Kaldi, for instance, is nowadays an established framework used to develop state-of-the-art speech recognizers. PyTorch is used to build neural networks with the Python language and has recently spawned tremendous interest within the machine learning community thanks to its simplicity and flexibility. The PyTorch-Kaldi project aims to bridge the gap between these popular toolkits, tr…

Cited by 217 publications (158 citation statements)
References 22 publications
“…This work uses hybrid HMM-DNN speech recognizers. TIMIT and DIRHA experiments are performed with the PyTorch-Kaldi toolkit [35] using a six-layer multi-layer perceptron and a light GRU [36], respectively. The performance reported on TIMIT is the average of the phone error rates (PER%) obtained by running each experiment three times with different seeds.…”
Section: Corpora and ASR Setup
Citation type: mentioning
confidence: 99%
“…The active development of open-source software toolkits plays a significant role in the rapid progress of ASR research, instances include the Kaldi (Povey et al, 2011) and ESPnet (Watanabe et al, 2018). In this work, we demonstrate that state-of-the-art SNN acoustic models can be easily developed in PyTorch and integrated into the PyTorch-Kaldi Speech Recognition Toolkit (Ravanelli et al, 2019). This software toolkit integrates the efficiency of Kaldi and the flexibility of PyTorch, therefore, it can support the rapid development of SNN-based ASR systems.…”
Section: Development of SNN-based ASR Systems
Citation type: mentioning
confidence: 73%
“…All ASR experiments are performed using the PyTorch-Kaldi ASR toolkit (Ravanelli et al, 2019). This recently introduced toolkit inherits the flexibility of PyTorch toolkit (Paszke et al, 2017) for ANN-based acoustic model development and the efficiency of Kaldi ASR toolkit (Povey et al, 2011).…”
Section: Implementation Details
Citation type: mentioning
confidence: 99%
“…The learning rate is halved every-time the loss on the validation set is below a certain threshold fixed to 0.001 to avoid overfitting. Finally, models are implemented with the Pytorch-Kaldi toolkit [18]. While the effectiveness of QLSTM over LSTM has been demonstrated, an LSTM network trained in the same conditions and based on [5] is considered as a baseline.…”
Section: Model Architectures
Citation type: mentioning
confidence: 99%