Automatic Optimization of Speech Decoder Parameters

Hannani, Asmaa El; Hain, Thomas

doi:10.1109/lsp.2009.2033967

Cited by 13 publications

(14 citation statements)

References 14 publications

“…As grid search for such parameters is very costly we have made use of gradient based [32] to find the best possible configurations. As we found clear dependence of such parameters on the acoustic and language models used, optimal operating curves were generated for each acoustic model/language model combination.…”

Section: Decoding

mentioning

confidence: 99%

Transcribing Meetings With the AMIDA Systems

Hain

Burget

Dines

et al. 2012

IEEE Trans. Audio Speech Lang. Process.

112

“…The proposed method significantly reduces computational costs compared to [2], and the reduction is even greater compared to grid search. In contrast to [3] and [4], Simplified SPSA takes into account the real-time factor, which is of vital importance for the design of an ASR system.…”

Section: Simplified Simultaneous Perturbation Stochastic Approximation

mentioning

confidence: 99%

“…If causes a deterioration of the objective value, the optimal solution must stay at and at the next iteration obtain the estimation of the loss function with a new according to (2). Without an appropriate step size, the optimal solution will stay at forever, which significantly slows down the rate of convergence of the algorithm [8].…”

Section: Simplified SPSA

mentioning

confidence: 99%

“…In contrast to [3] and [4], Simplified SPSA takes into account the real-time factor, which is of vital importance for the design of an ASR system. The proposed method also requires lower computational costs than [1] and [2] for finding the optimal accuracy corresponding to a specific real-time factor. We introduce a penalty function, which is used to achieve a balance between recognition accuracy and decoding time.…”

mentioning

confidence: 99%

Simplified Simultaneous Perturbation Stochastic Approximation for the Optimization of Free Decoding Parameters

Romanenko

Zatvornitsky

Medennikov

2014

Speech and Computer

Abstract. This paper deals with automatic optimization of free decoding parameters. We propose using a Simplified Simultaneous Perturbation Stochastic Approximation algorithm to optimize these parameters. This method provides a significant reduction in computational and labor costs. We also demonstrate that the proposed method successfully copes with the optimization of parameters for a specific target real-time factor, for all the databases we tested.

Keywords: Simplified Simultaneous Perturbation Stochastic Approximation, SPSA, decoding parameter, real-time factor, RTF, speech recognition.

Introduction

The balance of accuracy and speed of automatic speech recognition depends on the solution of a number of related tasks, such as:

─ optimization of the acoustic model;
─ optimization of the language model;
─ optimization of a large set of free decoding parameters.

Optimization of both the acoustic model and the language model in automatic speech recognition for large vocabularies is a well-known task [1]. In contrast, the problem of optimizing free decoding parameters is still often solved manually or by using grid search (i.e. searching for values in a grid with a specified step). The task is complicated by the fact that each parameter can have a different impact on the accuracy of speech recognition and/or the expected decoding time. Moreover, each new domain requires searching for new optimal decoding parameters every time we change the training data. Lastly, changing hardware configuration also requires adjustment of optimal decoding parameters.

Typically, the search for optimal decoding parameters that satisfy the constraints of the real-time factor and at the same time provide high recognition accuracy is a very time-consuming task.

In this paper, we present a Simplified Simultaneous Perturbation Stochastic Approximation for optimizing free decoding parameters. The proposed method significantly reduces computational costs compared to [2], and the reduction is even greater compared to grid search. In contrast to [3] and [4], Simplified SPSA takes into account the real-time factor, which is of vital importance for the design of an ASR system. The proposed method also requires lower computational costs than [1] and [2] for finding the optimal accuracy corresponding to a specific real-time factor. We introduce a penalty function, which is used to achieve a balance between recognition accuracy and decoding time. Then we demonstrate that this method provides robust and fast results. We present results obtained on three speech databases comprising spontaneous and read speech.

Simultaneous Perturbation Stochastic Approximation (SPSA)

Let us start by describing the standard form of the SPSA algorithm [5]. We denote the vector of free decoding parameters as $\theta$. Let $\hat{\theta}_k$ denote the estimate for $\theta$ at the $k$th iteration. Then the algorithm has the standard form:

$$\hat{\theta}_{k+1} = \hat{\theta}_k - a_k \, \hat{g}_k(\hat{\theta}_k)$$

where $\hat{g}_k(\cdot)$ is an estimate for the gradient at the $k$th iteration. The gain sequence $a_k$ satisfies c...
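For readers unfamiliar with SPSA, the following is a minimal sketch of the standard (not the simplified) algorithm applied to a penalized objective of the kind the abstract describes. Everything named here is an assumption made for illustration: `toy_wer` and `toy_rtf` are hypothetical stand-ins for running a real decoder, the hinge-style penalty in `penalized_loss` is only one possible way to trade accuracy against decoding time (the paper's actual penalty function is not shown in this excerpt), and the gain constants are commonly used defaults rather than values from the paper.

```python
import random


def spsa_minimize(loss, theta0, iterations=500, a=0.2, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Standard SPSA: two loss evaluations per iteration, whatever the dimension."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step-size gain
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb all parameters at once with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Simultaneous-perturbation estimate of the gradient.
        g_hat = [diff / (2.0 * c_k * d) for d in delta]
        theta = [t - a_k * g for t, g in zip(theta, g_hat)]
    return theta


def toy_wer(theta):
    """Hypothetical word error rate (%) as a function of (beam, lm_scale)."""
    beam, lm_scale = theta
    return 20.0 + (beam - 3.0) ** 2 + (lm_scale - 1.5) ** 2


def toy_rtf(theta):
    """Hypothetical real-time factor: a wider beam decodes more slowly."""
    return 0.2 * theta[0]


def penalized_loss(theta, target_rtf=0.4, weight=25.0):
    """Assumed penalty: WER plus a cost for exceeding the target RTF."""
    return toy_wer(theta) + weight * max(0.0, toy_rtf(theta) - target_rtf)


if __name__ == "__main__":
    best = spsa_minimize(penalized_loss, [1.0, 1.0])
    print("parameters:", best)
    print("WER:", toy_wer(best), "RTF:", toy_rtf(best))
```

In this toy setting the unconstrained optimum would sit at a beam of 3.0, which exceeds the target RTF, so the penalty is what holds the search near the largest beam that still meets the time budget. With a real decoder each call to `loss` would be a full decoding run on a development set, which is why the two-evaluations-per-iteration property matters.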

“…For time-constrained optimization, we present our own loss function strategies that are real-time factor (RTF) aware and compare the results to the Gradient Descent [4] method.…”

Section: Introduction

mentioning

confidence: 99%

Gradient-free decoding parameter optimization on automatic speech recognition

Nguyen

Stein

Stadtschnitzer

2014

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Finding the optimal decoding parameters in speech recognition is often done manually in a rather tedious manner, although automatic gradient-free optimization techniques have been shown to perform quite well for this task. While there have been recent scientific contributions in this field, no thorough comparison of possible methods, in terms of convergence speed and performance, has been undertaken. In this paper, we conduct a series of experiments with three decoding paradigms and four different optimization techniques found in recent literature, both on unconstrained and time-constrained decoder optimization. We offer our findings on the German Difficult Speech Corpus and on the LinkedTV test sets.
