2015
DOI: 10.1186/s13634-015-0238-6
Speech recognition in reverberant and noisy environments employing multiple feature extractors and i-vector speaker adaptation

Abstract: The REVERB challenge provides a common framework for the evaluation of feature extraction techniques in the presence of both reverberation and additive background noise. State-of-the-art speech recognition systems perform well in controlled environments, but their performance degrades in realistic acoustical conditions, particularly in both real and simulated reverberant environments. In this contribution, we utilize multiple feature extractors including the conventional mel-filterbank, multi-taper spectrum es…

Cited by 11 publications (6 citation statements) | References 34 publications
“…Therefore, this study used LSTM instead of RNN to encode the English text. LSTM [13] is also a kind of recurrent neural network algorithm. Compared with the traditional RNN, LSTM introduces input, forget, and output gate units to simulate the deep-impression and forgetting phenomena of human memory, thereby suppressing unimportant parts of the English text, highlighting the key points, and reducing computation while improving accuracy.…”
Section: Improving Machine Translation With Long Short-Term Memory
confidence: 99%
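The gating mechanism described in this citation statement can be sketched as a single LSTM cell step in NumPy. The weight layout, variable names, and sizes below are illustrative assumptions, not taken from the cited work:

```python
import numpy as np

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM step with input, forget, and output gates.

    W: (4*H, D) input weights, U: (4*H, H) recurrent weights, b: (4*H,) bias,
    stacked in [input, forget, candidate, output] order (an assumed layout).
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = 1.0 / (1.0 + np.exp(-z[:H]))        # input gate: what to write
    f = 1.0 / (1.0 + np.exp(-z[H:2*H]))     # forget gate: what to keep
    g = np.tanh(z[2*H:3*H])                 # candidate cell content
    o = 1.0 / (1.0 + np.exp(-z[3*H:]))      # output gate: what to expose
    c = f * c_prev + i * g                  # memory: keep old, admit new
    h = o * np.tanh(c)                      # gated hidden output
    return h, c

# tiny demo: D=3 input features, H=2 hidden units
rng = np.random.default_rng(0)
D, H = 3, 2
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = lstm_cell(rng.normal(size=D), np.zeros(H), np.zeros(H), W, U, b)
```

The forget gate `f` is what lets the cell discard unimportant context, which is the property the citing authors appeal to when contrasting LSTM with a plain RNN.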
“…The combined, 60-dimensional features are referred to as LDA+STC [15] and used as an input to the i-vector extractor. The GMM-UBM using full-covariance GMMs with 512 components is trained using Baum-Welch statistics extraction [18]. All the parameters of the trained GMM-UBM are converted into a single supervector, and reduced to 100 dimensional i-vectors using the i-vector extractor (the total variability matrix T).…”
Section: Related Work
confidence: 99%
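The i-vector extraction step in this citation statement follows the standard total-variability model: given zeroth- and first-order Baum-Welch statistics, the i-vector is the posterior mean w = (I + Tᵀ Σ⁻¹ N T)⁻¹ Tᵀ Σ⁻¹ F. A toy NumPy sketch of that formula, with illustrative sizes (the cited system uses 512 full-covariance Gaussians and 100-dimensional i-vectors; a diagonal covariance is assumed here for simplicity):

```python
import numpy as np

def extract_ivector(N, F, T, Sigma_inv):
    """Posterior-mean i-vector from Baum-Welch statistics (toy version).

    N:         (C,) zeroth-order counts, one per UBM Gaussian
    F:         (C*D,) centered first-order statistics
    T:         (C*D, R) total variability matrix
    Sigma_inv: (C*D,) inverse diagonal covariances of the UBM
    Returns the R-dimensional i-vector. Shapes/names are assumptions.
    """
    C = N.shape[0]
    D = F.shape[0] // C
    R = T.shape[1]
    N_exp = np.repeat(N, D)                 # each Gaussian's rows share its count
    TtSiN = T.T * (Sigma_inv * N_exp)       # T' Sigma^-1 N
    L = np.eye(R) + TtSiN @ T               # posterior precision matrix
    return np.linalg.solve(L, T.T @ (Sigma_inv * F))

# toy sizes: 8 Gaussians, 5-dim features, 4-dim i-vector
rng = np.random.default_rng(1)
C, D, R = 8, 5, 4
N = rng.uniform(0.1, 5.0, size=C)
F = rng.normal(size=C * D)
T = rng.normal(size=(C * D, R))
Sigma_inv = rng.uniform(0.5, 2.0, size=C * D)
w = extract_ivector(N, F, T, Sigma_inv)
```

In the cited pipeline the statistics come from the 60-dimensional LDA+STC features, and T is trained so that the GMM-UBM supervector can be compressed to a 100-dimensional i-vector.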
“…A variety of methods for the analysis of sensor data [1]-[4] and the extraction of meaningful patterns from these data have been proposed in recent decades [5]. Data collected by various sensors such as image, voice, electromyography (EMG) and chemical sensors are used for different applications such as image recognition [6]-[8], speech recognition [9], [10], gesture recognition [11]-[14] and gas classification [15]-[20]. The performance of classification techniques using sensor data varies greatly depending not only on the amount of data collected but also on the quality of the data.…”
Section: Introduction
confidence: 99%