“…With the great success of deep learning-based methods in speech recognition [10], visual question answering [11] and NLP, scholars have also made some progress in the application of deep learning to continuous SLR [12][13][14]. Many deep learning-based methods have been applied to visual feature extraction and sequence model learning for SLR.…”
Section: Related Work
“…Finally, the sequence feature S is obtained using a weighted residual connection and layer normalisation. As shown in Equations (9) and (10):…”
Section: Multi-scale Mixing To Enhance Attention
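The snippet above describes obtaining the sequence feature S via a weighted residual connection followed by layer normalisation, but Equations (9) and (10) themselves are not reproduced in this excerpt. A minimal sketch of that common pattern, assuming a form S = LayerNorm(α·x + (1−α)·Sublayer(x)) with a hypothetical weight α (the paper's exact weighting may differ):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalise each time step's feature vector to zero mean, unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def weighted_residual(x, sublayer_out, alpha=0.5):
    """Weighted residual connection followed by layer normalisation:
    S = LayerNorm(alpha * x + (1 - alpha) * Sublayer(x))."""
    return layer_norm(alpha * x + (1.0 - alpha) * sublayer_out)

# Toy usage: T time steps, d-dimensional features.
T, d = 4, 8
x = np.random.randn(T, d)
attn_out = np.random.randn(T, d)   # stand-in for the attention sub-layer output
S = weighted_residual(x, attn_out)
```

After normalisation each time step's feature vector has (approximately) zero mean and unit variance, which is what makes the residual sum well-scaled for the following layers.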
To address the problems of feature extractors lacking strongly supervised training and of insufficient temporal information in single-sequence model learning, a hierarchical sequence memory network with a multi-level iterative optimisation strategy is proposed for continuous sign language recognition. The method uses a spatial-temporal fusion convolution network (STFC-Net) to extract spatial-temporal information from RGB and optical-flow video frames, yielding multi-modal visual features of a sign language video. To strengthen the temporal relationships of the visual feature maps, a hierarchical memory sequence network then captures local utterance features and global context dependencies across the time dimension to obtain sequence features. Finally, a decoder produces the output sentence sequence. To further improve the feature extractor, the authors adopt a multi-level iterative optimisation strategy to fine-tune STFC-Net and the utterance feature extractor. Experimental results on the RWTH-Phoenix-Weather 2014 multi-signer dataset and a Chinese sign language dataset show the effectiveness and superiority of the method.
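The abstract describes combining RGB and optical-flow streams into multi-modal visual features. A minimal sketch of such two-stream fusion, assuming simple per-frame feature extraction and concatenation (the actual STFC-Net fusion is not specified in this excerpt, and the linear extractor below is a stand-in):

```python
import numpy as np

def extract_features(frames, W):
    """Stand-in per-frame feature extractor (a linear map here;
    STFC-Net would use spatial-temporal convolutions)."""
    return frames @ W

T, d_in, d_feat = 10, 32, 16
rgb = np.random.randn(T, d_in)         # flattened RGB frame descriptors
flow = np.random.randn(T, d_in)        # flattened optical-flow descriptors
W_rgb = np.random.randn(d_in, d_feat)
W_flow = np.random.randn(d_in, d_feat)

# Multi-modal visual features: concatenate the two streams per time step.
visual = np.concatenate([extract_features(rgb, W_rgb),
                         extract_features(flow, W_flow)], axis=-1)
```

The fused sequence of shape (T, 2·d_feat) is what a downstream sequence model (here, the hierarchical memory network) would consume.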
“…That is, SOP only focuses on the order of sentences and has no influence on the subject [18]. The ALBERT model input needs to add [CLS] at the beginning of the text, and the output vector corresponding to the input [CLS] contains the information encoding of the whole sentence, which can be used for text classification tasks [19].…”
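The snippet notes that the encoder's output at the [CLS] position encodes the whole sentence and can drive text classification. A minimal sketch of that step, assuming a precomputed [CLS] vector and a hypothetical linear classification head (768 matches ALBERT-base's hidden size; the random vector below merely stands in for the encoder output):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def classify_from_cls(cls_vec, W, b):
    """Linear classification head on the [CLS] sentence encoding."""
    return softmax(cls_vec @ W + b)

hidden, n_classes = 768, 3
cls_vec = np.random.randn(hidden)      # stand-in for the encoder's [CLS] output
W = np.random.randn(hidden, n_classes) * 0.01
b = np.zeros(n_classes)
probs = classify_from_cls(cls_vec, W, b)
```

In practice W and b are trained jointly with (or on top of) the fine-tuned encoder, and the argmax of `probs` gives the predicted class.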
To address the poor recognition of rare slot values in spoken language, which lowers the accuracy of spoken language understanding, a deep learning-based spoken language understanding method is designed. Local features of the semantic text are extracted and classified so that the classification results match the dialogue task, and an intention recognition algorithm is designed for the classification results; each datum has a corresponding intention label, completing the semantic slot-filling task. An attention mechanism is applied to the recognition of rare slot-value information: weights over the hidden states and the corresponding slot features are obtained, and the updated slot value is used to represent the tracking state. An auxiliary gate unit is constructed between the upper and lower slots of the historical dialogue, and word vectors are trained with deep learning to complete the spoken language understanding task. Simulation results show that the proposed method can carry out multiple rounds of man-machine spoken dialogue; compared with spoken language understanding methods based on recurrent networks, context information, and label decomposition, it achieves higher accuracy and F1 scores and has greater practical value.
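The abstract says attention weights over the hidden states yield slot features for rare slot values. A minimal sketch of that idea, assuming dot-product attention with a per-slot query vector (the query and all shapes here are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def slot_attention(hidden_states, slot_query):
    """Dot-product attention: weight each hidden state by its relevance
    to the slot query, then return the weighted sum as the slot feature."""
    scores = hidden_states @ slot_query        # (T,) relevance scores
    weights = softmax(scores)                  # attention weights over time
    return weights, weights @ hidden_states    # (T,), (d,)

T, d = 6, 16
hidden_states = np.random.randn(T, d)   # encoder hidden states per token
slot_query = np.random.randn(d)         # learned query for one slot type
weights, slot_feat = slot_attention(hidden_states, slot_query)
```

Because the weights sum to one, `slot_feat` is a convex combination of the hidden states, letting the model focus on the few tokens that mention a rare slot value.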