2021
DOI: 10.3390/fi13070182

Multi-Angle Lipreading with Angle Classification-Based Feature Extraction and Its Application to Audio-Visual Speech Recognition

Abstract: Recently, automatic speech recognition (ASR) and visual speech recognition (VSR) have been widely researched owing to developments in deep learning. Most VSR research focuses only on frontal face images. However, in real scenes, a VSR system should correctly recognize spoken content not only from frontal faces but also from diagonal or profile faces. In this paper, we propose a novel VSR method that is applicable to faces captured at any angle. Firstly, view classification is carried out to esti…
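As a rough illustration of the pipeline the abstract outlines, the sketch below first classifies the face angle and then routes the lip image to an angle-specific feature extractor. This is a hedged approximation, not the authors' implementation: the number of angle classes, the layer sizes, and the hard routing scheme are all assumptions.

```python
# Hedged sketch of angle classification-based feature extraction (assumed architecture).
import torch
import torch.nn as nn

class AngleClassifier(nn.Module):
    """Predicts a discrete face angle (e.g., frontal / diagonal / profile)."""
    def __init__(self, num_angles=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_angles),
        )

    def forward(self, x):
        return self.net(x)  # logits over angle classes

class MultiAngleLipEncoder(nn.Module):
    """One feature extractor per angle class; the classifier selects which one to use."""
    def __init__(self, num_angles=5, feat_dim=256):
        super().__init__()
        self.angle_clf = AngleClassifier(num_angles)
        self.extractors = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, feat_dim),
            )
            for _ in range(num_angles)
        ])

    def forward(self, lip_frame):
        # Per-sample hard routing: predict the angle, then apply that angle's extractor.
        angle = self.angle_clf(lip_frame).argmax(dim=1)
        feats = torch.stack([self.extractors[a](x.unsqueeze(0)).squeeze(0)
                             for a, x in zip(angle.tolist(), lip_frame)])
        return feats, angle

# Usage: a batch of 4 grayscale lip-region frames (1 x 64 x 64 each).
frames = torch.randn(4, 1, 64, 64)
features, predicted_angle = MultiAngleLipEncoder()(frames)
print(features.shape, predicted_angle.shape)  # torch.Size([4, 256]) torch.Size([4])
```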

Cited by 9 publications (3 citation statements) · References 23 publications
“…When teachers exercise their educational disciplinary power, the characteristics of their speech heighten students' auditory sensitivity, and students in turn analyze those speech characteristics by ear. In this section, a CNN and an LSTM network, combined with a Gammatone auditory filter, are used to simulate how students listen to the features of teachers' speech, which provides a basis for predicting changes in students' auditory emotion while educational discipline is being exercised [27][28]. The teacher's speech signal is first segmented into equal-length segments, many of which have energy so low that the human ear can hardly perceive the emotional information they contain.…”
Section: Teacher Speech Feature Extraction Based on Cochlear Filtering (mentioning, confidence: 99%)
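The excerpt above describes splitting the teacher's speech signal into equal-length segments and discarding the ones whose energy is too low to carry perceptible emotional content. The following is a minimal sketch of that pre-processing step only, assuming NumPy; the segment length and the energy threshold are arbitrary choices, since neither is specified in the excerpt.

```python
# Hedged sketch: equal-length segmentation plus removal of near-silent segments.
import numpy as np

def segment_and_filter(signal, sr, seg_len_s=0.5, energy_ratio=0.05):
    """Return equal-length segments whose mean energy exceeds a fraction of the maximum."""
    seg_len = int(seg_len_s * sr)
    n_segs = len(signal) // seg_len
    segments = signal[: n_segs * seg_len].reshape(n_segs, seg_len)
    energy = np.mean(segments ** 2, axis=1)          # per-segment mean energy
    keep = energy > energy_ratio * energy.max()      # drop segments the ear would miss
    return segments[keep]

# Usage with a synthetic 3-second signal at 16 kHz whose first second is silent.
sr = 16000
t = np.arange(3 * sr) / sr
speech_like = np.sin(2 * np.pi * 220 * t) * (t > 1.0)
kept = segment_and_filter(speech_like, sr)
print(kept.shape)  # (4, 8000): the two silent half-second segments are removed
```

The retained segments would then feed the Gammatone filterbank and the CNN-LSTM front end described in the excerpt.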
“…The MPEG-7 standard defines 17 time- and frequency-domain descriptors, among which audio signature features represent information unique to a piece of audio and are therefore often used for audio recognition [26]. In the proposed method, the audio stream is first extracted from the dance video; audio signature features are then extracted from that stream, and an audio dictionary is constructed following the bag-of-words model.…”
Section: Audio Feature Extraction (mentioning, confidence: 99%)
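The excerpt describes extracting the audio stream from the dance video, computing audio signature features, and building an audio dictionary with a bag-of-words model. The sketch below is an assumption-laden approximation of that idea: MFCC frames stand in for the MPEG-7 audio-signature descriptors, librosa and scikit-learn supply the feature extraction and codebook clustering, and the file paths are placeholders; only the overall dictionary-plus-histogram construction follows the excerpt.

```python
# Hedged sketch: bag-of-words audio dictionary over frame-level features.
import numpy as np
import librosa
from sklearn.cluster import KMeans

def frame_features(path, sr=22050, n_mfcc=13):
    """Frame-level features for one audio file (extracted beforehand from the video)."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # shape: (frames, n_mfcc)

def build_dictionary(feature_list, n_words=64):
    """Cluster all frames into n_words codewords (the audio dictionary)."""
    all_frames = np.vstack(feature_list)
    return KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(all_frames)

def bag_of_words(features, dictionary):
    """Normalized histogram of codeword assignments for one clip."""
    words = dictionary.predict(features)
    hist = np.bincount(words, minlength=dictionary.n_clusters).astype(float)
    return hist / hist.sum()

# Usage (paths are placeholders; pull the audio track out of each video first,
# e.g. with ffmpeg):
# feats = [frame_features(p) for p in ["clip1.wav", "clip2.wav"]]
# dictionary = build_dictionary(feats)
# clip_vector = bag_of_words(feats[0], dictionary)
```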
“…Extracting speaker-specific information from complex speech data is trivial for humans but a challenging task for computers. With the development of artificial intelligence, deep learning was introduced into the field of speech recognition in 2009 [1][2][3][4]. In just a few years, it has come to be widely used in speech recognition, speaker recognition, text recognition, emotion recognition, and other related fields.…”
Section: Introduction (mentioning, confidence: 99%)