2021
DOI: 10.1051/e3sconf/202133301009

Artificial neural network technology for lips reading

Abstract: The paper presents the use of neural networks for the task of automated speech reading by lips articulation. Speech recognition is performed in two stages. First, a face search is performed and the lips area is selected in a separate frame of the video sequence using Haar features. Then the sequence of frames goes to the input of deep learning convolutional and recurrent neural networks for speech viseme recognition. Experimental studies were carried out using independently obtained videos with Russian-speakin…
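
The first stage described in the abstract (find the face, then select the lips area in each frame) can be illustrated with a minimal sketch. It does not reproduce the paper's method: it assumes a face bounding box is already available for each frame (in practice it might come from a Haar cascade detector such as OpenCV's `cv2.CascadeClassifier.detectMultiScale`), and it replaces Haar-based lip selection with a simple geometric heuristic. The names `Box`, `lip_region`, `crop`, and the percentages `top_pct` and `side_pct` are hypothetical choices made for illustration only.

```python
from typing import List, NamedTuple, Sequence


class Box(NamedTuple):
    """Axis-aligned rectangle: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int


def lip_region(face: Box, top_pct: int = 65, side_pct: int = 25) -> Box:
    """Estimate a lip box from a face box.

    Assumption (not from the paper): the lips lie in the lower part of the
    face box, below top_pct percent of its height, and within the central
    band left after trimming side_pct percent of the width from each side.
    """
    dx = face.w * side_pct // 100   # horizontal margin trimmed per side
    dy = face.h * top_pct // 100    # vertical offset from the top of the face
    return Box(face.x + dx, face.y + dy, face.w - 2 * dx, face.h - dy)


def crop(frame: Sequence[Sequence], box: Box) -> List[list]:
    """Cut the box out of a frame stored as a row-major grid of pixels."""
    return [list(row[box.x:box.x + box.w]) for row in frame[box.y:box.y + box.h]]


def lip_sequence(frames: Sequence[Sequence[Sequence]], faces: Sequence[Box]) -> List[List[list]]:
    """Crop the estimated lip region from every frame of a video sequence."""
    return [crop(frame, lip_region(face)) for frame, face in zip(frames, faces)]
```

The resulting sequence of lip crops is what a second stage, such as the convolutional and recurrent networks mentioned in the abstract, would take as input; in this sketch that stage is left out entirely.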

Cited by 3 publications (3 citation statements)
References 5 publications
“…A neural network is a collection of interconnected computational nodes inspired by the structure of the human brain [13], [14]. ANN, like the brain's internal network, is made up of a collection of neurons that work together to process and convert incoming data [15], [16]. The term "weight" is used to describe this connection.…”
Section: Methods (mentioning)
confidence: 99%
“…Linear Regression is divided into two parts: simple and multiple linear regression. Simple linear regression is an equation model that uses the relationship of one independent variable/predictor (X) with a dependent variable/response (Y) [15]. The difference in the multiple linear regression method is that the independent variables have more than one variable [16].…”
Section: Methods (mentioning)
confidence: 99%
“…Therefore, the end-to-end deep learning architecture is the natural direction of scientific research as opposed to the previous manual feature extraction classification methodology for automatic lip-reading technology. Researchers use convolutional neural networks (CNN) to focus on interest regions in the last few years, and they have also been quite successful at classifying images and detecting targets [8], [9].…”
Section: Article History (mentioning)
confidence: 99%