2020
DOI: 10.2478/acss-2020-0007

Hand Gesture Recognition in Video Sequences Using Deep Convolutional and Recurrent Neural Networks

Abstract: Deep learning is a new branch of machine learning that researchers use widely in artificial intelligence applications, including signal processing and computer vision. The present research investigates the use of deep learning to solve the hand gesture recognition (HGR) problem and proposes two models based on deep learning architectures. The first model comprises a convolutional neural network (CNN) and a recurrent neural network with long short-term memory (RNN-LSTM). The accuracy of model ac…

Citations: cited by 17 publications (7 citation statements)
References: 17 publications
“…Their model when used on the VIVA challenge dataset, attained classification precision of 77.5%. In [17], the researchers introduced two models for hand gesture recognition. The first model consists of CNN and an RNN-LSTM network.…”
Section: Related Work (mentioning)
confidence: 99%
“…They attained an overall accuracy of 85.46 %. Fahad Obaid et al [71] proposed a model comprised of two stages to solve the hand gesture recognition (HGR) problem in Video Sequences. The first stage is preprocessing, while the second is to classify, label the frames and recognize the hand gestures through deep learning.…”
Section: Related Work (mentioning)
confidence: 99%
“…To shorten the training period, we began reducing the numbers of extracted frames (1-10) and discovered that 5 frames from each video sequence at equal intervals were sufficient to reflect the dynamics of the gestures without compromising the prediction accuracy. Furthermore, previous researches demonstrate that human action recognition may be accomplished by just a few frames (1-7 frames) [77,78].…”
Section: B Pre-processing (mentioning)
confidence: 99%
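
The equal-interval frame selection described in the statement above can be illustrated with a minimal sketch. The citing paper does not say how the frame positions were computed, so the function names (`equally_spaced_indices`, `sample_frames`) and the choice to always include the first and last frame are assumptions made here for illustration only.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def equally_spaced_indices(num_frames: int, num_samples: int = 5) -> List[int]:
    """Return up to `num_samples` frame indices spread evenly over a clip.

    Assumption: the first and last frames are always included when more
    than one sample is requested. The source does not specify this.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    if num_samples >= num_frames:
        # Clip is shorter than the request: keep every frame.
        return list(range(num_frames))
    if num_samples == 1:
        # A single sample: take the middle frame.
        return [num_frames // 2]
    last = num_frames - 1
    # Integer arithmetic keeps the result deterministic (no float rounding).
    return [(i * last) // (num_samples - 1) for i in range(num_samples)]


def sample_frames(frames: Sequence[T], num_samples: int = 5) -> List[T]:
    """Pick the frames at the equally spaced indices."""
    return [frames[i] for i in equally_spaced_indices(len(frames), num_samples)]
```

For a 41-frame clip, `equally_spaced_indices(41)` selects frames 0, 10, 20, 30, and 40. In practice `frames` would be a list of decoded video frames; any sequence works here.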
“…Jarrín in 2020 [10] used a pre-trained convolutional neural network, trained on approximately 1.2 million images, together with the human pose estimation algorithm OpenPose, and proposed a computer-vision system for detecting joint angles in upper-limb movements for physiotherapy assessment; it achieved a reliability level of 92.60% and was implemented with Python and TensorFlow. Obaid et al. in 2020 [11] used deep convolutional (CNN) and recurrent (LSTM) neural networks for hand gesture recognition in video sequences; in their work they created two models to determine the optimal CNN and LSTM architecture. Model one obtained 82% accuracy working with the color channel and 89% with the depth channel, whereas the second model achieved 93% accuracy working with the color and depth channels.…”
Section: Related Work (unclassified)