SIGNFORMER: DeepVision Transformer for Sign Language Recognition

Kothadiya, Deep; Bhatt, Chintan; Saba, Tanzila; Khan, Azmat Ullah

doi:10.1109/access.2022.3231130

Cited by 38 publications

(9 citation statements)

References 20 publications


“…The computational features that have been previously described are processed by a three-layer MLP [27] network to detect the liveness of fingerprints. For the sake of clarity, a description of the training procedure for the Dual Attention model is provided in the form of Algorithm 1 down below.…”

Section: Methods (mentioning)

confidence: 99%

Enhancing Fingerprint Liveness Detection Accuracy Using Deep Learning: A Comprehensive Study and Novel Approach

et al. 2023

Self Cite


Liveness detection for fingerprint impressions plays a role in the meaningful prevention of any unauthorized activity or phishing attempt. The accessibility of unique individual identification has increased the popularity of biometrics. Deep learning with computer vision has proven remarkable results in image classification, detection, and many others. The proposed methodology relies on an attention model and ResNet convolutions. Spatial attention (SA) and channel attention (CA) models were used sequentially to enhance feature learning. A three-fold sequential attention model is used along with five convolution learning layers. The method’s performances have been tested across different pooling strategies, such as Max, Average, and Stochastic, over the LivDet-2021 dataset. Comparisons against different state-of-the-art variants of Convolutional Neural Networks, such as DenseNet121, VGG19, InceptionV3, and conventional ResNet50, have been carried out. In particular, tests have been aimed at assessing ResNet34 and ResNet50 models on feature extraction by further enhancing the sequential attention model. A Multilayer Perceptron (MLP) classifier used alongside a fully connected layer returns the ultimate prediction of the entire stack. Finally, the proposed method is also evaluated on feature extraction with and without attention models for ResNet and considering different pooling strategies.
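The sequential use of spatial and channel attention described in this abstract can be illustrated with a small NumPy sketch. It is not the cited authors' implementation: the function names, the weight shapes, the reduction ratio `r`, the order of the two stages, and the simplified spatial mask (a weighted mix of the channel-wise average and max maps rather than a learned convolution) are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat, w_avg=0.5, w_max=0.5):
    """Reweight each spatial location of a (C, H, W) feature map.

    Assumption: the mask is a fixed mix of the channel-wise average and
    max maps; a trained model would learn this mapping (e.g. with a conv).
    """
    avg_map = feat.mean(axis=0)              # (H, W)
    max_map = feat.max(axis=0)               # (H, W)
    mask = sigmoid(w_avg * avg_map + w_max * max_map)
    return feat * mask[None, :, :]

def channel_attention(feat, w1, w2):
    """Reweight each channel of a (C, H, W) feature map.

    A shared two-layer MLP (w1, w2 are hypothetical weights) scores the
    average-pooled and max-pooled channel descriptors.
    """
    avg = feat.mean(axis=(1, 2))             # (C,)
    mx = feat.max(axis=(1, 2))               # (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,), each in (0, 1)
    return feat * weights[:, None, None]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                      # r: assumed reduction ratio
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

# Spatial attention first, then channel attention (order assumed).
out = channel_attention(spatial_attention(feat), w1, w2)
print(out.shape)
```

Both stages only rescale the input by factors between 0 and 1, so the output keeps the input's shape and can be passed on to further convolution layers or a classifier.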



“…This model has 3.8 million trainable parameters and an average inference time of 2.23 s, which is comparatively more than the proposed model. The vision transformer in [34] used eight layers of the encoder with four heads with seven million parameters, which is computationally expensive. The authors in [21] have used ensemble network architecture employing ResNet50 with an attention module with more training parameters and epochs.…”

Section: Time Complexity and Order Of The Proposed Methods (mentioning)

confidence: 99%

Fusion of Attention-Based Convolution Neural Network and HOG Features for Static Sign Language Recognition

Kumari, Anand 2023

Applied Sciences


The deaf and hearing-impaired community expresses their emotions, communicates with society, and enhances the interaction between humans and computers using sign language gestures. This work presents a strategy for efficient feature extraction that uses a combination of two different methods that are the convolutional block attention module (CBAM)-based convolutional neural network (CNN) and standard handcrafted histogram of oriented gradients (HOG) feature descriptor. The proposed framework aims to enhance accuracy by extracting meaningful features and resolving issues like rotation, similar hand orientation, etc. The HOG feature extraction technique provides a compact feature representation that signifies meaningful information about sign gestures. The CBAM attention module is incorporated into the structure of CNN to enhance feature learning using spatial and channel attention mechanisms. Then, the final feature vector is formed by concatenating these features. This feature vector is provided to the classification layers to predict static sign gestures. The proposed approach is validated on two publicly available static Massey American Sign Language (ASL) and Indian Sign Language (ISL) databases. The model’s performance is evaluated using precision, recall, F1-score, and accuracy. Our proposed methodology achieved 99.22% and 99.79% accuracy for the ASL and ISL datasets. The acquired results signify the efficiency of the feature fusion and attention mechanism. Our network performed better in accuracy compared to the earlier studies.
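The feature-fusion step described in this abstract, in which a handcrafted descriptor and learned CNN features are concatenated into one vector, can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited paper's pipeline: `orientation_histogram` is a hypothetical single-cell, HOG-style descriptor (no cells, blocks, or block normalization), and `cnn_vec` is a random stand-in for the output of an attention-based CNN.

```python
import numpy as np

def orientation_histogram(img, bins=9):
    """Simplified HOG-style descriptor: one magnitude-weighted histogram of
    unsigned gradient orientations over the whole image (assumption: a real
    HOG would use local cells and block normalization)."""
    gy, gx = np.gradient(img.astype(float))        # derivatives along rows, cols
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

rng = np.random.default_rng(0)
img = rng.random((32, 32))                 # stand-in for a grayscale sign image
hog_vec = orientation_histogram(img)       # handcrafted features, shape (9,)
cnn_vec = rng.standard_normal(16)          # placeholder for learned CNN features

# Fusion by concatenation: the classifier sees both feature sets at once.
fused = np.concatenate([hog_vec, cnn_vec])
print(fused.shape)
```

Concatenation keeps both feature sets intact and leaves it to the classification layers to weigh them, at the cost of a longer input vector.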


“…Several methods have been proposed domestically and internationally, from traditional to deep learning-based. Deep learning (DL), a crucial segment of machine learning, has gained prominence in medical image analysis, driving the pursuit of artificial intelligence (AI) in medical imaging [15], [16]. It significantly contributes to computer vision and medical image analysis, using neural networks to aid specialists in diagnosis, reduce radiologists' workload, and enhance efficiency.…”

Section: Medical Research Heavily Relies On Medical Image Analysis a ... (mentioning)

confidence: 99%

Recent Advancements and Future Prospects in Active Deep Learning for Medical Image Segmentation and Classification

Mahmood, Rehman, Saba et al. 2023

IEEE Access

Self Cite


Medical images are helpful for the diagnosis, treatment, and evaluation of diseases. Precise medical image segmentation improves diagnosis and decision-making, aiding intelligent medical services for better disease management and recovery. Due to the unique nature of medical images, image segmentation algorithms based on deep learning face problems such as sample imbalance, edge blur, false positives, and false negatives. In view of these problems, researchers primarily improve the network structure but rarely improve from the unstructured aspect. The paper tackles these challenges, accentuating the limitations of deep convolutional neural network-based methods and proposing solutions to reduce annotation costs, particularly in complex images, and introduces the improvement strategies to solve the problems of sample imbalance, edge blur, false positives, and false negatives. Additionally, the article introduces the latest deep learning-based applications in medical image analysis, covering segmentation, image acquisition, enhancement, registration, and classification. Moreover, the article provides an overview of four cutting-edge deep learning models, namely convolutional neural network (CNN), deep belief network (DBN), stacked autoencoder (SAE), and recurrent neural network (RNN). The study selection involved searching benchmark academic databases, collecting relevant literature and appropriate indicator for analysis, emphasizing DL-based segmentation and classification approaches, and evaluating performance metrics. The research highlights clinicians' and scholars' obstacles in developing an efficient and accurate malignancy prognostic framework based on state-of-the-art deep-learning algorithms. Furthermore, future perspectives are explored to overcome challenges and advance the field of medical image analysis.

