2021
DOI: 10.48550/arxiv.2111.09641
Preprint

Evaluating Transformers for Lightweight Action Recognition

Abstract: In video action recognition, transformers consistently reach state-of-the-art accuracy. However, many models are too heavyweight for the average researcher with limited hardware resources. In this work, we explore the limitations of video transformers for lightweight action recognition. We benchmark 13 video transformers and baselines across 3 large-scale datasets and 10 hardware devices. Our study is the first to evaluate the efficiency of action recognition models in depth across multiple devices and train a…

Cited by 2 publications (2 citation statements) | References 30 publications
“…In their evaluation of previous work, Koot et al [88] discovered that CNNs perform better than Transformers in terms of the latency-accuracy trade-off on lightweight datasets. CNNs are also described as capturing inductive biases, also known as prior knowledge, such as translation equivariance and localization, while their pooling operations give partial scale invariance [89].…”
Section: Comparative Study Between CNN, Vision Transformer and Hybrid ... (mentioning)
confidence: 99%
“…Previously, Koot et al [120] discovered that CNNs perform better than Transformers in terms of the latency-accuracy trade-off on lightweight datasets. However, CNNs have a few weaknesses, including slowness brought on by the max-pooling operation; additionally, in contrast to Transformers, they do not consider the multiple perspectives that can be gained by learning [121], which leads them to disregard global knowledge.…”
Section: The Roles of Transformers in Predicting the Use of Drug Comb... (mentioning)
confidence: 99%