Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?

Moutik, Oumaima; Sekkat, Hiba; Tigani, Smail; Chehri, Abdellah; Saadane, Rachid; Tchakoucht, Taha Ait; Paul, Anand

doi:10.3390/s23020734

Cited by 49 publications

(30 citation statements)

References 131 publications

“…By considering the global information of the image, ViT is more competitive in some visual classification tasks [44,45]. Scholars have proved that ViT can be used in traffic sign classification [46], plant disease detection [47], and face recognition [48]. Vit-based transfer learning begins to attract more and more attention [49][50][51].…”

Section: Transfer Learning Overview (mentioning)

confidence: 99%

An instance-based deep transfer learning method for quality identification of Longjing tea from multiple geographical origins

Zhang, Wang, Yan, et al. 2023. Complex Intell. Syst.

For practitioners, it is very crucial to realize accurate and automatic vision-based quality identification of Longjing tea. Due to the high similarity between classes, the classification accuracy of traditional image processing combined with machine learning algorithm is not satisfactory. High-performance deep learning methods require large amounts of annotated data, but collecting and labeling massive amounts of data is very time consuming and monotonous. To gain as much useful knowledge as possible from related tasks, an instance-based deep transfer learning method for the quality identification of Longjing tea is proposed. The method mainly consists of two steps: (i) The MobileNet V2 model is trained using the hybrid training dataset containing all labeled samples from source and target domains. The trained MobileNet V2 model is used as a feature extractor, and (ii) the extracted features are input into the proposed multiclass TrAdaBoost algorithm for training and identification. Longjing tea images from three geographical origins, West Lake, Qiantang, and Yuezhou, are collected, and the tea from each geographical origin contains four grades. The Longjing tea from West Lake is regarded as the source domain, which contains more labeled samples. The Longjing tea from the other two geographical origins contains only limited labeled samples, which are regarded as the target domain. Comparative experimental results show that the method with the best performance is the MobileNet V2 feature extractor trained with a hybrid training dataset combined with multiclass TrAdaBoost with linear support vector machine (SVM). The overall Longjing tea quality identification accuracy is 93.6% and 91.5% on the two target domain datasets, respectively. The proposed method can achieve accurate quality identification of Longjing tea with limited samples. It can provide some heuristics for designing image-based tea quality identification systems.

“…It is a subset of neural networks typically employed in contexts of three dimensional units, namely the height, width and depth used for image analysis and also applies to object classification [43]. Since CNNs can learn directly from the raw time series data, extract features from sequences of observations, and require neither domain expertise nor manually engineered input characteristics this current study plan to utilise it to detect the attributes associated to the activities of elderly people in determining their unusual behavior [44].…”

Section: The Convolutional Neural Network Architecture (mentioning)

confidence: 99%

The Key Criteria for Predicting Unusual Behavior in the Elderly With Deep Learning Models Under 5G Technology

Zamzami 2023. IEEE Access

Deep learning algorithms and technology based on 5G networks may be able to help identify unusual behavior in elderly people. Because 5G networks have a lower latency and a greater bandwidth, it is possible to use more complex algorithms and larger data sets for training and detection in a real-time. On top of that, real-time analysis of the data gathered through in-home monitoring of the elderly can become much simpler to carry out with the help of 5G's potential to simplify the process. However, the system needs to be developed in a way that it considers the preferences of the most important criteria of elderly people, who have more requirements in terms of their living situation and the environment in which they live. Therefore, this study advocated using ''Decision Making Trial and Evaluation Laboratory'' (DEMATEL) to analyze the most crucial criteria (feature) required for creating a model for identifying odd behavior in elderly persons. Convolutional Neural Networks (CNNs) and Long Short-Term Memories (LSTMs) are adopted in detecting unusual behavior in the elderly after the analysis key criteria for predicting unusual behavior in the elderly by DEMATEL. The research established a concept by linking the SIMADL dataset with the dimension of elderly people's behavior and performed an experimental analysis using both CNN and LSTM. Performance evaluations show that the LSTM performs better in detecting unusual behavior in elderly persons with 96% accuracy. Depressive disorder is the most significant aspect of ageing that might lead to a typical unusual behavior in elderly persons, according to DEMATEL analysis.

“…However, CNN has a few weaknesses, including a slowness that is brought on by the max pooling operation; additionally, in contrast to the Transformer, it does not consider several perspectives that can be gained by learning, [121] which leads to disregard for global knowledge. Because it offers solutions to CNN's numerous weaknesses, Transformer has quickly become CNN's most formidable opponent [122]. The capability of the Transformer to prioritize relevant content while minimizing the repetition of unimportant content is its strength [123].…”

Section: The Roles Of Transformers In Predicting the Use Of Drug Comb... (mentioning)

confidence: 99%

“…However, it should be mentioned that both algorithms (i.e., CNN and Transformer) have their own shortcomings and benefits and it is still difficult to determine who will win this race. Nevertheless, the hybrid method, which is the most attractive formula because it enables us to take advantage of a model's strengths while simultaneously reducing the effects of that model's downsides, is more efficient and cost‐effective [122]. It combines CNN with transformers to provide a reliable model.…”

Section: The Roles Of Transformers In Predicting the Use Of Drug Comb... (mentioning)

confidence: 99%

“…Because it offers solutions to CNN's numerous weaknesses, Transformer has quickly become CNN's most formidable opponent. [122] The capability of the Transformer to prioritize relevant content while minimizing the repetition of unimportant content is its strength. [123] Furthermore, since less demand is placed on the processing power, the visual characteristics are dynamically re-weighted based on the context.…”

Section: The Roles Of Transformers In Predicting the Use Of Drug Comb... (mentioning)

confidence: 99%

Potential roles of transformers in brain tumor diagnosis and treatment

Lan, Zou, Qin, et al. 2023. Brain-X

Brain tumor (BT) is one of many malignancies that have substantially enhanced global human morbidity and mortality rates. Early detection and characterization of glioma are essential for effective preventive strategies. Currently, the use of Transformers, a deep learning model for BT diagnosis and treatment, is attracting significant attention. The transformer self‐attention mechanism automatically learns the associations between input data for efficient processing and analysis. Research indicates that Transformers could play an essential role in the BT segmentation of magnetic resonance imaging (MRI) images, the MRI and histopathology‐based grading of brain cancer, BT molecular expression prediction, the classification of primary brain metastasis sites, voxel‐level dose and BT radiotherapy outcome prediction, synergistic prediction, and the pathway deconvolution of drug combinations. In this review, the feasibility, accuracy, and applicability of various algorithms are systematically analyzed and their prospects are discussed. Overall, this review aimed to discuss and provide an overview of the increasing applications of Transformers in real‐time BT detection and therapy, indicating their broad prospects and potential. In the future, Transformers are expected to be increasingly used for the diagnosis and subsequent treatment of BT because of the continuous development and improvement of Transformer‐based deep learning technology. However, more work is required to investigate their properties for anomaly detection, medical image classification, network design development, and application to other medical data.
