2023
DOI: 10.1016/j.patcog.2023.109547
AA-trans: Core attention aggregating transformer with information entropy selector for fine-grained visual classification

Cited by 20 publications (4 citation statements)
References 11 publications
“…We first compile a list of common model structures to find the best pre-trained models for plant disease diagnosis. As shown in Table 4, we gather a variety of structural models such as AlexNet [8], VGGNet [9], GoogleNet [10], ResNet [21], DenseNet [42], EfficientNet [43], and others that can be used directly for plant disease classification or as backbone networks for other, more complex models [44, 45] that implement tasks such as plant disease detection and segmentation. Furthermore, to facilitate the use of plant disease pre-trained models in detection and segmentation tasks, we collect the backbone network structures of the YOLO (You Only Look Once) series, Mask R-CNN, FCN (Fully Convolutional Network), and DeepLab as part of the list of pre-trained models.…”
Section: Methods
confidence: 99%
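The statement above describes collecting a catalogue of pre-trained backbones and matching them to tasks (classification, detection, segmentation). A minimal sketch of that kind of model registry is shown below; the model names and task labels are illustrative stand-ins, not the authors' actual list or API.

```python
# Hypothetical registry mapping backbone names to the tasks they serve,
# in the spirit of the pre-trained-model list described in the quote.
# Names and roles are illustrative assumptions, not the paper's table.
BACKBONES = {
    "alexnet": "classification",
    "vgg16": "classification",
    "resnet50": "classification or detection backbone",
    "densenet121": "classification",
    "efficientnet_b0": "classification",
    "yolov5": "detection",
    "mask_rcnn": "detection/segmentation",
    "fcn": "segmentation",
    "deeplabv3": "segmentation",
}

def candidates(task):
    """Return registry entries whose declared role mentions the given task."""
    return [name for name, role in BACKBONES.items() if task in role]

print(candidates("segmentation"))  # -> ['mask_rcnn', 'fcn', 'deeplabv3']
```

In practice such a registry would map names to constructors (e.g. weight-loading functions) rather than strings; the string form here just keeps the lookup pattern visible.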
“…To distinguish objects belonging to subordinate categories is the goal of fine-grained visual categorization [22]. It is considered a very challenging task, since fine-grained images naturally exhibit small inter-class variation and large intra-class variation.…”
Section: Related Work
confidence: 99%
“…Eq. (22) gives the percentage of cases that were predicted to be negative but turned out to be positive. The comparison in Table II highlights the FPR and FNR performance of the Hybrid DL-Attention Mechanism. The results show the potential of attention mechanisms in deep learning: they can considerably improve the model's performance on tasks that require a careful balance between reducing false alarms and missed detections.…”
Section: Training and Testing
confidence: 99%
“…Due to these practical issues, researchers have spent considerable effort on recognition models based on a single image, but at best these currently achieve an accuracy of only about 75% (Wang et al, 2023), with little recent improvement. The accuracy appears to have reached a ceiling.…”
Section: Introduction
confidence: 99%