The purpose of fine-grained image classification is to distinguish subcategories belonging to the same basic-level category, for example, two hundred subcategories of birds. It has been a challenging topic in computer vision in recent years due to the small inter-class variance among different subcategories (e.g., color and texture) and the large intra-class variance within the same subcategory (e.g., pose and viewpoint). In this paper, we propose Compound Model Scaling with Efficient Attention (CMSEA) for fine-grained image classification, which carefully balances the dimensions of width, depth, and image resolution in model scaling. Furthermore, the proposed method employs an additional attention module with low computational cost to efficiently learn subtler features from discriminative regions. In addition, regularization and data augmentation are employed during training to improve accuracy. Extensive experiments demonstrate that CMSEA achieves 90.63%, 94.51%, and 95.19% accuracy on the CUB-200-2011, FGVC-Aircraft, and Stanford Cars datasets, respectively. In particular, on CUB-200-2011, CMSEA obtains 2.3% higher accuracy with 18% fewer network parameters than the original approach. Consequently, our method offers better accuracy and parameter efficiency than most existing methods.
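The compound scaling referred to above can be sketched as follows. This is a minimal illustration of EfficientNet-style compound scaling: the baseline coefficients (alpha, beta, gamma) are those reported for EfficientNet, since the abstract does not state CMSEA's exact values; the base dimensions are likewise illustrative.

```python
import math

# EfficientNet-style compound scaling: depth, width, and resolution are
# scaled jointly by a single compound coefficient phi. The per-dimension
# factors below are the EfficientNet baseline values (an assumption here,
# not CMSEA's published coefficients).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution factors

def compound_scale(base_depth, base_width, base_resolution, phi):
    """Scale a baseline network's dimensions by compound coefficient phi."""
    depth = math.ceil(base_depth * ALPHA ** phi)
    width = math.ceil(base_width * BETA ** phi)
    resolution = math.ceil(base_resolution * GAMMA ** phi)
    return depth, width, resolution

# The factors are chosen so that each unit increase of phi roughly
# doubles FLOPs: alpha * beta^2 * gamma^2 should be close to 2.
assert abs(ALPHA * BETA**2 * GAMMA**2 - 2.0) < 0.1
```

With phi = 0 the baseline is returned unchanged; larger phi grows all three dimensions in a balanced way instead of scaling any single one in isolation.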
It is an essential and challenging task to accurately identify unknown plants from images without professional knowledge, due to the large intra-class variance and small inter-class variance. Aiming at the problems of low accuracy and model complexity, a lightweight plant species recognition algorithm using EfficientNet with Efficient Channel Attention (ECAENet) is proposed. The approach is based on EfficientNet, which uses neural architecture search to obtain a baseline network and uniformly scales all dimensions of depth, width, and resolution with a compound coefficient. To overcome the complexity of the Squeeze-and-Excitation block, the proposed method replaces the two fully-connected layers in the channel attention modules with a fast one-dimensional convolution with an adaptive kernel, which avoids dimensionality reduction and effectively learns discriminative features. The experimental results demonstrate that our ECAENet achieves 99.56%, 99.75%, 98.40%, and 93.79% accuracy on the well-known Swedish Leaf, Flavia Leaf, Oxford Flowers, and Leafsnap datasets, respectively. In particular, our method uses 3.6x fewer network parameters and 8.4x fewer FLOPs than other methods with similar accuracy. Therefore, our method achieves better recognition performance compared to most existing plant recognition methods.
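The adaptive kernel mentioned above can be sketched as follows. This follows the kernel-size rule of the ECA (Efficient Channel Attention) block, where the 1-D convolution width k grows with the channel count C; the defaults gamma = 2 and b = 1 are the ones used in the original ECA formulation, and the sketch is illustrative rather than ECAENet's exact implementation.

```python
import math

# ECA-style adaptive kernel size: a single 1-D convolution of width k
# over the channel-wise global-average-pooled descriptor replaces the
# two fully-connected layers of a Squeeze-and-Excitation block, so no
# dimensionality reduction is needed.
def eca_kernel_size(channels, gamma=2, b=1):
    """Map channel count C to an odd 1-D convolution kernel width k."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1  # force k to be odd
```

For example, a 256-channel feature map yields k = 5, so the channel attention for each channel is computed from only its 5 nearest neighbors in the pooled descriptor, at negligible parameter cost compared to two fully-connected layers.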