Fine-grained single-label classification tasks aim to distinguish highly similar categories but often overlook inter-category relationships. Hierarchical multi-granularity visual classification aims to assign image labels at multiple levels of a hierarchy, offering users an optimized choice of labels. This paper addresses the hierarchical multi-granularity classification problem from two perspectives: (1) effective utilization of labels at different levels and (2) efficient learning to distinguish multi-granularity visual features. To tackle these issues,
we propose a novel multi-granularity hypergraph-enhanced Hierarchical Neural Network (HNN) framework that integrates Swin Transformers and hypergraph neural networks for visual classification.
First, we employ a Swin Transformer as an Image Hierarchical Feature Learning (IHFL) module to capture hierarchical features. Second, a Feature Reassemble (FR) module rearranges features at different hierarchy levels, creating a spectrum of features from coarse to fine-grained. Third, to uncover the correlations between features at different granularities, we propose a Feature Relationship Mining (FRM) module. Within this module, a learnable hypergraph modeling method constructs coarse-to-fine-grained hypergraph structures, while multi-granularity hypergraph neural networks explore the grouping relationships among features at each granularity, enhancing semantic feature representation learning in the hypergraph space. Finally, we adopt a Multi-Granularity Classifier (MGC) to predict hierarchical label probabilities. Experimental results demonstrate that HNN outperforms other state-of-the-art classification methods on three multi-granularity datasets.
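To make the hypergraph step more concrete, the following is a minimal NumPy sketch of a single hypergraph convolution over a set of feature vectors. It is an illustration under stated assumptions, not the FRM module itself: the hypergraph here is built with a fixed k-nearest-neighbour rule as a stand-in for the learnable hypergraph modeling described above, the propagation rule is the standard normalized form X' = ReLU(Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta), and the function names (`knn_hypergraph`, `propagation_operator`, `hypergraph_conv`) are hypothetical.

```python
import numpy as np


def knn_hypergraph(features, k):
    """Build an incidence matrix H (n_nodes x n_edges).

    Assumption: one hyperedge per node, containing the node itself and its
    k-1 nearest neighbours by Euclidean distance. This fixed rule stands in
    for a learnable construction. Requires k <= number of nodes.
    """
    n = features.shape[0]
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Force each node to rank first in its own neighbour list.
    np.fill_diagonal(dist, -np.inf)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(dist[e])[:k]
        H[members, e] = 1.0
    return H


def propagation_operator(H, edge_weights=None):
    """Return Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} for incidence matrix H."""
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_weights is None else np.asarray(edge_weights)
    node_degree = H @ w            # weighted number of hyperedges per node
    edge_degree = H.sum(axis=0)    # number of nodes per hyperedge
    dv_inv_sqrt = 1.0 / np.sqrt(node_degree)
    left = dv_inv_sqrt[:, None] * H          # Dv^{-1/2} H
    scaled = left * (w / edge_degree)[None, :]  # ... W De^{-1}
    return scaled @ left.T                   # ... H^T Dv^{-1/2}


def hypergraph_conv(X, H, Theta, edge_weights=None):
    """One hypergraph convolution layer followed by ReLU."""
    A = propagation_operator(H, edge_weights)
    return np.maximum(A @ X @ Theta, 0.0)


# Toy usage: 8 feature vectors of dimension 16 projected to dimension 4.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(8, 16))
H_demo = knn_hypergraph(X_demo, k=3)
Theta_demo = rng.normal(size=(16, 4))
out_demo = hypergraph_conv(X_demo, H_demo, Theta_demo)
```

In this sketch, varying `k` changes how many features each hyperedge groups together, which is one simple way to think about coarser versus finer grouping. How the paper's learnable construction actually sets the hypergraph structure at each granularity is not specified here.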