We develop a fine-grained image classifier using a general deep convolutional neural network (DCNN). We improve the fine-grained image classification accuracy of a DCNN model in two ways. First, to better model the h-level hierarchical label structure of the fine-grained image classes contained in the given training data set, we replace the top fully connected (fc) layer of a given DCNN model with h fc layers and train them with the cascaded softmax loss. Second, we propose a novel loss function, the generalized large-margin (GLM) loss, which makes the given DCNN model explicitly exploit the hierarchical label structure and the similarity regularities of the fine-grained image classes. The GLM loss not only reduces the between-class similarity and within-class variance of the features learned by DCNN models, but also makes subclasses belonging to the same coarse class more similar to each other in the feature space than those belonging to different coarse classes. Moreover, the proposed fine-grained image classification framework is independent of the network architecture and can be applied to any DCNN structure. Comprehensive experimental evaluations of several general DCNN models (AlexNet, GoogLeNet, and VGG) on three benchmark data sets (Stanford Cars, FGVC-Aircraft, and CUB-200-2011) for the fine-grained image classification task demonstrate the effectiveness of our method.
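To make the idea of one fc head per hierarchy level concrete, the following is a minimal sketch in plain Python. It is not the formulation used in this work: it assumes the simplest possible reading, in which each of the h levels has its own fc head on a shared feature vector and the training loss is the sum of the per-level softmax cross-entropies. Any connections between heads implied by the word "cascaded", and the GLM loss itself, are not modeled. The names `softmax_cross_entropy`, `linear`, and `multi_level_loss` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, given raw logits and an integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def linear(features, weights, bias):
    """One fc layer; `weights` holds one row per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def multi_level_loss(features, heads, labels):
    """Sum of per-level softmax losses over h hierarchy levels.

    Assumption (not from the abstract): the h heads share one feature
    vector and their losses are simply added.
    heads:  list of (weights, bias) pairs, one per hierarchy level
    labels: list of integer labels, one per hierarchy level
    """
    if len(heads) != len(labels):
        raise ValueError("need exactly one label per hierarchy level")
    return sum(softmax_cross_entropy(linear(features, w, b), y)
               for (w, b), y in zip(heads, labels))


# Toy example with h = 2: a coarse head (2 classes) and a fine head (3 classes).
features = [0.5, -1.0]
coarse_head = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
fine_head = ([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 0.0])
loss = multi_level_loss(features, [coarse_head, fine_head], [1, 2])
```

With all-zero weights every head predicts a uniform distribution, so the toy loss equals log 2 + log 3. In a real model the feature vector would come from the DCNN backbone and the heads would be trained jointly with it.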