Object detection in aerial images is a fundamental yet challenging task in the remote sensing field. As most objects in aerial images appear in arbitrary orientations, oriented bounding boxes (OBBs) have clear advantages over traditional horizontal bounding boxes (HBBs). However, regression-based OBB detection methods always suffer from ambiguity in the definition of learning targets, which decreases detection accuracy. In this paper, we provide a comprehensive analysis of OBB representations and cast OBB regression as a pixel-level classification problem, which largely eliminates the ambiguity. The predicted masks are subsequently used to generate OBBs. To handle the huge scale changes of objects in aerial images, an Inception Lateral Connection Network (ILCN) is utilized to enhance the Feature Pyramid Network (FPN). Furthermore, a Semantic Attention Network (SAN) is adopted to provide semantic features, which help distinguish objects of interest from the cluttered background effectively. Empirical studies show that the entire method is simple yet efficient. Experimental results on two widely used datasets, i.e., DOTA and HRSC2016, demonstrate that the proposed method outperforms state-of-the-art methods.
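The step of generating an OBB from a predicted mask can be illustrated with a minimal sketch. The abstract does not say how the boxes are derived, so the approach below (principal-axis fitting from second-order pixel moments) is an assumption chosen for illustration, and the function name `mask_to_obb` is hypothetical. Extents are measured between pixel centres.

```python
import math


def mask_to_obb(mask):
    """Fit an oriented box to the foreground pixels of a binary mask.

    Illustrative assumption: the box is aligned with the principal axis
    of the foreground pixels. Returns (cx, cy, w, h, theta), where theta
    is the orientation of the box's w-axis in radians, or None if the
    mask has no foreground pixels.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n

    # Second-order central moments of the foreground pixels.
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n

    # Orientation of the principal (largest-variance) axis.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)

    # Project every pixel onto the principal axis (u) and its normal (v).
    us = [(x - mx) * c + (y - my) * s for x, y in pts]
    vs = [-(x - mx) * s + (y - my) * c for x, y in pts]
    u_min, u_max = min(us), max(us)
    v_min, v_max = min(vs), max(vs)

    # Box centre, mapped back from (u, v) to image coordinates.
    uc, vc = (u_min + u_max) / 2.0, (v_min + v_max) / 2.0
    cx = mx + uc * c - vc * s
    cy = my + uc * s + vc * c
    return cx, cy, u_max - u_min, v_max - v_min, theta
```

In practice a minimum-area rectangle over the mask contour is a common alternative; the moment-based fit is used here only because it needs no external library.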
Fine-grained ship detection is an important task in high-resolution satellite remote sensing applications. However, large aspect ratios and severe category imbalance make fine-grained ship detection a challenging problem. Current methods usually extract square-like features that do not work well for detecting ships with large aspect ratios, and misalignments in feature representation severely degrade the performance of ship localization and classification. To tackle this, we propose a shape-aware feature learning method to mitigate the misalignments during feature extraction. Furthermore, to address category imbalance, we design a shape-aware instance switching method to balance the quantity distribution of ships in different categories, which can greatly improve the network's learning ability for rare instances. To verify the effectiveness of the proposed method, we contribute a multi-category ship detection dataset (MCSD) that contains 4000 images carefully labeled with oriented bounding boxes, including 16 types of ship objects and nearly 18,000 instances. We conduct experiments on our MCSD and ShipRSImageNet, and extensive experimental results demonstrate the superiority of the proposed method over several state-of-the-art methods. Dataset and code will be available at https://guobo98.github.io/shape-aware-shipdet.
The object detection task is usually affected by complex backgrounds. In this paper, a new image object detection method is proposed, which performs multi-feature selection on multi-scale feature maps. In this method, a bidirectional multi-scale feature fusion network is designed to fuse semantic features and shallow features, improving the detection of small objects in complex backgrounds. When the shallow features are transferred to the top layer, a bottom-up path is added to reduce the number of network layers the features must traverse in the fusion network, reducing the loss of shallow features. In addition, a multi-feature selection module based on the attention mechanism is used to minimize the interference of useless information in subsequent classification and regression, allowing the network to adaptively focus on appropriate information for classification or regression to improve detection accuracy. Because the traditional five-parameter regression method has severe boundary problems when predicting objects with large aspect ratios, the proposed network treats angle prediction as a classification task. The experimental results on the DOTA dataset, the self-made DOTA-GF dataset and the HRSC2016 dataset show that, compared with several popular object detection algorithms, the proposed method has certain advantages in detection accuracy.
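Treating angle prediction as classification can be sketched as follows. The abstract gives no bin count, angle range, or label encoding, so the choices here (180 one-degree bins over [-90, 90), treated as periodic) and all function names are illustrative assumptions rather than the paper's design.

```python
def angle_to_class(angle_deg, num_bins=180):
    """Map an angle in degrees to a class index.

    Assumption: box orientation is periodic with period 180 degrees, so
    angles are wrapped into [-90, 90) before binning.
    """
    bin_width = 180.0 / num_bins
    wrapped = (angle_deg + 90.0) % 180.0
    # The final modulo guards against float rounding pushing the
    # wrapped value up to exactly 180.0.
    return int(wrapped // bin_width) % num_bins


def class_to_angle(index, num_bins=180):
    """Return the centre angle (degrees) of a class bin."""
    bin_width = 180.0 / num_bins
    return -90.0 + (index + 0.5) * bin_width


def circular_class_distance(i, j, num_bins=180):
    """Distance between two angle classes on the periodic class ring."""
    d = abs(i - j) % num_bins
    return min(d, num_bins - d)
```

Under these assumptions, the boundary case is visible directly: angles of -90 and 89.5 degrees differ by 179.5 in a naive regression target, yet `circular_class_distance(angle_to_class(-90), angle_to_class(89.5))` places them in adjacent classes.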