Inshore ship detection plays an important role in many civilian and military applications. The complex land environment and the diversity of target sizes and distributions make it still challenging for us to obtain accurate detection results. In order to achieve precise localization and suppress false alarms, in this paper, we propose a framework which integrates a multi-scale feature fusion network, rotation region proposal network and contextual pooling together. Specifically, in order to describe ships of various sizes, different convolutional layers are fused to obtain multi-scale features based on the baseline feature extraction network. Then, for the purpose of accurate target localization and arbitrary-oriented ship detection, a rotation region proposal network and skew non-maximum suppression are employed. Finally, on account of the disadvantages that the employment of a rotation bounding box usually causes more false alarms, we implement inclined context feature pooling on rotation region proposals. A dataset including port images collected from Google Earth and a public ship dataset HRSC2016 are employed in our experiments to test the proposed method. Experimental results of model analysis validate the contribution of each module mentioned above, and contrast results show that our proposed pipeline is able to achieve state-of-the-art performance of arbitrary-oriented inshore ship detection.
Semantic segmentation is a fundamental task in remote sensing image interpretation, which aims to assign a semantic label for every pixel in the given image. Accurate semantic segmentation is still challenging due to the complex distributions of various ground objects. With the development of deep learning, a series of segmentation networks represented by fully convolutional network (FCN) has made remarkable progress on this problem, but the segmentation accuracy is still far from expectations. This paper focuses on the importance of class-specific features of different land cover objects, and presents a novel end-to-end class-wise processing framework for segmentation. The proposed class-wise FCN (C-FCN) is shaped in the form of an encoder-decoder structure with skip-connections, in which the encoder is shared to produce general features for all categories and the decoder is class-wise to process class-specific features. To be detailed, class-wise transition (CT), class-wise up-sampling (CU), class-wise supervision (CS), and class-wise classification (CC) modules are designed to achieve the class-wise transfer, recover the resolution of class-wise feature maps, bridge the encoder and modified decoder, and implement class-wise classifications, respectively. Class-wise and group convolutions are adopted in the architecture with regard to the control of parameter numbers. The method is tested on the public ISPRS 2D semantic labeling benchmark datasets. Experimental results show that the proposed C-FCN significantly improves the segmentation performances compared with many state-of-the-art FCN-based networks, revealing its potentials on accurate segmentation of complex remote sensing images.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.