Water surface object detection is one of the most significant tasks in autonomous driving and water surface vision applications. To date, existing public large-scale datasets, which are typically collected from websites, do not focus on specific scenarios, and the numbers of images and instances they contain remain low. To accelerate the development of water surface autonomous driving, this paper proposes a large-scale, high-quality annotated benchmark dataset, named the Water Surface Object Detection Dataset (WSODD), to benchmark different water surface object detection algorithms. The proposed dataset consists of 7,467 water surface images captured in different water environments, climate conditions, and shooting times, and comprises 14 common object categories with a total of 21,911 instances. In contrast to existing datasets, WSODD focuses on more specific scenarios. To provide a straightforward architecture with good performance on WSODD, a new object detector, named CRB-Net, is proposed to serve as a baseline. In experiments, CRB-Net was compared with 16 state-of-the-art object detection methods and outperformed all of them in terms of detection precision. We further discuss the effects of dataset diversity (e.g., instance size, lighting conditions), training set size, and dataset design details (e.g., the method of categorization). Cross-dataset validation shows that models trained on WSODD significantly outperform those trained on other relevant datasets and that CRB-Net adapts well across datasets.
For multimodal medical image fusion problems, most existing fusion approaches operate at the pixel level. However, pixel-based fusion methods tend to lose local and spatial information because the relationships between pixels are not considered appropriately, which strongly affects the quality of the fusion results. To address this issue, this paper proposes a region-based multimodal medical image fusion framework built on superpixel segmentation and a post-processing optimization method. In this framework, an average image of the source medical images is first obtained by a weighted averaging method. To effectively obtain homogeneous regions and preserve complete image details, the fast linear spectral clustering (LSC) superpixel algorithm is applied to segment the average image and obtain superpixel labels. For each region of the medical images, a log-Gabor filter (LGF) and the sum-modified-Laplacian (SML) are adopted to extract texture and contrast features for measuring region importance. The most important regions are selected by comparison, and a decision map is generated. Moreover, to obtain a more accurate decision map, a new post-processing optimization method based on a genetic algorithm (GA) is given: a weighted strategy is applied to the extracted features, and the weighting factors are adaptively adjusted by the GA. The effectiveness of the proposed fusion method is validated through experiments on eight pairs of medical images from diverse modalities, and seven other mainstream medical image fusion methods are adopted for performance comparison. Experimental results, in terms of both qualitative and quantitative evaluation, demonstrate that the proposed method achieves state-of-the-art performance for multimodal medical image fusion problems.

INDEX TERMS Multimodal medical image fusion, superpixel segmentation, genetic algorithm, log-Gabor filter, sum-modified-Laplacian.
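The sum-modified-Laplacian contrast measure used for region importance can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function names and the per-region summation are assumptions made for clarity.

```python
import numpy as np

def modified_laplacian(img, step=1):
    """Per-pixel modified Laplacian:
    |2I(x,y) - I(x-s,y) - I(x+s,y)| + |2I(x,y) - I(x,y-s) - I(x,y+s)|,
    computed with edge padding so the output matches the input shape."""
    img = img.astype(np.float64)
    p = np.pad(img, step, mode="edge")
    c = p[step:-step, step:-step]  # center pixels
    return (np.abs(2 * c - p[:-2 * step, step:-step] - p[2 * step:, step:-step])
            + np.abs(2 * c - p[step:-step, :-2 * step] - p[step:-step, 2 * step:]))

def region_sml(img, labels):
    """Sum the modified Laplacian inside each superpixel region,
    giving one contrast score per region label (higher = more detail)."""
    ml = modified_laplacian(img)
    return {lab: float(ml[labels == lab].sum()) for lab in np.unique(labels)}
```

In a region-based framework such as the one described, these per-region SML scores would be compared across source images (together with a texture feature) to decide which source contributes each region.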
To achieve better performance in multifocus image fusion problems, a new regional approach based on superpixels and superpixel-based mean filtering is proposed in this paper. First, a fast and effective segmentation method is adopted to generate the superpixels over a clarity-enhanced average image. By averaging the clarity information in each superpixel, we make the initial decision map of fusion by regionally selecting sharper superpixels in different source images. Then a novel superpixel-based mean filtering technique is introduced to make full use of spatial consistency in images and the final post-processed decision map is produced. The fused image is constructed by selecting pixels from different source images according to the final decision map. Experimental results demonstrate the proposed method's competitive performance in comparison with state-of-the-art multifocus image fusion approaches.
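The regional selection step described above — averaging a clarity measure inside each superpixel and taking each region from the sharper source — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the clarity maps, label array, and function names are hypothetical, and the post-processing (superpixel-based mean filtering) is omitted.

```python
import numpy as np

def region_decision_map(clarity_a, clarity_b, labels):
    """For each superpixel label, compare the mean clarity of the two
    sources; return a boolean map (True -> take pixels from source A)."""
    decision = np.zeros(labels.shape, dtype=bool)
    for lab in np.unique(labels):
        mask = labels == lab
        decision[mask] = clarity_a[mask].mean() >= clarity_b[mask].mean()
    return decision

def fuse(img_a, img_b, decision):
    """Build the fused image by picking each pixel from the source
    selected in the decision map."""
    return np.where(decision, img_a, img_b)
```

In the full method, the initial decision map produced this way would be refined by the superpixel-based mean filtering step before the final fusion.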