Water surface object detection is one of the most significant tasks in autonomous driving and water surface vision applications. To date, existing public large-scale datasets, mostly collected from websites, do not focus on specific water surface scenarios, and the number of images and instances they contain remains small. To accelerate the development of water surface autonomous driving, this paper proposes a large-scale, high-quality annotated benchmark dataset, named the Water Surface Object Detection Dataset (WSODD), to benchmark different water surface object detection algorithms. The proposed dataset consists of 7,467 water surface images captured in different water environments, climate conditions, and shooting times, and comprises 14 common object categories with a total of 21,911 instances. WSODD also places particular emphasis on specific scenarios. To provide a straightforward architecture with strong performance on WSODD, a new object detector, named CRB-Net, is proposed to serve as a baseline. In experiments, CRB-Net was compared with 16 state-of-the-art object detection methods and outperformed all of them in terms of detection precision. This paper further discusses the effects of dataset diversity (e.g., instance size, lighting conditions), training set size, and dataset design details (e.g., the method of categorization). Cross-dataset validation shows that WSODD yields significantly better results than other relevant datasets and that CRB-Net adapts well across datasets.
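As background on the evaluation, detection precision on benchmarks such as WSODD is conventionally based on matching predicted boxes to ground-truth boxes by intersection-over-union (IoU). The sketch below is a minimal, generic illustration of this standard measure, not the paper's evaluation code; the 0.5 threshold mentioned in the comment is a common convention, not a stated choice of the authors.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

# A prediction typically counts as a true positive when its IoU with an
# unmatched ground-truth box exceeds a threshold (commonly 0.5).
print(iou((10, 10, 50, 50), (30, 30, 70, 70)))  # ~0.143
```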
Epilepsy is a chronic brain disease that causes persistent and severe damage to the physical and mental health of patients. Reliable daily prediction of epileptic seizures is crucial for patients, especially those with refractory epilepsy. At present, many deep learning algorithms, such as Convolutional Neural Networks and Recurrent Neural Networks, have been used to predict epileptic seizures and have outperformed traditional machine learning methods. However, these methods usually transform the electroencephalogram (EEG) signal into a Euclidean grid structure. This conversion loses adjacent spatial information, so the resulting deep learning models require more storage and computation to fuse information after feature extraction. This study proposes a general Graph Convolutional Network (GCN) architecture for seizure prediction that addresses the oversized-model problem by exploiting the graph structure of EEG signals. Formulated as a graph classification task, the architecture comprises graph convolution layers that extract node features from one-hop neighbors, pooling layers that summarize abstract node features, and fully connected layers that perform classification, yielding superior prediction performance with a smaller network size. Experiments show that the model achieves an average sensitivity of 96.51%, an average AUC of 0.92, and a model size of 15.5 k on 18 patients in the CHB-MIT scalp EEG dataset. Compared with traditional deep learning methods, which require many parameters, heavy computation, and substantial storage and energy, this method is better suited to compact, low-power wearable devices and offers a standard process for building generic, low-consumption graph network models for similar biomedical signals. Furthermore, the edge features of the graphs can support a preliminary determination of discharge locations and types, making the model more clinically interpretable.
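To make the described pipeline concrete (graph convolution over one-hop neighbors, pooling that summarizes node features, and a fully connected classifier), here is a minimal Kipf-and-Welling-style GCN sketch in PyTorch. It is an illustration under assumptions, not the paper's model: the layer widths, the 18-node electrode graph, the per-node feature count, and the thresholded random adjacency are all placeholders.

```python
import torch
import torch.nn as nn

class TinySeizureGCN(nn.Module):
    """Minimal sketch: one-hop graph convolutions, mean pooling, FC classifier.

    Assumes each EEG window is a graph whose nodes are electrodes; all
    sizes here are illustrative, not the paper's configuration.
    """
    def __init__(self, in_feats=8, hidden=16, n_classes=2):
        super().__init__()
        self.w1 = nn.Linear(in_feats, hidden)   # graph conv layer 1 weights
        self.w2 = nn.Linear(hidden, hidden)     # graph conv layer 2 weights
        self.fc = nn.Linear(hidden, n_classes)  # classification head

    @staticmethod
    def normalize(adj):
        # Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        return d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)

    def forward(self, x, adj):
        a = self.normalize(adj)
        x = torch.relu(a @ self.w1(x))   # aggregate one-hop neighbor features
        x = torch.relu(a @ self.w2(x))
        g = x.mean(dim=0)                # pooling: summarize node features
        return self.fc(g)                # graph-level class logits

# Example: 18 electrode nodes with 8 features each and a random,
# symmetrized adjacency standing in for channel connectivity.
x = torch.randn(18, 8)
adj = (torch.rand(18, 18) > 0.7).float()
adj = ((adj + adj.t()) > 0).float()
logits = TinySeizureGCN()(x, adj)
```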
For multimodal medical image fusion problems, most existing approaches operate at the pixel level. However, pixel-based fusion tends to lose local and spatial information because the relationships between pixels are not considered appropriately, which strongly affects the quality of the fusion results. To address this issue, this paper proposes a region-based multimodal medical image fusion framework built on superpixel segmentation and a post-processing optimization method. In this framework, an average image of the source medical images is first obtained by weighted averaging. To obtain homogeneous regions while preserving complete image details, the fast linear spectral clustering (LSC) superpixel algorithm segments the average image and produces superpixel labels. For each region of the medical images, a log-Gabor filter (LGF) and the sum-modified-Laplacian (SML) are adopted to extract texture and contrast features, respectively, for measuring region importance. The most important regions are selected by comparison to generate the decision map. Moreover, to obtain a more accurate decision map, a new post-processing optimization method based on a genetic algorithm (GA) is presented: a weighted combination of the extracted features is used, and the weighting factors are adaptively adjusted by the GA. The effectiveness of the proposed fusion method is validated by experiments on eight pairs of medical images from diverse modalities, with seven other mainstream medical image fusion methods adopted for comparison. Qualitative and quantitative results demonstrate that the proposed method achieves state-of-the-art performance for multimodal medical image fusion problems.
INDEX TERMS Multimodal medical image fusion, superpixel segmentation, genetic algorithm, log-Gabor filter, sum-modified-Laplacian.
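As an illustration of the contrast measure, the sketch below computes the standard modified-Laplacian response and averages it per superpixel. The per-region mean (rather than a fixed summation window) and the function names are illustrative assumptions, not the paper's exact SML configuration.

```python
import numpy as np

def modified_laplacian(img):
    """Pointwise modified Laplacian: |2I - left - right| + |2I - up - down|."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    c = p[1:-1, 1:-1]
    return (np.abs(2 * c - p[1:-1, :-2] - p[1:-1, 2:]) +
            np.abs(2 * c - p[:-2, 1:-1] - p[2:, 1:-1]))

def sml_per_region(img, labels):
    """Mean modified-Laplacian response inside each superpixel label."""
    ml = modified_laplacian(img)
    n = labels.max() + 1
    sums = np.bincount(labels.ravel(), weights=ml.ravel(), minlength=n)
    counts = np.bincount(labels.ravel(), minlength=n)
    return sums / np.maximum(counts, 1)

# Region-wise decision: pick, for each superpixel, the source whose
# region scores higher; 0 selects source A, 1 selects source B.
# decision = sml_per_region(img_b, labels) > sml_per_region(img_a, labels)
```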
To achieve better performance on multifocus image fusion problems, this paper proposes a new regional approach based on superpixels and superpixel-based mean filtering. First, a fast and effective segmentation method generates superpixels over a clarity-enhanced average image. By averaging the clarity information within each superpixel, an initial fusion decision map is made by regionally selecting the sharper superpixels from the different source images. Then, a novel superpixel-based mean filtering technique is introduced to make full use of spatial consistency in the images, producing the final post-processed decision map. The fused image is constructed by selecting pixels from the source images according to this final decision map. Experimental results demonstrate the proposed method's competitive performance in comparison with state-of-the-art multifocus image fusion approaches.
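The region-wise selection step can be sketched as follows: the mean clarity within each superpixel decides which source supplies that region. The clarity maps, grayscale inputs, and function names are assumptions for illustration; the paper's superpixel-based mean filtering of the decision map is omitted here.

```python
import numpy as np

def fuse_by_superpixel(img_a, img_b, clarity_a, clarity_b, labels):
    """Select, per superpixel, all pixels from the sharper source image.

    Assumes grayscale images, per-pixel clarity maps, and an integer
    superpixel label map of the same shape (all hypothetical inputs).
    """
    n = labels.max() + 1
    flat = labels.ravel()
    counts = np.maximum(np.bincount(flat, minlength=n), 1)
    mean_a = np.bincount(flat, weights=clarity_a.ravel(), minlength=n) / counts
    mean_b = np.bincount(flat, weights=clarity_b.ravel(), minlength=n) / counts
    # Initial decision map: True where source B's superpixel is sharper.
    decision = (mean_b > mean_a)[labels]
    return np.where(decision, img_b, img_a)
```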