Due to the complexity of medical imaging techniques and the high heterogeneity of glioma surfaces, image segmentation of human gliomas is one of the most challenging tasks in medical image analysis. Current methods based on convolutional neural networks concentrate on feature extraction while ignoring the correlation between local and global. In this paper, we propose a residual mix transformer fusion net, namely RMTF-Net, for brain tumor segmentation. In the feature encoder, a residual mix transformer encoder including a mix transformer and a residual convolutional neural network (RCNN) is proposed. The mix transformer gives an overlapping patch embedding mechanism to cope with the loss of patch boundary information. Moreover, a parallel fusion strategy based on RCNN is utilized to obtain local–global balanced information. In the feature decoder, a global feature integration (GFI) module is applied, which can enrich the context with the global attention feature. Extensive experiments on brain tumor segmentation from LGG, BraTS2019 and BraTS2020 demonstrated that our proposed RMTF-Net is superior to existing state-of-art methods in subjective visual performance and objective evaluation.
Medical image segmentation has long been a compelling and fundamental problem in the realm of neuroscience. This is an extremely challenging task due to the intensely interfering irrelevant background information to segment the target. State-of-the-art methods fail to consider simultaneously addressing both long-range and short-range dependencies, and commonly emphasize the semantic information characterization capability while ignoring the geometric detail information implied in the shallow feature maps resulting in the dropping of crucial features. To tackle the above problem, we propose a Global-Local representation learning net for medical image segmentation, namely GL-Segnet. In the Feature encoder, we utilize the Multi-Scale Convolution (MSC) and Multi-Scale Pooling (MSP) modules to encode the global semantic representation information at the shallow level of the network, and multi-scale feature fusion operations are applied to enrich local geometric detail information in a cross-level manner. Beyond that, we adopt a global semantic feature extraction module to perform filtering of irrelevant background information. In Attention-enhancing Decoder, we use the Attention-based feature decoding module to refine the multi-scale fused feature information, which provides effective cues for attention decoding. We exploit the structural similarity between images and the edge gradient information to propose a hybrid loss to improve the segmentation accuracy of the model. Extensive experiments on medical image segmentation from Glas, ISIC, Brain Tumors and SIIM-ACR demonstrated that our GL-Segnet is superior to existing state-of-art methods in subjective visual performance and objective evaluation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.