Volumetric memory network for interactive medical image segmentation

Zhou, Tianfei; Li, Liulei; Bredell, Gustav; Li, Jianwu; Konukoğlu, Ender

doi:10.1016/j.media.2022.102599

Cited by 57 publications

(16 citation statements)

References 34 publications

“…ResUNet++ also uses a conditional random field (CRF) and test time augmentation (TTA) for better prediction efficiency. Along these lines, many studies have been conducted on medical image segmentation based on deep learning [40, 41, 42, 43, 44, 45]. In the present work, the proposed structure is designed to improve a U-Net model by adding residual modules.…”

Section: Related Work (mentioning)

confidence: 99%

Retinal Vascular Image Segmentation Using Improved UNet Based on Residual Module

et al. 2023

In recent years, deep learning technology for clinical diagnosis has progressed considerably, and the value of medical imaging continues to increase. In the past, clinicians evaluated medical images according to their individual expertise. In contrast, the application of artificial intelligence technology for automatic analysis and diagnostic assistance to support clinicians in evaluating medical information more efficiently has become an important trend. In this study, we propose a machine learning architecture designed to segment images of retinal blood vessels based on an improved U-Net neural network model. The proposed model incorporates a residual module to extract features more effectively, and includes a full-scale skip connection to combine low level details with high-level features at different scales. The results of an experimental evaluation show that the model was able to segment images of retinal vessels accurately. The proposed method also outperformed several existing models on the benchmark datasets DRIVE and ROSE, including U-Net, ResUNet, U-Net3+, ResUNet++, and CaraNet.

“…In the image segmentation field, a lot of emerging technologies have been successively introduced for improvement in recent years, such as graph convolution [33], prototype-based classification [34], and the memory-augmented network [35], while the most commonly used network architecture is still the Encoder-Decoder structure. The encoder progressively enlarges receptive fields to capture sufficient object semantic information, and the decoder is used to recover the spatial size and detail of deeply encoded features for pixel-level predictions.…”

Section: Encoder-decoder Segmentation Model (mentioning)

confidence: 99%

XANet: An Efficient Remote Sensing Image Segmentation Model Using Element-Wise Attention Enhancement and Multi-Scale Attention Fusion

et al. 2022

Massive and diverse remote sensing data provide opportunities for data-driven tasks in the real world, but also present challenges in terms of data processing and analysis, especially pixel-level image interpretation. However, the existing shallow-learning and deep-learning segmentation methods, bounded by their technical bottlenecks, cannot properly balance accuracy and efficiency, and are thus hardly scalable to the practice scenarios of remote sensing in a successful way. Instead of following the time-consuming deep stacks of local operations as most state-of-the-art segmentation networks, we propose a novel segmentation model with the encoder–decoder structure, dubbed XANet, which leverages the more computationally economical attention mechanism to boost performance. Two novel attention modules in XANet are proposed to strengthen the encoder and decoder, respectively, namely the Attention Recalibration Module (ARM) and Attention Fusion Module (AFM). Unlike current attention modules, which only focus on elevating the feature representation power, and regard the spatial and channel enhancement of a feature map as two independent steps, ARM gathers element-wise semantic descriptors coupling spatial and channel information to directly generate a 3D attention map for feature enhancement, and AFM innovatively utilizes the cross-attention mechanism for the sufficient spatial and channel fusion of multi-scale features. Extensive experiments were conducted on ISPRS and GID datasets to comprehensively analyze XANet and explore the effects of ARM and AFM. Furthermore, the results demonstrate that XANet surpasses other state-of-the-art segmentation methods in both model performance and efficiency, as ARM yields a superior improvement versus existing attention modules with a competitive computational overhead, and AFM achieves the complementary advantages of multi-level features under the sufficient consideration of efficiency.

“…Long-range dependency modeling has been extensively studied in many fields, such as video segmentation [23] and image segmentation [24]. The self-attention module is one of the first to model pairwise long-range relations.…”

Section: Pairwise Long-range Dependency Modeling (mentioning)

confidence: 99%

“…In particular, in standard convolution, the dilation rate r = 1. As shown in Figure 3, we apply multi-scale dilation convolution with four branches with rates (r = 6,12,18,24). Each of them has padding = r and stride = 1 to maintain the resolution of the input feature map.…”

Section: Local Context Refinement Module (mentioning)

confidence: 99%

Single-Shot Global and Local Context Refinement Neural Network for Head Detection

Hu, Yang 2022

Future Internet

Head detection is a fundamental task, and it plays an important role in many head-related problems. The difficulty in creating the local and global context in the face of significant lighting, orientation, and occlusion uncertainty, among other factors, still makes this task a remarkable challenge. To tackle these problems, this paper proposes an effective detector, the Context Refinement Network (CRN), that captures not only the refined global context but also the enhanced local context. We use simplified non-local (SNL) blocks at hierarchical features, which can successfully establish long-range dependencies between heads to improve the capability of building the global context. We suggest a multi-scale dilated convolutional module for the local context surrounding heads that extracts local context from various head characteristics. In comparison to other models, our method outperforms them on the Brainwash and the HollywoodHeads datasets.
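
The citation statement from the Local Context Refinement Module section says that each dilated branch uses padding = r and stride = 1 to keep the input resolution. The short check below is a minimal sketch of why that holds, using the standard convolution output-size formula. The 3x3 kernel size is an assumption (the quoted text does not state it), and the function names are hypothetical, chosen only for illustration.

```python
def conv_output_size(size: int, kernel: int, dilation: int,
                     padding: int, stride: int = 1) -> int:
    """Output length along one spatial axis for a (dilated) convolution."""
    # A dilated kernel spans dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def branch_output_sizes(size: int, rates=(6, 12, 18, 24), kernel: int = 3) -> dict:
    """Output size per branch when padding equals the dilation rate.

    kernel=3 is an assumption; the rates are the ones quoted in the statement.
    """
    return {r: conv_output_size(size, kernel, dilation=r, padding=r)
            for r in rates}


if __name__ == "__main__":
    # Each branch should report the same size as the input feature map.
    print(branch_output_sizes(64))
```

With a kernel of 3, the effective kernel width is 2r + 1, so the padding of r on each side cancels it exactly and every branch returns a map of the input size. With a different kernel size, padding = r would no longer preserve the resolution.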
