3D Anisotropic Hybrid Network: Transferring Convolutional Features from 2D Images to 3D Anisotropic Volumes

Liu, Siqi; Xu, Daguang; Zhou, S. Kevin; Mertelmeier, Thomas; Wicklein, Julia; Jerebko, Anna K.; Grbić, Saša; Pauly, Olivier; Cai, Weidong; Comaniciu, Dorin

doi:10.1007/978-3-030-00934-2_94

Cited by 137 publications

(115 citation statements)

References 18 publications

Supporting

Mentioning

111

Contrasting

Unclassified


“…The first two convolutional layers adopt a kernel size 7×7×1 with stride [2, 2, 1] and 1×1×3 with stride [1, 1, 1]. The overall network architecture is effectively verified by [13] while we add the searching process for color blocks to choose between 2D, 3D, and P3D.…”

Section: Methods (mentioning)

confidence: 99%
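The effect of the two stem layers quoted above on the volume size can be sketched as follows. This is only an illustration: the quote gives the kernel sizes and strides but not the padding, so a padding of `kernel // 2` per axis is assumed, and the 96×96×64 input size is a hypothetical example.

```python
def conv_out(size, kernel, stride, pad):
    """Output length along one axis of a convolution (floor convention)."""
    return (size + 2 * pad - kernel) // stride + 1


def stem_shape(shape):
    """Shape after a 7x7x1 conv (stride [2, 2, 1]) and a 1x1x3 conv (stride [1, 1, 1])."""
    layers = [
        ((7, 7, 1), (2, 2, 1)),
        ((1, 1, 3), (1, 1, 1)),
    ]
    for kernel, stride in layers:
        # Assumption: padding of kernel // 2 on every axis (not stated in the quote).
        shape = tuple(
            conv_out(n, k, s, k // 2) for n, k, s in zip(shape, kernel, stride)
        )
    return shape


# Hypothetical input size; prints (48, 48, 64).
print(stem_shape((96, 96, 64)))
```

Under these assumptions only the in-plane axes are downsampled, while the through-plane axis keeps its length.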

“…3, we define 3 Decoder cells, composed of the 2D Decoder D_0, 3D Decoder D_1, and P3D Decoder D_2. The Decoder cell is defined as dense blocks, which shows powerful representation ability in [8,13]. The input of the b-th Decoder cell is denoted as x_b while the output as x_{b+1}, which is the input of the (b + 1)-th Decoder cell.…”

Section: Decoder Search Space (mentioning)

confidence: 99%

“…However, the demanding computation and high GPU consumption of 3D convolutions limit the depth of neural networks and input volume size, which impedes the massive application of 3D convolutions. Recently, the Pseudo-3D (P3D) [17] was introduced to replace 3D convolution k×k×k with two convolutions, i.e., k×k×1 followed by 1×1×k, which can reduce the number of parameters and show good learning ability in [13,25] on anisotropic medical images. However, all the aforementioned existing works choose the network structure empirically, which often impose explicit constraints, i.e., either 2D, 3D or P3D convolutions only, or 2D and 3D convolutions are separate from each other.…”

Section: Introduction (mentioning)

confidence: 99%
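The parameter saving that the statement above attributes to P3D can be checked with simple arithmetic. The sketch below counts weights only (biases ignored); the channel widths are hypothetical, and the intermediate width `c_mid` between the two factorized convolutions is an assumption, since the quote does not specify it.

```python
def conv3d_params(k, c_in, c_out):
    """Weights in a full k x k x k convolution (bias ignored)."""
    return k * k * k * c_in * c_out


def p3d_params(k, c_in, c_mid, c_out):
    """Weights in a k x k x 1 convolution followed by a 1 x 1 x k convolution."""
    in_plane = k * k * 1 * c_in * c_mid
    through_plane = 1 * 1 * k * c_mid * c_out
    return in_plane + through_plane


# Hypothetical layer: k = 3 with 64 channels everywhere.
full = conv3d_params(3, 64, 64)      # 27 * 64 * 64 = 110592
factored = p3d_params(3, 64, 64, 64)  # (9 + 3) * 64 * 64 = 49152
print(full, factored)
```

With equal channel widths the ratio is (k² + k) / k³, which is 12/27 for k = 3.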


V-NAS: Neural Architecture Search for Volumetric Medical Image Segmentation

Zhu

Liu

Yang

et al. 2019

2019 International Conference on 3D Vision (3DV)

Self Cite

103


Deep learning algorithms, in particular 2D and 3D fully convolutional neural networks (FCNs), have rapidly become the mainstream methodology for volumetric medical image segmentation. However, 2D convolutions cannot fully leverage the rich spatial information along the third axis, while 3D convolutions suffer from the demanding computation and high GPU memory consumption. In this paper, we propose to automatically search the network architecture tailoring to volumetric medical image segmentation problem. Concretely, we formulate the structure learning as differentiable neural architecture search, and let the network itself choose between 2D, 3D or Pseudo-3D (P3D) convolutions at each layer. We evaluate our method on 3 public datasets, i.e., the NIH Pancreas dataset, the Lung and Pancreas dataset from the Medical Segmentation Decathlon (MSD) Challenge. Our method, named V-NAS, consistently outperforms other state-of-the-arts on the segmentation tasks of both normal organ (NIH Pancreas) and abnormal organs (MSD Lung tumors and MSD Pancreas tumors), which shows the power of chosen architecture. Moreover, the searched architecture on one dataset can be well generalized to other datasets, which demonstrates the robustness and practical use of our proposed method.
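The abstract does not say how the per-layer choice between 2D, 3D and P3D is made differentiable. One common relaxation, assumed here and not confirmed by the abstract, is a softmax-weighted mixture of the candidate operations with learnable architecture parameters. In the sketch below the names `mixed_output` and `selected_op` are hypothetical, and the candidate operations are toy scalar functions standing in for real convolutions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(x, candidates, alphas):
    """Softmax-weighted sum of all candidate operations applied to x."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, candidates.values()))


def selected_op(candidates, alphas):
    """After the search, keep the candidate with the largest architecture parameter."""
    names = list(candidates)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]


# Toy stand-ins for the three convolution types (assumption, for illustration only).
candidates = {
    "2D": lambda x: 1.0 * x,
    "3D": lambda x: 2.0 * x,
    "P3D": lambda x: 3.0 * x,
}

print(mixed_output(1.0, candidates, [0.0, 0.0, 0.0]))
print(selected_op(candidates, [0.1, 0.2, 1.5]))
```

During the search the mixture keeps every candidate in the computation graph, so the architecture parameters can be trained by gradient descent alongside the network weights.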


“…Given all pairs of images X and pseudo labels Ŷ, we re-sample them to 1 mm³ isotropic resolution and train an ensemble E of n fully convolutional neural networks to segment the given foreground classes, with P(X) = E(X) standing for the softmax output probability maps for the different classes in the image. Our network architectures follow the encoder-decoder network proposed in [15], named AH-Net, and [5] based on the popular 3D U-Net architecture [3] with residual connections [16], named SegResNet. For training and implementing these neural networks, we used the NVIDIA Clara Train SDK 1 and NVIDIA Tesla V100 GPU with 16 GB memory.…”

Section: Deep Learning Based Segmentation With Noisy Labels (mentioning)

confidence: 99%
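The ensemble output P(X) = E(X) described above can be illustrated for a single voxel. The quote does not state how the n networks are combined, so plain averaging of the per-model softmax probabilities is an assumption here, and `ensemble_probs` is a hypothetical helper name.

```python
import math


def softmax(logits):
    """Numerically stable softmax over the class logits of one voxel."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(logits_per_model):
    """Average the softmax probabilities of n models for one voxel.

    Assumption: the ensemble is an unweighted mean of the members' outputs.
    """
    probs = [softmax(logits) for logits in logits_per_model]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]


# Three hypothetical models, four classes (background plus three structures).
voxel_logits = [
    [2.0, 0.5, 0.1, -1.0],
    [1.5, 1.0, 0.0, -0.5],
    [2.5, 0.2, 0.3, -2.0],
]
p = ensemble_probs(voxel_logits)
predicted_class = max(range(len(p)), key=lambda c: p[c])
print(p, predicted_class)
```

Because each member's output sums to one, the averaged vector is again a valid probability distribution over the classes.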

“…For training and implementing these neural networks, we used the NVIDIA Clara Train SDK 1 and NVIDIA Tesla V100 GPU with 16 GB memory. As in [15], we initialize AH-Net from ImageNet pretrained weights using a ResNet-18 encoder branch, utilizing anisotropic (3×3×1) kernels in the encoder path in order to make use of pretrained weights from 2D computer vision tasks. While the initial weights are learned from 2D, all convolutions are still applied in a full 3D fashion throughout the network, allowing it to efficiently learn 3D features from the image.…”

Section: Deep Learning Based Segmentation With Noisy Labels (mentioning)

confidence: 99%
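The reason a 3×3×1 kernel can reuse 2D pretrained weights is that it acts as the same 2D filter on every slice. The sketch below demonstrates this in pure Python with nested lists. It is not the AH-Net implementation: the helper names, the Sobel-like kernel and the tiny 4×4×2 volume are all assumptions chosen for illustration.

```python
def inflate_2d_kernel(k2d):
    """Turn a kh x kw kernel into a kh x kw x 1 kernel by adding a depth axis."""
    return [[[w] for w in row] for row in k2d]


def conv2d_valid(img, k):
    """Valid (unpadded) 2D cross-correlation; img is indexed as img[y][x]."""
    kh, kw = len(k), len(k[0])
    return [
        [
            sum(img[y + i][x + j] * k[i][j] for i in range(kh) for j in range(kw))
            for x in range(len(img[0]) - kw + 1)
        ]
        for y in range(len(img) - kh + 1)
    ]


def conv3d_valid(vol, k):
    """Valid 3D cross-correlation; vol is indexed as vol[y][x][z]."""
    kh, kw, kd = len(k), len(k[0]), len(k[0][0])
    h, w, d = len(vol), len(vol[0]), len(vol[0][0])
    return [
        [
            [
                sum(
                    vol[y + i][x + j][z + l] * k[i][j][l]
                    for i in range(kh) for j in range(kw) for l in range(kd)
                )
                for z in range(d - kd + 1)
            ]
            for x in range(w - kw + 1)
        ]
        for y in range(h - kh + 1)
    ]


# Stand-in for a pretrained 2D filter (assumption: a Sobel-like kernel).
k2d = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
k3d = inflate_2d_kernel(k2d)

# Hypothetical 4 x 4 x 2 volume with two different slices.
vol = [[[y * x + 7 * z * y for z in range(2)] for x in range(4)] for y in range(4)]
out3d = conv3d_valid(vol, k3d)
```

Each output slice of `out3d` matches the 2D filter applied to the corresponding input slice, which is why the inflated kernel starts from the same features the 2D network learned.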

Cardiac Segmentation of LGE MRI with Noisy Labels

Roth

Zhu

Yang

et al. 2020

Statistical Atlases and Computational Models of the Heart. Multi-Sequence CMR Segmentation, CRT-EPiggy and LV Full Quantificati

Self Cite


In this work, we attempt the segmentation of cardiac structures in late gadolinium-enhanced (LGE) magnetic resonance images (MRI) using only minimal supervision in a two-step approach. In the first step, we register a small set of five LGE cardiac magnetic resonance (CMR) images with ground truth labels to a set of 40 target LGE CMR images without annotation. Each manually annotated ground truth provides labels of the myocardium and the left ventricle (LV) and right ventricle (RV) cavities, which are used as atlases. After multi-atlas label fusion by majority voting, we possess noisy labels for each of the targeted LGE images. A second set of manual labels exists for 30 patients of the target LGE CMR images, but are annotated on different MRI sequences (bSSFP and T2-weighted). Again, we use multi-atlas label fusion with a consistency constraint to further refine our noisy labels if additional annotations in other modalities are available for a given patient. In the second step, we train a deep convolutional network for semantic segmentation on the target data while using data augmentation techniques to avoid over-fitting to the noisy labels. After inference and simple post-processing, we achieve our final segmentation for the targeted LGE CMR images, resulting in an average Dice of 0.890, 0.780, and 0.844 for LV cavity, LV myocardium, and RV cavity, respectively.
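Two quantities in this abstract lend themselves to a short sketch: label fusion by majority voting and the Dice score. The code works on flattened label lists and is only an illustration. The tie-breaking rule (smallest label wins) and the convention that Dice is 1.0 when both masks are empty are assumptions not taken from the abstract, and the function names are hypothetical.

```python
from collections import Counter


def majority_vote(atlas_labels):
    """Fuse per-voxel labels from several registered atlases by majority voting.

    atlas_labels: list of equally long label lists, one per atlas.
    Assumption: ties are resolved in favour of the smallest label value.
    """
    fused = []
    for votes in zip(*atlas_labels):
        counts = Counter(votes)
        fused.append(max(counts, key=lambda label: (counts[label], -label)))
    return fused


def dice(pred, truth, label):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for one label."""
    a = [p == label for p in pred]
    b = [t == label for t in truth]
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if size == 0 else 2.0 * intersection / size


# Three hypothetical atlases voting on three voxels (0 = background, 1 = LV cavity).
atlases = [
    [0, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
]
noisy = majority_vote(atlases)
print(noisy, dice(noisy, [0, 1, 1], label=1))
```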
