“…At the last level, a 7-layer dilated convolutional network [8] is trained with a kernel size of 3×3×3. From first to last, the channel outputs and dilation coefficients are, respectively, 128, 128, 128, 96, 64, 32, 3 and 1, 2, 4, 8, 16, 1, 1.…”
Section: Initial and Intermediate Field Estimator
Unsupervised learning-based medical image registration approaches have witnessed rapid development in recent years. We propose to revisit a commonly ignored yet simple and well-established principle: recursive refinement of deformation vector fields across scales. We introduce a recursive refinement network (RRN) for unsupervised medical image registration, which extracts multi-scale features, constructs a normalized local cost correlation volume, and recursively refines volumetric deformation vector fields. RRN achieves state-of-the-art performance for 3D registration of expiratory-inspiratory pairs of CT lung scans. On the DirLab COPDGene dataset, RRN returns an average Target Registration Error (TRE) of 0.83 mm, which corresponds to a 13% error reduction from the best result presented in the leaderboard. In addition to the comparison with conventional methods, RRN leads to an 89% error reduction compared to deep-learning-based peer approaches. Code: https://github.com/Novestars/Recursive Refinement Network.
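The layer schedule quoted above (kernel 3 per axis, dilations 1, 2, 4, 8, 16, 1, 1) can be sanity-checked with a quick receptive-field calculation. The sketch below is a minimal pure-Python illustration that assumes stride-1 convolutions, as is standard for such refinement heads; it is not taken from the paper's code.

```python
# Receptive-field check for the 7-layer dilated head described above.
# Assumes stride-1 convolutions with kernel size 3 along each axis.
dilations = [1, 2, 4, 8, 16, 1, 1]
kernel = 3

def receptive_field(kernel, dilations):
    """Per-axis receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += d * (kernel - 1)  # each layer widens the span by d * (k - 1)
    return rf

print(receptive_field(kernel, dilations))  # 67 voxels per axis
```

The exponentially growing dilations (1 through 16) let seven thin layers cover a 67-voxel context window per axis, which would be far more expensive to reach with undilated 3×3×3 layers.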
“…Many models [15,14,17,7,18,5,1,3] have boosted the performance of semantic segmentation networks. These gains are mainly attributed to the use of pre-trained models, dilated convolutional layers [7,13] and fully convolutional architectures (DCNN) [19]. These works employ a range of strategies to tap contextual information, which fall into three major categories.…”
Section: Related Work
confidence: 99%
“…Context Aggregation Modules: These architectures place a special module on top of a pre-trained network that integrates context information at different distance scales. The development of a fast and efficient algorithm for Dense-CRF [6] led to numerous algorithms [7,8,9,10] incorporating it on top of the output belief map. Moreover, the joint training of CRF and CNN parameters was made possible by [11,12].…”
Section: Related Work
confidence: 99%
“…However, this is achieved at the cost of computationally expensive and memory-intensive training/inference. The second related approach is to produce auxiliary context aggregation blocks [6,7,8,9,10,11,12,13,14,15] that contain features at different distance scales, and then merge these blocks to produce a final segmentation. This category includes many well-known techniques such as dense CRF [6] (conditional random fields) and spatial pyramid pooling [7].…”
Section: Introduction
confidence: 99%
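Spatial pyramid pooling, mentioned in the excerpt above, merges features pooled at several grid resolutions. The toy below sketches the idea in 1-D with hypothetical helper names (`avg_pool_to`, `spp`); real implementations operate on 2-D or 3-D feature maps, but the principle of concatenating pools at multiple scales is the same.

```python
def avg_pool_to(x, bins):
    """Average-pool a 1-D feature vector into a fixed number of bins."""
    n = len(x)
    return [sum(x[b * n // bins:(b + 1) * n // bins]) /
            len(x[b * n // bins:(b + 1) * n // bins])
            for b in range(bins)]

def spp(x, levels=(1, 2, 4)):
    """Concatenate average pools at several grid sizes (the pyramid)."""
    feats = []
    for b in levels:
        feats.extend(avg_pool_to(x, b))
    return feats

# Coarse levels summarize global context, fine levels keep local detail.
print(spp([1.0, 2.0, 3.0, 4.0]))  # [2.5, 1.5, 3.5, 1.0, 2.0, 3.0, 4.0]
```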
“…Image classification networks are parameter heavy (44.5M parameters for ResNet-101), and segmentation methods built on top of these classification networks are often even more burdensome. For example, on top of the ResNet-101 architecture, PSPNet [14] uses 22M additional parameters for context aggregation, while the ASPP and Cascade versions of the Deeplab network utilize 14.5M [7] and 40M [15] additional parameters, respectively.…”
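To put the quoted parameter counts in perspective, the snippet below computes each context-aggregation head's overhead relative to the 44.5M-parameter ResNet-101 backbone; the figures are taken from the excerpt, while the helper name `overhead` is illustrative.

```python
BACKBONE = 44.5e6  # ResNet-101 parameter count, as quoted above

def overhead(extra_params, backbone=BACKBONE):
    """Extra head parameters as a fraction of the backbone size."""
    return extra_params / backbone

heads = {"PSPNet": 22e6, "Deeplab-ASPP": 14.5e6, "Deeplab-Cascade": 40e6}
for name, p in heads.items():
    print(f"{name}: +{overhead(p):.0%} of the backbone")
# PSPNet: +49%, Deeplab-ASPP: +33%, Deeplab-Cascade: +90%
```

The heads alone add between a third and nearly the whole size of the backbone, which is the cost SUNets aim to avoid.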
Many imaging tasks require global information about all pixels in an image. Conventional bottom-up classification networks globalize information by decreasing resolution; features are pooled and downsampled into a single output. But for semantic segmentation and object detection tasks, a network must provide higher-resolution pixel-level outputs. To globalize information while preserving resolution, many researchers propose the inclusion of sophisticated auxiliary blocks, but these come at the cost of a considerable increase in network size and computational cost. This paper proposes stacked u-nets (SUNets), which iteratively combine features from different resolution scales while maintaining resolution. SUNets leverage the information globalization power of u-nets in a deeper network architecture that is capable of handling the complexity of natural images. SUNets perform extremely well on semantic segmentation tasks using a small number of parameters. The code is available at https://github.com/shahsohil/sunets.
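The u-net principle that SUNets stack, pool to a coarse scale and then upsample and merge so the output keeps full resolution, can be sketched with a 1-D pure-Python toy. All helper names here (`downsample`, `upsample`, `tiny_unet`) are hypothetical and merely illustrate the mechanism, not the paper's implementation.

```python
def downsample(x):
    """Average adjacent pairs (assumes even length): globalizes information."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def upsample(x):
    """Nearest-neighbour repeat back to double length: restores resolution."""
    return [v for v in x for _ in range(2)]

def tiny_unet(x):
    coarse = downsample(x)                    # each value summarizes 2 inputs
    fused = upsample(coarse)                  # back to the input resolution
    return [a + b for a, b in zip(x, fused)]  # skip connection merges scales

print(tiny_unet([1.0, 3.0, 2.0, 4.0]))  # [3.0, 5.0, 5.0, 7.0]
```

Each output value mixes its own input with a coarse summary of its neighbourhood, so the result is both globalized and full-resolution; stacking several such units deepens the mixing without shrinking the output.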
Automatic and accurate instance segmentation of teeth can provide important support for computer-aided orthodontic work. Traditional methods for tooth segmentation often ignore the rich structural features of teeth, and capturing the complete, accurate geometry and morphological details of a single tooth remains a challenge for current tooth segmentation studies. In this article, a new tooth segmentation deep-learning network based on capturing dependencies and receptive field adjustment in cone beam computed tomography (CBCT) is proposed to achieve automatic and accurate instance segmentation of dental CBCT data. In the first stage, the method acquires coarse-level tooth features and accurate tooth centroids, providing instance information and spatial localization for each tooth. The encoding process in the second stage of the network introduces a guidance module, based on a 3D self-attention mechanism, for obtaining tooth geometry information and capturing dependencies in CBCT. The proposed tooth feature integration module is based on multiscale fusion of dilated convolutions to capture detailed tooth information at multiple scales, and the network's receptive field is adjusted accordingly. Extensive evaluation, ablation, and comparison experiments demonstrate that our method exhibits state-of-the-art segmentation performance and accurate instance segmentation results, reflecting its potential applicability in clinical medicine.
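The multiscale fusion of dilated convolutions described above can be illustrated with a 1-D pure-Python toy: run the same kernel at several dilation rates, align the branch outputs, and sum them. The helper names (`dilated_conv1d`, `multiscale_fuse`) are hypothetical and stand in for the article's 3-D module, which this sketch does not reproduce.

```python
def dilated_conv1d(x, w, d):
    """Valid 1-D convolution of x with kernel w at dilation d."""
    span = d * (len(w) - 1)
    return [sum(w[k] * x[i + k * d] for k in range(len(w)))
            for i in range(len(x) - span)]

def center_crop(x, n):
    """Trim a branch output to length n, keeping the centre."""
    off = (len(x) - n) // 2
    return x[off:off + n]

def multiscale_fuse(x, w, dilations=(1, 2)):
    """Sum aligned branch outputs; each dilation sees a different scale."""
    branches = [dilated_conv1d(x, w, d) for d in dilations]
    n = min(len(b) for b in branches)
    return [sum(vals) for vals in zip(*(center_crop(b, n) for b in branches))]

print(multiscale_fuse(list(range(8)), [1, 1, 1]))  # [12, 18, 24, 30]
```

The d=1 branch responds to fine local structure while the d=2 branch spans a wider context with the same three weights; fusing them gives the module detail at multiple scales, which is the stated goal of the tooth feature integration module.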