“…At the last level, a 7-layer dilated convolutional network [8] is trained with a kernel size of 3×3×3. From first to last, the channel outputs and dilation coefficients are, respectively, 128, 128, 128, 96, 64, 32, 3 and 1, 2, 4, 8, 16, 1, 1.…”
Section: Initial and Intermediate Field Estimator
Unsupervised learning-based medical image registration approaches have witnessed rapid development in recent years. We propose to revisit a commonly ignored yet simple and well-established principle: recursive refinement of deformation vector fields across scales. We introduce a recursive refinement network (RRN) for unsupervised medical image registration, which extracts multi-scale features, constructs a normalized local cost correlation volume, and recursively refines volumetric deformation vector fields. RRN achieves state-of-the-art performance for 3D registration of expiratory-inspiratory pairs of CT lung scans. On the DirLab COPDGene dataset, RRN returns an average Target Registration Error (TRE) of 0.83 mm, which corresponds to a 13% error reduction from the best result presented in the leaderboard. In addition to the comparison with conventional methods, RRN leads to an 89% error reduction compared to deep-learning-based peer approaches. Code: https://github.com/Novestars/Recursive Refinement Network.
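The layer schedule quoted above (kernel 3 per axis, dilations 1, 2, 4, 8, 16, 1, 1) can be sanity-checked with a quick receptive-field calculation. The sketch below is a minimal pure-Python illustration that assumes stride-1 convolutions, as is standard for such refinement heads; it is not taken from the paper's code.

```python
# Receptive-field check for the 7-layer dilated head described above.
# Assumes stride-1 convolutions with kernel size 3 along each axis.
dilations = [1, 2, 4, 8, 16, 1, 1]
kernel = 3

def receptive_field(kernel, dilations):
    """Per-axis receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += d * (kernel - 1)  # each layer widens the span by d * (k - 1)
    return rf

print(receptive_field(kernel, dilations))  # 67 voxels per axis
```

The exponentially growing dilations (1 through 16) let seven thin layers cover a 67-voxel context window per axis, which would be far more expensive to reach with undilated 3×3×3 layers.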
“…Many models [15,14,17,7,18,5,1,3] have boosted the performance of semantic segmentation networks. These gains are mainly attributed to the use of pre-trained models, dilated convolutional layers [7,13] and fully convolutional architectures (DCNN) [19]. These works employ a range of strategies to tap contextual information, which fall into three major categories.…”
Section: Related Work
confidence: 99%
“…Context Aggregation Modules: These architectures place a special module on top of a pre-trained network that integrates context information at different distance scales. The development of a fast and efficient algorithm for Dense-CRF [6] led to numerous algorithms [7,8,9,10] incorporating it on top of the output belief map. Moreover, the joint training of CRF and CNN parameters was made possible by [11,12].…”
Section: Related Work
confidence: 99%
“…However, this is achieved at the cost of computationally expensive and memory-intensive training/inference. The second related approach is to produce auxiliary context aggregation blocks [6,7,8,9,10,11,12,13,14,15] that contain features at different distance scales, and then merge these blocks to produce a final segmentation. This category includes many well-known techniques such as dense CRF [6] (conditional random fields) and spatial pyramid pooling [7].…”
Section: Introduction
confidence: 99%
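Spatial pyramid pooling, mentioned in the excerpt above, merges features pooled at several grid resolutions. The toy below sketches the idea in 1-D with hypothetical helper names (`avg_pool_to`, `spp`); real implementations operate on 2-D or 3-D feature maps, but the principle of concatenating pools at multiple scales is the same.

```python
def avg_pool_to(x, bins):
    """Average-pool a 1-D feature vector into a fixed number of bins."""
    n = len(x)
    return [sum(x[b * n // bins:(b + 1) * n // bins]) /
            len(x[b * n // bins:(b + 1) * n // bins])
            for b in range(bins)]

def spp(x, levels=(1, 2, 4)):
    """Concatenate average pools at several grid sizes (the pyramid)."""
    feats = []
    for b in levels:
        feats.extend(avg_pool_to(x, b))
    return feats

# Coarse levels summarize global context, fine levels keep local detail.
print(spp([1.0, 2.0, 3.0, 4.0]))  # [2.5, 1.5, 3.5, 1.0, 2.0, 3.0, 4.0]
```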
“…Image classification networks are parameter heavy (44.5M parameters for ResNet-101), and segmentation methods built on top of these classification networks are often even more burdensome. For example, on top of the ResNet-101 architecture, PSPNet [14] uses 22M additional parameters for context aggregation, while the ASPP and Cascade versions of the Deeplab network utilize 14.5M [7] and 40M [15] additional parameters, respectively.…”
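To put the quoted parameter counts in perspective, the snippet below computes each context-aggregation head's overhead relative to the 44.5M-parameter ResNet-101 backbone; the figures are taken from the excerpt, while the helper name `overhead` is illustrative.

```python
BACKBONE = 44.5e6  # ResNet-101 parameter count, as quoted above

def overhead(extra_params, backbone=BACKBONE):
    """Extra head parameters as a fraction of the backbone size."""
    return extra_params / backbone

heads = {"PSPNet": 22e6, "Deeplab-ASPP": 14.5e6, "Deeplab-Cascade": 40e6}
for name, p in heads.items():
    print(f"{name}: +{overhead(p):.0%} of the backbone")
# PSPNet: +49%, Deeplab-ASPP: +33%, Deeplab-Cascade: +90%
```

The heads alone add between a third and nearly the whole size of the backbone, which is the cost SUNets aim to avoid.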
Many imaging tasks require global information about all pixels in an image. Conventional bottom-up classification networks globalize information by decreasing resolution; features are pooled and downsampled into a single output. But for semantic segmentation and object detection tasks, a network must provide higher-resolution pixel-level outputs. To globalize information while preserving resolution, many researchers propose the inclusion of sophisticated auxiliary blocks, but these come at the cost of a considerable increase in network size and computational cost. This paper proposes stacked u-nets (SUNets), which iteratively combine features from different resolution scales while maintaining resolution. SUNets leverage the information globalization power of u-nets in a deeper network architecture that is capable of handling the complexity of natural images. SUNets perform extremely well on semantic segmentation tasks using a small number of parameters. The code is available at https://github.com/shahsohil/sunets.
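The u-net principle that SUNets stack, pool to a coarse scale and then upsample and merge so the output keeps full resolution, can be sketched with a 1-D pure-Python toy. All helper names here (`downsample`, `upsample`, `tiny_unet`) are hypothetical and merely illustrate the mechanism, not the paper's implementation.

```python
def downsample(x):
    """Average adjacent pairs (assumes even length): globalizes information."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def upsample(x):
    """Nearest-neighbour repeat back to double length: restores resolution."""
    return [v for v in x for _ in range(2)]

def tiny_unet(x):
    coarse = downsample(x)                    # each value summarizes 2 inputs
    fused = upsample(coarse)                  # back to the input resolution
    return [a + b for a, b in zip(x, fused)]  # skip connection merges scales

print(tiny_unet([1.0, 3.0, 2.0, 4.0]))  # [3.0, 5.0, 5.0, 7.0]
```

Each output value mixes its own input with a coarse summary of its neighbourhood, so the result is both globalized and full-resolution; stacking several such units deepens the mixing without shrinking the output.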
Automatic and accurate instance segmentation of teeth can provide important support for computer-aided orthodontic work. Traditional methods for tooth segmentation often ignore the rich structural features of teeth, and capturing the complete, accurate geometry and morphological details of a single tooth remains a challenge for current tooth segmentation studies. In this article, a new tooth segmentation deep-learning network based on capturing dependencies and receptive field adjustment in cone beam computed tomography (CBCT) is proposed to achieve automatic and accurate instance segmentation of dental CBCT data. In the first stage, the method acquires coarse-level tooth features and accurate tooth centroids, providing instance information and spatial localization for each tooth. The encoding process in the second stage of the network introduces a guidance module, based on a 3D self-attention mechanism, for obtaining tooth geometry information and capturing dependencies in CBCT. The proposed tooth feature integration module is based on multiscale fusion of dilated convolutions to capture detailed tooth information at multiple scales, and the network's receptive field is adjusted accordingly. Extensive evaluation, ablation, and comparison experiments demonstrate that our method exhibits state-of-the-art segmentation performance and accurate instance segmentation results, reflecting its potential applicability in clinical medicine.
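The multiscale fusion of dilated convolutions described above can be illustrated with a 1-D pure-Python toy: run the same kernel at several dilation rates, align the branch outputs, and sum them. The helper names (`dilated_conv1d`, `multiscale_fuse`) are hypothetical and stand in for the article's 3-D module, which this sketch does not reproduce.

```python
def dilated_conv1d(x, w, d):
    """Valid 1-D convolution of x with kernel w at dilation d."""
    span = d * (len(w) - 1)
    return [sum(w[k] * x[i + k * d] for k in range(len(w)))
            for i in range(len(x) - span)]

def center_crop(x, n):
    """Trim a branch output to length n, keeping the centre."""
    off = (len(x) - n) // 2
    return x[off:off + n]

def multiscale_fuse(x, w, dilations=(1, 2)):
    """Sum aligned branch outputs; each dilation sees a different scale."""
    branches = [dilated_conv1d(x, w, d) for d in dilations]
    n = min(len(b) for b in branches)
    return [sum(vals) for vals in zip(*(center_crop(b, n) for b in branches))]

print(multiscale_fuse(list(range(8)), [1, 1, 1]))  # [12, 18, 24, 30]
```

The d=1 branch responds to fine local structure while the d=2 branch spans a wider context with the same three weights; fusing them gives the module detail at multiple scales, which is the stated goal of the tooth feature integration module.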