Encoder-decoder networks are state-of-the-art approaches to biomedical image segmentation, but they have two problems: the widely used pooling operations may discard spatial information, and low-level semantics are lost as a result. Feature fusion methods can mitigate these problems, but feature maps of different scales cannot be easily fused because down- and upsampling change the spatial resolution of the feature maps. To address these issues, we propose INet, which enlarges receptive fields by increasing the kernel sizes of convolutional layers in steps (e.g., from 3 × 3 to 7 × 7 and then 15 × 15) instead of downsampling. Inspired by the Inception module, INet extracts features with kernels of different sizes by concatenating the output feature maps of all preceding convolutional layers. We also find that the large kernel makes the network feasible for biomedical image segmentation. In addition, INet uses two overlapping max-poolings, i.e., max-poolings with stride 1, to extract the sharpest features. Fixed-size and fixed-channel feature maps enable INet to concatenate feature maps and add multiple shortcuts across layers. In this way, INet can recover low-level semantics by concatenating the feature maps of all preceding layers and expedite training by adding multiple shortcuts. Because INet has additional residual shortcuts, we compare INet with a UNet system that also has residual shortcuts (ResUNet). To confirm INet as a backbone architecture for biomedical image segmentation, we implement dense connections on INet (called DenseINet) and compare it to a DenseUNet system with residual shortcuts (ResDenseUNet). INet and DenseINet require 16.9% and 37.6% fewer parameters than ResUNet and ResDenseUNet, respectively. In comparison with six encoder-decoder approaches using nine public datasets, INet and DenseINet demonstrate efficient improvements in biomedical image segmentation. INet outperforms DeepLabV3, which implements atrous convolution instead of downsampling to increase receptive fields. INet also outperforms two recent methods (HRNet and MS-NAS) that maintain high-resolution representations and repeatedly exchange information across resolutions.
INDEX TERMS: Biomedical image, convolutional networks, encoder-decoder networks, semantic segmentation.
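The fixed-resolution argument in the abstract above can be checked with the standard output-size formula for convolution and pooling layers. The following is a minimal sketch, not the authors' implementation: the 128-pixel input, the 3 × 3 pooling window, the "same" padding, and the strictly sequential stacking of the 3, 7 and 15 kernels are all assumptions made for illustration, since the abstract does not specify them.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (standard formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def receptive_field(kernels):
    """Receptive field of stacked stride-1 layers: 1 + sum of (k - 1)."""
    return 1 + sum(k - 1 for k in kernels)


size = 128  # hypothetical input width/height, not taken from the paper

# Stride-1 convolutions with "same" padding (assumed) keep the resolution fixed,
# so their outputs can be concatenated channel-wise without resampling.
same_sizes = [output_size(size, k, stride=1, padding=(k - 1) // 2) for k in (3, 7, 15)]

# Overlapping max-pooling: stride 1 (from the abstract), 3 x 3 window (assumed).
overlap_pool = output_size(size, 3, stride=1, padding=1)

# Conventional 2 x 2, stride-2 max-pooling halves the resolution instead.
strided_pool = output_size(size, 2, stride=2)

# Receptive field if the 3, 7 and 15 kernels were applied one after another.
rf = receptive_field([3, 7, 15])

print(same_sizes, overlap_pool, strided_pool, rf)
```

Under these assumptions every stride-1 layer returns a 128-pixel map, whereas the stride-2 pooling returns 64, which is why the latter needs resampling before feature maps can be fused.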
Background and Study Aims: Small polyps are occasionally missed during colonoscopy. This study was conducted to validate the diagnostic performance of a polyp-detection algorithm to alert endoscopists to unrecognized lesions. Methods: A computer-aided detection (CADe) algorithm was developed based on convolutional neural networks using training data from 1991 still colonoscopy images from 283 subjects with adenomatous polyps. The CADe algorithm was evaluated on a validation dataset including 50 short videos with 1-2 polyps (3.5 ± 1.5 mm, range 2-8 mm) and 50 videos without polyps. Two expert colonoscopists and two physicians in training separately read the same videos, blinded to the presence of polyps. The CADe algorithm was also evaluated using eight full videos with polyps and seven full videos without a polyp. Results: The per-video sensitivity of CADe for polyp detection was 88% and the per-frame false-positive rate was 2.8%, with a confidence level of ≥30%. The per-video sensitivity of both experts was 88%, and the sensitivities of the two physicians in training were 84% and 76%. For each reader, the frames with missed polyps appearing on short videos were significantly fewer than the frames with detected polyps, but no trends were observed regarding polyp size, morphology or color. For full video readings, per-polyp sensitivity was 100% with a per-frame false-positive rate of 1.7% and a per-frame specificity of 98.3%. Conclusions: The sensitivity of CADe to detect small polyps was almost equivalent to that of experts and superior to that of physicians in training. A clinical trial using CADe is warranted.
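The reported rates can be related to one another with simple arithmetic. The sketch below is illustrative only: it assumes that per-video sensitivity was computed over the 50 polyp-containing short videos, which the abstract implies but does not state, so the derived video counts are not reported results. The reader labels are hypothetical names.

```python
# Per-frame specificity is the complement of the per-frame false-positive rate.
fpr_full_video = 0.017  # reported for full video readings
specificity_full_video = 1 - fpr_full_video

# Assumed denominator: the 50 short videos that contain polyps.
n_polyp_videos = 50
sensitivities = {
    "CADe": 0.88,
    "experts": 0.88,
    "trainee_1": 0.84,  # hypothetical label for a physician in training
    "trainee_2": 0.76,  # hypothetical label for a physician in training
}

# Number of polyp-containing videos implied by each sensitivity (derived, not reported).
implied_detected = {name: round(s * n_polyp_videos) for name, s in sensitivities.items()}

print(f"specificity = {specificity_full_video:.1%}")
print(implied_detected)
```

The complement reproduces the 98.3% specificity stated in the abstract; the implied counts hold only if the assumed denominator is correct.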