Colorectal cancer is a major health problem, where advances towards computer-aided diagnosis (CAD) systems to assist the endoscopist can be a promising path to improvement. Here, a deep learning model for real-time polyp detection based on a pre-trained YOLOv3 (You Only Look Once) architecture and complemented with a post-processing step based on an object-tracking algorithm to reduce false positives is reported. The base YOLOv3 network was fine-tuned using a dataset composed of 28,576 images labelled with locations of 941 polyps that will be made public soon. In a frame-based evaluation using isolated images containing polyps, a general F1 score of 0.88 was achieved (recall = 0.87, precision = 0.89), with lower predictive performance in flat polyps, but higher for sessile, and pedunculated morphologies, as well as with the usage of narrow band imaging, whereas polyp size < 5 mm does not seem to have significant impact. In a polyp-based evaluation using polyp and normal mucosa videos, with a positive criterion defined as the presence of at least one 50-frames-length (window size) segment with a ratio of 75% of frames with predicted bounding boxes (frames positivity), 72.61% of sensitivity (95% CI 68.99–75.95) and 83.04% of specificity (95% CI 76.70–87.92) were achieved (Youden = 0.55, diagnostic odds ratio (DOR) = 12.98). When the positive criterion is less stringent (window size = 25, frames positivity = 50%), sensitivity reaches around 90% (sensitivity = 89.91%, 95% CI 87.20–91.94; specificity = 54.97%, 95% CI 47.49–62.24; Youden = 0.45; DOR = 10.76). The object-tracking algorithm has demonstrated a significant improvement in specificity whereas maintaining sensitivity, as well as a marginal impact on computational performance. These results suggest that the model could be effectively integrated into a CAD system.
Colorectal cancer is one of the most frequent malignancies. Colonoscopy is the de facto standard for precancerous lesion detection in the colon, i.e., polyps, during screening studies or after facultative recommendation. In recent years, artificial intelligence, and especially deep learning techniques such as convolutional neural networks, have been applied to polyp detection and localization in order to develop real-time CADe systems. However, the performance of machine learning models is very sensitive to changes in the nature of the testing instances, especially when trying to reproduce results for totally different datasets to those used for model development, i.e., inter-dataset testing. Here, we report the results of testing of our previously published polyp detection model using ten public colonoscopy image datasets and analyze them in the context of the results of other 20 state-of-the-art publications using the same datasets. The F1-score of our recently published model was 0.88 when evaluated on a private test partition, i.e., intra-dataset testing, but it decayed, on average, by 13.65% when tested on ten public datasets. In the published research, the average intra-dataset F1-score is 0.91, and we observed that it also decays in the inter-dataset setting to an average F1-score of 0.83.
Compi is an application framework to develop end-user, pipeline-based applications with a primary emphasis on: (i) user interface generation, by automatically generating a command-line interface based on the pipeline specific parameter definitions; (ii) application packaging, with compi-dk, which is a version-control-friendly tool to package the pipeline application and its dependencies into a Docker image; and (iii) application distribution provided through a public repository of Compi pipelines, named Compi Hub, which allows users to discover, browse and reuse them easily. By addressing these three aspects, Compi goes beyond traditional workflow engines, having been specially designed for researchers who want to take advantage of common workflow engine features (such as automatic job scheduling or logging, among others) while keeping the simplicity and readability of shell scripts without the need to learn a new programming language. Here we discuss the design of various pipelines developed with Compi to describe its main functionalities, as well as to highlight the similarities and differences with similar tools that are available. An open-source distribution under the Apache 2.0 License is available from GitHub (available at https://github.com/sing-group/compi). Documentation and installers are available from https://www.sing-group.org/compi. A specific repository for Compi pipelines is available from Compi Hub (available at https://www.sing-group.org/compihub.
Deep learning object-detection models are being successfully applied to develop computer-aided diagnosis systems for aiding polyp detection during colonoscopies. Here, we evidence the need to include negative samples for both (i) reducing false positives during the polyp-finding phase, by including images with artifacts that may confuse the detection models (e.g., medical instruments, water jets, feces, blood, excessive proximity of the camera to the colon wall, blurred images, etc.) that are usually not included in model development datasets, and (ii) correctly estimating a more realistic performance of the models. By retraining our previously developed YOLOv3-based detection model with a dataset that includes 15% of additional not-polyp images with a variety of artifacts, we were able to generally improve its F1 performance in our internal test datasets (from an average F1 of 0.869 to 0.893), which now include such type of images, as well as in four public datasets that include not-polyp images (from an average F1 of 0.695 to 0.722).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.