Purpose Automated segmentation of anatomical structures in medical image analysis is a prerequisite for autonomous diagnosis as well as various computer and robot aided interventions. Recent methods based on deep convolutional neural networks (CNN) have outperformed former heuristic methods. However, those methods were primarily evaluated on rigid, real-world environments. In this study, existing segmentation methods were evaluated for their use on a new dataset of transoral endoscopic exploration. Methods Four machine learning based methods SegNet, UNet, ENet and ErfNet were trained with supervision on a novel 7-class dataset of the human larynx. The dataset contains 536 manually segmented images from two patients during laser incisions. The Intersection-over-Union (IoU) evaluation metric was used to measure the accuracy of each method. Data augmentation and network ensembling were employed to increase segmentation accuracy. Stochastic inference was used to show uncertainties of the individual models. Patient-to-patient transfer was investigated using patient-specific fine-tuning. Results In this study, a weighted average ensemble network of UNet and ErfNet was best suited for the segmentation of laryngeal soft tissue with a mean IoU of 84.7 %. The highest efficiency was achieved by ENet with a mean inference time of 9.22 ms per image. It is shown that 10 additional images from a new patient are sufficient for patient-specific fine-tuning. Conclusion CNN-based methods for semantic segmentation are applicable to endoscopic images of laryngeal soft tissue. The segmentation can be used for active constraints or to monitor morphological changes and autonomously detect pathologies. Further improvements could be achieved by using a larger dataset or training the models in a self-supervised manner on additional unlabeled data.
The presented algorithm successfully extends piecewise affine deformation tracking to stereo vision taking the epipolar constraint into account. Improved surgical performance as demonstrated for laser incision planning highlights the potential of presented method regarding further applications in computer-assisted surgery.
A common representation of volumetric medical image data is the triplanar view (TV), in which the surgeon manually selects slices showing the anatomical structure of interest. In addition to common medical imaging such as MRI or computed tomography, recent advances in the field of optical coherence tomography (OCT) have enabled live processing and volumetric rendering of four-dimensional images of the human body. Due to the region of interest undergoing motion, it is challenging for the surgeon to simultaneously keep track of an object by continuously adjusting the TV to desired slices. To select these slices in subsequent frames automatically, it is necessary to track movements of the volume of interest (VOI). This has not been addressed with respect to 4D-OCT images yet. Therefore, this paper evaluates motion tracking by applying state-of-the-art tracking schemes on maximum intensity projections (MIP) of 4D-OCT images. Estimated VOI location is used to conveniently show corresponding slices and to improve the MIPs by calculating thin-slab MIPs. Tracking performances are evaluated on an in-vivo sequence of human skin, captured at 26 volumes per second. Among investigated tracking schemes, our recently presented tracking scheme for soft tissue motion provides highest accuracy with an error of under 2.2 voxels for the first 80 volumes. Object tracking on 4D-OCT images enables its use for sub-epithelial tracking of microvessels for image-guidance.
In this work, we discuss epistemic uncertainty estimation obtained by Bayesian inference in diagnostic classifiers and show that the prediction uncertainty highly correlates with goodness of prediction. We train the ResNet-18 image classifier on a dataset of 84,484 optical coherence tomography scans showing four different retinal conditions. Dropout is added before every building block of ResNet, creating an approximation to a Bayesian classifier. Monte Carlo sampling is applied with dropout at test time for uncertainty estimation. In Monte Carlo experiments, multiple forward passes are performed to get a distribution of the class labels. The variance and the entropy of the distribution is used as metrics for uncertainty. Our results show strong correlation with ρ = 0.99 between prediction uncertainty and prediction error. Mean uncertainty of incorrectly diagnosed cases was significantly higher than mean uncertainty of correctly diagnosed cases. Modeling of the prediction uncertainty in computer-aided diagnosis with deep learning yields more reliable results and is therefore expected to increase patient safety. This will help to transfer such systems into clinical routine and to increase the acceptance of machine learning in diagnosis from the standpoint of physicians and patients.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.