In recent years, tremendous progress has been made in surgical practice, for example with Minimally Invasive Surgery (MIS). To overcome the challenges arising from remote eye-to-hand manipulation, robotic and computer-assisted systems have been developed. Having real-time knowledge of the pose of surgical tools with respect to the surgical camera and underlying anatomy is a key ingredient for such systems. In this paper, we present a review of the literature dealing with vision-based and marker-less surgical tool detection. This paper includes three primary contributions: (1) identification and analysis of data-sets used for developing and testing detection algorithms, (2) an in-depth comparison of surgical tool detection methods, from the feature extraction process to the model learning strategy, highlighting existing shortcomings, and (3) analysis of validation techniques employed to obtain detection performance results and establish comparison between surgical tool detectors. The papers included in the review were selected through PubMed and Google Scholar searches using the keywords "surgical tool detection", "surgical tool tracking", "surgical instrument detection" and "surgical instrument tracking", limiting results to the years 2000 to 2015. Our study shows that despite significant progress over the years, the lack of established surgical tool data-sets and of a reference format for performance assessment and method ranking is preventing faster improvement.
Dexterity and procedural knowledge are two critical skills surgeons need to master to perform accurate and safe surgical interventions. However, current training systems do not provide an in-depth analysis of surgical gestures that would allow these skills to be assessed precisely. Our objective is to develop a method for the automatic and quantitative assessment of surgical gestures. To reach this goal, we propose a new unsupervised algorithm that can automatically segment kinematic data from robotic training sessions. Without relying on any prior information or model, this algorithm detects critical points in the kinematic data which define relevant spatio-temporal segments. Based on the association of these segments, we obtain an accurate recognition of the gestures involved in the surgical training task. We then perform an advanced analysis and assess our algorithm using datasets recorded during real expert training sessions. After comparing our approach with the manual annotations of the surgical gestures, we observe 97.4% accuracy for the learning purpose and an average matching score of 81.9% for the fully automated gesture recognition process. Our results show that trainees' workflow can be followed and surgical gestures may be automatically evaluated according to an expert database. This approach tends towards improving training efficiency by minimizing the learning curve.
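The abstract above does not say how critical points are detected, so the following is only an illustrative sketch of one possible criterion, not the authors' algorithm. It assumes that a critical point is a local minimum of tool-tip speed that falls below a threshold; the function names, the threshold and the sample data are all hypothetical.

```python
def speed_profile(positions, dt):
    """Finite-difference speed between consecutive 3D position samples."""
    speeds = []
    for p, q in zip(positions, positions[1:]):
        dist = sum((b - a) ** 2 for a, b in zip(p, q)) ** 0.5
        speeds.append(dist / dt)
    return speeds


def critical_points(speeds, threshold):
    """Indices where speed is a local minimum and below `threshold`.

    Assumption: slow-downs mark gesture boundaries. The abstract only
    says that critical points are detected, not which criterion is used.
    """
    indices = []
    for i in range(1, len(speeds) - 1):
        is_local_min = speeds[i] <= speeds[i - 1] and speeds[i] < speeds[i + 1]
        if is_local_min and speeds[i] < threshold:
            indices.append(i)
    return indices


def segments(num_samples, cuts):
    """Turn cut indices into half-open (start, end) segments."""
    bounds = [0] + list(cuts) + [num_samples]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical speed trace with two clear slow-downs.
speeds = [1.0, 0.5, 0.1, 0.6, 1.2, 0.2, 0.9]
cuts = critical_points(speeds, threshold=0.3)
parts = segments(len(speeds), cuts)
```

Each resulting segment could then be passed to a separate step that associates segments with gestures; that step is not sketched here.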
Detecting tools in surgical videos is an important ingredient for context-aware computer-assisted surgical systems. To this end, we present a new surgical tool detection dataset and a method for joint tool detection and pose estimation in 2D images. Our two-stage pipeline is data-driven and relaxes strong assumptions made by previous works regarding the geometry, number, and position of tools in the image. The first stage classifies each pixel based on local appearance only, while the second stage evaluates a tool-specific shape template to enforce global shape. Both local appearance and global shape are learned from training data. Our method is validated on a new surgical tool dataset of 2 476 images from neurosurgical microscopes, which is made freely available. It improves over existing datasets in size, diversity and detail of annotation. We show that our method significantly improves over competitive baselines from the computer vision field. We achieve a 15% detection miss-rate at 10⁻¹ false positives per image (for the suction tube) over our surgical tool dataset. Results indicate that performing semantic labelling as an intermediate task is key for high quality detection.
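To make the reported metric concrete, here is a minimal sketch of how a miss-rate at a fixed number of false positives per image (FPPI) is commonly computed. It assumes that detections have already been matched to ground-truth tools, so each one is a (score, is_true_positive) pair; the matching rule, the function name and the sample numbers are assumptions, not taken from the paper.

```python
def miss_rate_at_fppi(detections, num_ground_truth, num_images, target_fppi):
    """Lowest miss-rate reachable while keeping FPPI <= target_fppi.

    detections: list of (score, is_true_positive) pairs pooled over all
    images. Sweeping the score threshold from high to low accepts one
    detection at a time. Tied scores are not treated specially here.
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    true_pos = 0
    false_pos = 0
    best = 1.0  # accepting no detection at all misses every tool
    for _score, is_tp in ordered:
        if is_tp:
            true_pos += 1
        else:
            false_pos += 1
        if false_pos / num_images > target_fppi:
            break  # the false-positive budget is exhausted
        best = 1.0 - true_pos / num_ground_truth
    return best


# Hypothetical example: 10 images, 5 annotated tools, 6 detections.
detections = [(0.9, True), (0.8, True), (0.7, False),
              (0.6, True), (0.5, False), (0.4, True)]
result = miss_rate_at_fppi(detections, num_ground_truth=5,
                           num_images=10, target_fppi=0.1)
```

Fixing the false-positive budget per image makes detectors comparable at the same operating point, which is why results of this kind quote a miss-rate together with an FPPI value.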
The need for a better integration of the new generation of computer-assisted surgical systems has recently been emphasized. One requirement for achieving this objective is to retrieve data from the operating room (OR) with different sensors, then to derive models from these data. Recently, the use of videos from cameras in the OR has demonstrated its efficiency. In this paper, we propose a framework to assist in the development of systems for the automatic recognition of high-level surgical tasks using microscope video analysis. We validated its use on cataract procedures. The idea is to combine state-of-the-art computer vision techniques with time series analysis. The first step of the framework consisted of defining several visual cues for extracting semantic information, thereby characterizing each frame of the video. Five different image-based classifiers were therefore implemented. A pupil segmentation step was also applied for dedicated visual cue detection. Time series classification algorithms were then applied to model time-varying data. Dynamic time warping and hidden Markov models were tested. This association combined the advantages of all methods for better understanding of the problem. The framework was finally validated through various studies. Six binary visual cues were chosen along with 12 phases to detect, obtaining accuracies of 94%.
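As an illustration of the time series step, the sketch below implements textbook dynamic time warping (DTW) between two sequences of binary visual-cue vectors. The per-frame Hamming distance, the use of two cues instead of six, and the sample sequences are assumptions made for brevity; the abstract does not specify these details.

```python
def hamming(frame_a, frame_b):
    """Number of binary cues that differ between two frames."""
    return sum(x != y for x, y in zip(frame_a, frame_b))


def dtw_distance(seq_a, seq_b, dist=hamming):
    """Classic dynamic time warping cost between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j]: best alignment cost of the first i and first j frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]


# Hypothetical recordings with two binary cues per frame.
reference = [(0, 0), (0, 0), (1, 0), (1, 1)]
same_order_other_timing = [(0, 0), (1, 0), (1, 0), (1, 1)]
different_content = [(1, 1), (1, 1)]

d_same = dtw_distance(reference, same_order_other_timing)
d_diff = dtw_distance(reference, different_content)
```

Because DTW stretches and compresses the time axis, two recordings that show the same cue changes at different speeds get a low cost, which suits procedures whose phases vary in duration.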
Compared to the analysis based on one data type only, a combination of visual features and instrument signals allows better segmentation, reduction of the detection delay and discovery of the correct phase order.
The addition of human knowledge to traditional bottom-up approaches based on image analysis appears to be promising for low-level task detection. The results of this work could be used for the automatic indexation of post-operative videos.
Purpose: Smaller incisions and reduced surgical trauma made minimally invasive surgery (MIS) grow in popularity even though long training is required to master the instrument manipulation constraints. While numerous training systems have been developed in the past, very few of them tackled fetal surgery and more specifically the treatment of twin-twin transfusion syndrome (TTTS). To address this lack of training resources, this paper presents a novel mixed-reality surgical trainer equipped with comprehensive sensing for TTTS procedures. The proposed trainer combines the benefits of box trainer technology and virtual reality systems. Face and content validation studies are presented and a use-case highlights the benefits of having embedded sensors. Methods: Face and content validity of the developed setup was assessed by asking surgeons from the field of fetal MIS to accomplish specific tasks on the trainer. A small use-case investigates whether the trainer sensors are able to distinguish between an easy and a difficult scenario. Results: The trainer was deemed sufficiently realistic and its proposed tasks relevant for practicing the required motor skills. The use-case demonstrated that the motion and force sensing capabilities of the trainer were able to analyze surgical skill. Conclusion: The developed trainer for fetal laser surgery was validated by surgeons from a specialized center in fetal medicine. Further similar investigations in other centers are of interest, as well as quality improvements which will make it possible to increase the difficulty of the trainer. The comprehensive sensing appeared to be capable of objectively assessing skill.
Treatment decisions for patients with presumed glioblastoma are based on tumor characteristics available from a preoperative MR scan. Tumor characteristics, including volume, location, and resectability, are often estimated or manually delineated. This process is time consuming and subjective. Hence, comparisons across cohorts, trials, or registries are subject to assessment bias. In this study, we propose a standardized Glioblastoma Surgery Imaging Reporting and Data System (GSI-RADS) based on an automated method of tumor segmentation that provides standard reports on tumor features that are potentially relevant for glioblastoma surgery. As clinical validation, we determine the agreement in extracted tumor features between the automated method and the current standard of manual segmentations from routine clinical MR scans before treatment. In an observational consecutive cohort of 1596 adult patients with a first-time surgery of a glioblastoma from 13 institutions, we segmented gadolinium-enhanced tumor parts both by a human rater and by an automated algorithm. Tumor features were extracted from segmentations of both methods and compared to assess differences, concordance, and equivalence. The laterality, contralateral infiltration, and the laterality indices were in excellent agreement. The native and normalized tumor volumes had excellent agreement, consistency, and equivalence. Multifocality, but not the number of foci, had good agreement and equivalence. The location profiles of cortical and subcortical structures were in excellent agreement. The expected residual tumor volumes and resectability indices had excellent agreement, consistency, and equivalence. Tumor probability maps were in good agreement. In conclusion, automated segmentations are in excellent agreement with manual segmentations and practically equivalent regarding tumor features that are potentially relevant for neurosurgical purposes.
Standard GSI-RADS reports can be generated by open access software.
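As a small illustration of comparing a manual and an automated segmentation, the sketch below computes the tumor volume from a binary voxel mask and the Dice overlap between two masks. This is not the GSI-RADS software: the nested-list mask layout, the voxel spacing and the function names are hypothetical, and the Dice coefficient is a commonly used overlap measure that the abstract itself does not name.

```python
def flatten(mask):
    """Flatten a 3D nested-list mask (planes of rows of 0/1 voxels)."""
    return [v for plane in mask for row in plane for v in row]


def volume_ml(mask, spacing_mm):
    """Volume of the segmented voxels in millilitres.

    spacing_mm: (x, y, z) voxel size in millimetres; 1 ml = 1000 mm^3.
    """
    sx, sy, sz = spacing_mm
    return sum(flatten(mask)) * sx * sy * sz / 1000.0


def dice(mask_a, mask_b):
    """Dice overlap of two equally shaped binary masks (1.0 if both empty)."""
    flat_a, flat_b = flatten(mask_a), flatten(mask_b)
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(flat_a) + sum(flat_b)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x2x2 masks: the manual rater marked 3 voxels, the
# automated method marked 2 of the same voxels.
manual = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
automated = [[[1, 1], [0, 0]], [[0, 0], [0, 0]]]
spacing = (1.0, 1.0, 2.0)

vol_manual = volume_ml(manual, spacing)
vol_automated = volume_ml(automated, spacing)
overlap = dice(manual, automated)
```

Volume differences and overlap answer different questions: two segmentations can have equal volumes while covering different tissue, so both are worth reporting when judging agreement.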