2018
DOI: 10.1007/978-3-030-00937-3_31

DeepPhase: Surgical Phase Recognition in CATARACTS Videos

Abstract: Automated surgical workflow analysis and understanding can assist surgeons in standardizing procedures and enhance post-surgical assessment and indexing, as well as interventional monitoring. Computer-assisted interventional (CAI) systems based on video can perform workflow estimation through recognition of surgical instruments while linking them to an ontology of procedural phases. In this work, we adopt a deep learning paradigm to detect surgical instruments in cataract surgery videos which in turn feed a surgic…

Cited by 107 publications (84 citation statements)
References 15 publications
“…The multi-task learning of tool and phase recognition requires simultaneous annotations for both tasks on the same dataset, which restricts development to a certain extent. Fortunately, most works regard this as a worthy trade-off: some label both tasks and exploit binary tool usage to address the phase recognition task (Padoy et al (2012); Yu et al (2019)); others are dedicated to establishing more advanced multi-task strategies (Zisimopoulos et al (2018); Nakawala et al (2019)). In addition, more relevant datasets are being released for public use, which alleviates the annotation problem to a great extent (Nakawala et al (2019); Stefanie et al (2018)).…”
Section: Discussion
confidence: 99%
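The joint setup this excerpt describes, one model annotated and trained for both tool presence and phase, can be illustrated as a shared backbone with two heads. The sketch below is a minimal illustration and not any cited paper's actual model: the ResNet-18 backbone, the class counts (21 tools, 14 phases), and the unweighted loss sum are all assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiTaskSurgicalNet(nn.Module):
    """Shared CNN backbone with two heads: multi-label tool presence
    and single-label surgical phase. Class counts are illustrative."""
    def __init__(self, num_tools=21, num_phases=14):
        super().__init__()
        backbone = resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop fc
        feat_dim = backbone.fc.in_features  # 512 for ResNet-18
        self.tool_head = nn.Linear(feat_dim, num_tools)    # sigmoid per tool
        self.phase_head = nn.Linear(feat_dim, num_phases)  # softmax over phases

    def forward(self, frames):                 # frames: (B, 3, H, W)
        f = self.features(frames).flatten(1)   # (B, 512)
        return self.tool_head(f), self.phase_head(f)

# Joint loss: BCE for tool presence, CE for phase; equal weighting is a free choice.
model = MultiTaskSurgicalNet()
tool_logits, phase_logits = model(torch.randn(4, 3, 224, 224))
tool_loss = nn.BCEWithLogitsLoss()(tool_logits, torch.zeros(4, 21))
phase_loss = nn.CrossEntropyLoss()(phase_logits, torch.zeros(4, dtype=torch.long))
loss = tool_loss + phase_loss
```

Tool presence is a multi-label problem (several instruments can be visible at once), hence the sigmoid/BCE head, while phase is single-label per frame, hence the softmax/cross-entropy head.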
“…Although this work has achieved outstanding performance, temporal dependencies, which are crucial for phase analysis, are detached from the unified framework. Zisimopoulos et al (2018) proposed to first train a ResNet to recognize tool presence and then combine the binary tool predictions with tool features from the last layer to train an RNN for phase recognition, which achieved promising results in cataract video analysis. Very recently, Nakawala et al (2019) presented a Deep-Onto network which integrates deep models with ontology and production rules.…”
Section: Multi-task Learning
confidence: 99%
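The two-stage design attributed to Zisimopoulos et al (2018) above, per-frame CNN tool features concatenated with binary tool predictions and fed to an RNN, might look roughly like the following. This is a sketch under stated assumptions, not the paper's configuration: the feature dimension, hidden size, LSTM choice, and class counts are all guesses for illustration.

```python
import torch
import torch.nn as nn

class PhaseRNN(nn.Module):
    """RNN over per-frame inputs built from CNN features concatenated with
    binary tool predictions. All sizes here are assumptions, not the
    cited paper's exact configuration."""
    def __init__(self, feat_dim=512, num_tools=21, hidden=128, num_phases=14):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim + num_tools, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_phases)

    def forward(self, feats, tool_probs):
        # feats: (B, T, feat_dim) per-frame features from a first-stage CNN
        # tool_probs: (B, T, num_tools) sigmoid tool-presence outputs
        x = torch.cat([feats, tool_probs], dim=-1)
        h, _ = self.rnn(x)
        return self.classifier(h)  # (B, T, num_phases) phase logits per frame

# Usage: the first-stage tool CNN is trained separately and kept frozen
# while the RNN is trained on frame sequences.
model = PhaseRNN()
logits = model(torch.randn(2, 30, 512), torch.rand(2, 30, 21))
```

Keeping the first-stage CNN frozen means per-frame features can be precomputed once, which makes training the temporal model over long video sequences cheap.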
“…To extract APMs from video, a number of building-block algorithmic capabilities need to be developed, for example: detection of surgical instrument presence [50], delineation of surgical tools' position and motion [51], segmentation of the surgical site into objects [52] or of the video into key surgical steps [53,54], and activity or significant event detection [55], as well as others such as the detection of critical structures [56]. Marked progress in each of these building blocks of surgical process understanding has taken place in recent years, but a significant challenge remains the availability of large, well-annotated datasets that can be used to evaluate systems in a fair and comparable manner.…”
Section: AI and Machine Learning (ML)
confidence: 99%
“…scheduling, and offline video indexing for educational purposes [7]. Hence, in this study, we focus on real-time surgical tool detection in videos.…”
Section: Introduction
confidence: 99%