Learning laparoscopic video shot classification for gynecological surgery

Petscharnig, Stefan; Schöffmann, Klaus

doi:10.1007/s11042-017-4699-5

Cited by 53 publications

(22 citation statements)

References 29 publications

“…While the aforementioned methods on video classification have achieved state-of-the-art performance on benchmark video recognition datasets like UCF-101 and Sports-1M, work on surgical video recognition has received little attention primarily due to the dearth of large-scale datasets in this domain. Previous works in surgical video analysis have addressed problems like surgical phase recognition [14], surgical gesture classification [15], tool tracking [16], classification of anatomical structures and surgical actions from video shots [17]. The most relevant to this work is the paper by Twinanda et al [4] in which they propose a pipeline for classification of the type of Laparoscopic video, which consists of frame rejection, feature extraction, feature encoding and classification.…”

Section: B Surgery Classification (mentioning)

confidence: 99%

Future-State Predicting LSTM for Early Surgery Type Recognition

Kannan, Yengera, Mutter, et al. 2020

IEEE Trans. Med. Imaging

This work presents a novel approach for the early recognition of the type of a laparoscopic surgery from its video. Early recognition algorithms can be beneficial to the development of 'smart' OR systems that can provide automatic context-aware assistance, and also enable quick database indexing. The task is, however, ridden with challenges specific to videos belonging to the domain of laparoscopy, such as high visual similarity across surgeries and large variations in video durations. To capture the spatio-temporal dependencies in these videos, we choose as our model a combination of a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network. We then propose two complementary approaches for improving early recognition performance. The first approach is a CNN fine-tuning method that encourages surgeries to be distinguished based on the initial frames of laparoscopic videos. The second approach, referred to as 'Future-State Predicting LSTM', trains an LSTM to predict information related to future frames, which helps in distinguishing between the different types of surgeries. We evaluate our approaches on a large dataset of 425 laparoscopic videos containing 9 types of surgeries (Laparo425), and achieve on average an accuracy of 75% having observed only the first 10 minutes of a surgery. These results are quite promising from a practical standpoint and also encouraging for other types of image-guided surgeries.

“…With the comparison of adjacent color histograms and thresholds for significant motion changes, they are able to detect such keypoint moments in laparoscopic surgeries. Content classification has also been addressed recently by Petscharnig and Schoeffmann [37,38], who evaluate well-known convolutional neural network architectures for the purpose of semantic segment annotation.…”

Section: Related Work (mentioning)

confidence: 99%

Video retrieval in laparoscopic video recordings with dynamic content descriptors

Schoeffmann, Husslein, Kletz, et al. 2017

Multimed Tools Appl

Self Cite

In the domain of gynecologic surgery an increasing number of surgeries are performed in a minimally invasive manner. These laparoscopic surgeries require specific psychomotor skills of the operating surgeon, which are difficult to learn and teach. This is the reason why an increasing number of surgeons promote checking video recordings of laparoscopic surgeries for the occurrence of technical errors with surgical actions. This manual surgical quality assessment (SQA) process, however, is very cumbersome and time-consuming when carried out without any support from content-based video retrieval. [...] Descriptor) that can be effectively used to find similar segments in a laparoscopic video database and thereby help surgeons to more quickly inspect other instances of a given error scene. We evaluate the retrieval performance of MIDD with surgical actions from gynecologic surgery in direct comparison to several other dynamic content descriptors. We show that the MIDD descriptor significantly outperforms the state-of-the-art in terms of retrieval performance as well as in terms of runtime performance. Additionally, we release the manually created video dataset of 16 classes of surgical actions from medical laparoscopy to the public, for further evaluations.

“…AlexNet is an architecture based on CNNs that has proven successful in scene classification tasks. It is recognized as an excellent basic level, automatic scene classification technology. While a typical CNN pooling process is non‐overlapping, AlexNet does in fact have an overlapping pooling process.…”

Section: Introduction (mentioning)

confidence: 99%

“…It is recognized as an excellent basic level, automatic scene classification technology. 53,54 While a typical CNN pooling process is non-overlapping, AlexNet does in fact have an overlapping pooling process. This contributes to a higher classification accuracy because more original information is retained.…”

Section: Introduction (mentioning)

confidence: 99%

DC‐AL GAN: Pseudoprogression and true tumor progression of glioblastoma multiform image classification based on DCGAN and AlexNet

Chan, Zhou, et al. 2020

Medical Physics

Purpose: Pseudoprogression (PsP) occurs in 20–30% of patients with glioblastoma multiforme (GBM) after receiving the standard treatment. PsP exhibits similarities in shape and intensity to the true tumor progression (TTP) of GBM on the follow‐up magnetic resonance imaging (MRI). These similarities pose challenges to the differentiation of these types of progression and hence the selection of the appropriate clinical treatment strategy.

Methods: To address this challenge, we introduced a novel feature learning method based on deep convolutional generative adversarial network (DCGAN) and AlexNet, termed DC‐AL GAN, to discriminate between PsP and TTP in MRI images. Due to the adversarial relationship between the generator and the discriminator of DCGAN, high‐level discriminative features of PsP and TTP can be derived for the discriminator with AlexNet. We also constructed a multifeature selection module to concatenate features from different layers, contributing to more powerful features used for effectively discriminating between PsP and TTP. Finally, these discriminative features from the discriminator are used for classification by a support vector machine (SVM). Tenfold cross‐validation (CV) and the area under the receiver operating characteristic (AUC) were applied to evaluate the performance of this developed algorithm.

Results: The accuracy and AUC of DC‐AL GAN for discriminating PsP and TTP after tenfold CV were 0.920 and 0.947. We also assessed the effects of different indicators (such as sensitivity and specificity) for features extracted from different layers to obtain a model with the best classification performance.

Conclusions: The proposed model DC‐AL GAN is capable of learning discriminative representations from GBM datasets, and it achieves desirable PsP and TTP classification performance superior to other state‐of‐the‐art methods. Therefore, the developed model would be useful in the diagnosis of PsP and TTP for GBM.
