Rendezvous: Attention mechanisms for the recognition of surgical action triplets in endoscopic videos

Nwoye, Chinedu Innocent; Yu, Tong; González, Cristians; Seeliger, Barbara; Mascagni, Pietro; Mutter, Didier; Marescaux, Jacques; Padoy, Nicolas

doi:10.1016/j.media.2022.102433

Cited by 69 publications

(64 citation statements)

References 32 publications

“…In the surgical action triplet recognition problem, the main task is to recognise the triplet $IVT$, which is the composition of three components. In current state-of-the-art (SOTA) deep models [14], [6], there is a common structure divided into three parts: i) the feature extraction backbone; ii) the individual component encoder; iii) the triplet aggregation decoder, which associates the components and outputs the logits of the $IVT$ triplet. More precisely, the individual component encoder first concentrates on the instrument component to output Class Activation Maps ($\mathrm{CAMs} \in \mathbb{R}^{H \times W \times C_d}$) and the logits $Y_I$ of the instrument classes; the CAMs are then associated with the verb and target components separately to obtain their logits ($Y_V$ and $Y_T$), which addresses the instrument-centric nature of the triplet.…”

Section: A Surgical Action Triplet Recognition (mentioning)

confidence: 99%

“…We do that by analysing the feature based explanations via robustness. To do this, we consider the current three SOTA techniques for our study: Tripnet [14], Attention Tripnet and Rendezvous [6]. Moreover, we extensively investigate the repercussion of deep features using four widely used backbones ResNet-18, ResNet-50 [26], DenseNet-121 [27] and Swin Transformer [28].…”

Section: Hypothesis 21: Deep Features Are Key For Robustness (mentioning)

confidence: 99%

“…$C_I = 6$, $C_V = 10$, $C_T = 15$) generating 900 ($6 \times 10 \times 15$) potential combinations for triplet labels. To maximise the clinical utility, we utilise the top-100 combinations of relevant labels, which are selected by removing a large portion of spurious combinations according to class grouping and surgical relevance rating [6]. Each video contains around 2,000 annotated frames extracted at 1 fps in RGB channels, leading to a total of 90,489 recorded frames.…”

Section: A Dataset Description and Evaluation Protocol (mentioning)

confidence: 99%
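
The label-space arithmetic in the excerpt above can be checked with a short sketch. It enumerates every (instrument, verb, target) combination from the quoted class counts and then filters them with a relevance rule. The names `all_triplets`, `select_triplets` and `toy_rule` are hypothetical, and `toy_rule` is only a placeholder: the actual selection of the 100 retained triplets relies on class grouping and surgical relevance rating as described in [6], which is not reproduced here.

```python
from itertools import product

# Class counts quoted in the excerpt: instruments, verbs, targets.
NUM_INSTRUMENTS, NUM_VERBS, NUM_TARGETS = 6, 10, 15


def all_triplets():
    """Enumerate every (instrument, verb, target) index combination."""
    return list(product(range(NUM_INSTRUMENTS),
                        range(NUM_VERBS),
                        range(NUM_TARGETS)))


def select_triplets(candidates, is_relevant):
    """Keep only the combinations accepted by a relevance rule."""
    return [triplet for triplet in candidates if is_relevant(triplet)]


def toy_rule(triplet):
    """Placeholder rule (an assumption), NOT the clinical criterion of [6]."""
    instrument, verb, target = triplet
    return (instrument + verb + target) % 9 == 0


candidates = all_triplets()
print(len(candidates))  # 900 = 6 * 10 * 15

kept = select_triplets(candidates, toy_rule)

# Map each retained triplet to a contiguous class id for a classifier head.
triplet_to_class = {triplet: idx for idx, triplet in enumerate(kept)}
```

In practice the filter would be replaced by a lookup in the dataset's published triplet list, so that `triplet_to_class` has exactly 100 entries.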

“…Instead, $AP_d$ where $d \in \{I, V, T, IV, IT\}$ cannot be predicted explicitly. Then we obtain the final predictions of the $d \in \{I, V, T, IV, IT\}$ components according to [34], [6]:…”

Section: A Dataset Description and Evaluation Protocol (mentioning)

confidence: 99%

“…Despite the benefits of MIS, surgeons lose direct vision and touch on the target, which decreases surgeon-patient transparency and imposes technical challenges on the surgeon. These challenges have motivated the development of automatic techniques for the analysis of the surgical flow [3], [4], [5], [6]. In particular, this work addresses a key research problem in surgical science: surgical recognition, which provides context-aware support and safety.…”

Section: Introduction (mentioning)

confidence: 99%

Why Deep Surgical Models Fail?: Revisiting Surgical Action Triplet Recognition through the Lens of Robustness

Cheng, Liu, Wang et al. 2022

Preprint

Surgical action triplet recognition provides a better understanding of the surgical scene. This task is of high relevance as it provides the surgeon with context-aware support and safety. The current go-to strategy for improving performance is the development of new network mechanisms. However, the performance of current state-of-the-art techniques is substantially lower than on other surgical tasks. Why is this happening? This is the question that we address in this work. We present the first study to understand the failure of existing deep learning models through the lens of robustness and explainability. Firstly, we study existing models under weak and strong $\delta$-perturbations via an adversarial optimisation scheme. We then provide the failure modes via feature-based explanations. Our study reveals that the key to improving performance and increasing reliability lies in the core and spurious attributes. Our work opens the door to more trustworthy and reliable deep learning models in surgical science.

Automatisierte Klassifizierung von Schäden an Massivbrücken mittels Neuronaler Netze (Automated Classification of Damage to Solid Bridges Using Neural Networks)

Flotzinger, Braml 2022

Beton und Stahlbetonbau

Against the backdrop of an ageing stock of structures and the steady increase in heavy-goods traffic, regular and high-quality structural inspection is indispensable. In meeting this task, the use of digital methods as part of digitalised inspection (DI) offers great potential for improvement in terms of cost-effectiveness and quality. A key component of DI is the automated detection of damage with artificial neural networks. As part of the research project "Modellbasierte digitale Bauwerksprüfung – MoBaP" (model-based digital structural inspection), neural networks for classifying damage to solid bridges are being trained at the Universität der Bundeswehr München. On the currently largest open-source dataset in this domain (CODEBRIM), the network presented below achieves an exact match ratio of 74 % and thereby defines the current best model for multi-target classification. To also train neural networks for object detection and semantic segmentation in this domain, a dedicated dataset is being created. This makes it possible to localise damage in images in addition to classifying it. In this paper the authors discuss the procedure for training neural networks to classify damage to solid bridges and present a detailed analysis of test results. The development and current state of the dedicated dataset are also presented.

Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review

Yang, Huang, Yang et al. 2023

Global Challenges

The explosive growth of biomedical big data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields are discussed: data warehouse and data management, electronic medical record, biomedical imaging informatics, artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graph, public health informatics, and security and privacy.
