Instrument-tissue Interaction Quintuple Detection in Surgery Videos

Lin, Wen Wei; Hu, Yan; Hao, Luoying; Zhou, Dan; Yang, M.; Fu, Huazhu; Chui, Chee-Kong; Liu, Jiang

doi:10.1007/978-3-031-16449-1_38

Cited by 5 publications

(2 citation statements)

References 33 publications

“…Xu et al (2021) employed a transformer model along with adversarial learning to generate captions, akin to triplets, depicting semantic relationships between components involved in a surgical scene. Lin et al (2022) assigned instrument and target bounding boxes to triplet information and utilized a spatio-temporal graph for instrument-target interaction detection in cataract surgery.…”

Section: Action Triplet: From Recognition To Detection (mentioning)

confidence: 99%

CholecTriplet2022: Show me a tool and tell me the triplet - an endoscopic vision challenge for surgical action triplet detection

Nwoye, Tao, Sharma et al. 2023

Preprint

Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of ⟨instrument, verb, target⟩ triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results, their significance, and useful insights for future research directions and applications in surgery.

“…Actual action localization was offered by the SARAS-ESAD dataset (Bawa et al, 2021; Lin et al, 2022), with bounding boxes pointing to action verbs being performed.…”

Section: Datasets: From Recognition To Detection (mentioning)

confidence: 99%