2020
DOI: 10.48550/arxiv.2006.16166
Preprint

Automatic Operating Room Surgical Activity Recognition for Robot-Assisted Surgery

Abstract: Automatic recognition of surgical activities in the operating room (OR) is a key technology for creating next generation intelligent surgical devices and workflow monitoring/support systems. Such systems can potentially enhance efficiency in the OR, resulting in lower costs and improved care delivery to the patients. In this paper, we investigate automatic surgical activity recognition in robot-assisted operations. We collect the first large-scale dataset including 400 full-length multiperspective videos from …

Cited by 2 publications (6 citation statements)
References 24 publications
“…RGB and depth images are fused by concatenation to generate a four-channel image according to previous early fusion approaches [34]. Furthermore, we attempt to reproduce the depth-ir fusion from [30,28], by color-coding the depth data and alpha-blending them with the raw IR images. All images are augmented using a small random affine transformation followed by cropping the 224x224 pixel center.…”
Section: Methods (mentioning)
confidence: 99%
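
The fusion and cropping steps described in the statement above can be illustrated with a minimal NumPy sketch. This is not the citing authors' code. The function names, the `alpha=0.5` blending weight, and the array layouts (height x width x channels, float values) are assumptions made for illustration. The depth image is assumed to be color-coded already, and the random affine augmentation is left out.

```python
import numpy as np


def early_fuse_rgbd(rgb, depth):
    """Concatenate an HxWx3 RGB image and an HxW depth map into an HxWx4 image."""
    if rgb.shape[:2] != depth.shape[:2]:
        raise ValueError("rgb and depth must share height and width")
    # Add a trailing channel axis to depth, then stack along the channel axis.
    return np.concatenate([rgb, depth[..., None]], axis=-1)


def alpha_blend(color_depth, ir, alpha=0.5):
    """Blend a color-coded depth image (HxWx3) with a raw IR image (HxW).

    alpha is a hypothetical weight; the quoted statement does not give one.
    """
    # Repeat the single IR channel three times so both inputs are HxWx3.
    ir3 = np.repeat(ir[..., None], 3, axis=-1)
    return alpha * color_depth + (1.0 - alpha) * ir3


def center_crop(img, size=224):
    """Return the size x size center region of an HxWxC (or HxW) image."""
    h, w = img.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]
```

For a 240x320 input, `center_crop(early_fuse_rgbd(rgb, depth))` yields a 224x224 image with four channels, which matches the crop size and channel count named in the statement.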
“…In these settings, action recognition is typically performed on short image sequences, referred to as "clips", which average only 10 seconds in length for Kinetics [17]. Surgical workflow analysis from room cameras is currently limited to one large dataset, which is not publicly available [30,28]. Despite publishing both RGB and depth sequences, the MVOR [31] dataset consists of 732 frames at relatively low FPS, making video action recognition infeasible.…”
Section: Related Work (mentioning)
confidence: 99%