2021
DOI: 10.1007/978-3-030-87202-1_60

Multi-view Surgical Video Action Detection via Mixed Global View Attention

Cited by 14 publications (17 citation statements)
References 25 publications
“…In these settings, action recognition is typically performed on short image sequences, referred to as "clips", which average only 10 seconds in length for Kinetics [17]. Surgical workflow analysis from room cameras is currently limited to one large dataset, which is not publicly available [30,28]. Despite publishing both RGB and depth sequences, the MVOR [31] dataset consists of 732 frames at relatively low FPS, making video action recognition infeasible.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the medical domain, events of interest may appear indistinguishable on such a short timescale, requiring the aggregation of additional temporal information. Clip-level architectures, however, remain relevant even for longer videos, as they can be similarly combined with LSTMs to aggregate both short and long-term image features [28]. Extracting clip features can provide performance benefits as they reduce dimensionality compared to image level features [30].…”
Section: Related Work (mentioning)
confidence: 99%
“…Several works have used multi-angle photography during MIS, with the internal camera capturing the tools being used, and the other cameras identifying the area surrounding the surgery. Using such systems enables detection of automatic activity [15], and inoperative errors and distractions, as well as the evaluation of skill level [16,17]. In another open surgery study, multiple cameras were placed in the surgical lamp [18,19].…”
Section: Introduction (mentioning)
confidence: 99%