2022
DOI: 10.48550/arxiv.2202.05821
Preprint

PEg TRAnsfer Workflow recognition challenge report: Does multi-modal data improve recognition?

Abstract: Background and Objective: Context-aware computer-assisted surgical systems require real-time, automatic, and accurate surgical workflow recognition. For several years, video has been the most common modality used to develop such methods. With the democratization of robotic-assisted surgery and segmentation methods, new modalities are now accessible, such as kinematics. Some previous works used these new modalities as input to their models, but the added value of these modalities is rarely studied. This paper pr…

Cited by 4 publications (5 citation statements)
References: 30 publications
“…Note that there are inconsistencies in label and granularity definitions across datasets. For example, the tasks of Suturing, Knot Tying, and Peg Transfer in JIGSAWS and DESK are considered phases in MISAW [31] and PETRAW [36]. [13] trained a GRU for gesture and maneuver recognition on the JIGSAWS and MISTIC-SL datasets, respectively.…”
Section: Related Work
“…Interestingly, [31] found that multi-granularity recognition models performed better because such models may be learning that certain activities only occur during specific phases and steps. Also, recent works on action triplet recognition in laparoscopic procedures focus on concurrent phase, step, and action recognition [36]. The poor performance of activity recognition models is a barrier to clinical applications, but understanding the relationship between granularity levels can address this challenge and guide model development.…”
Section: Related Work
“…It has been suggested that surgical gesture recognition can be learned from optical flow data alone, highlighting the importance of motion cues for action recognition [27]. Combining modalities from different sensors continues to be an active topic in the medical domain [13], as data stemming from various sources become ubiquitous in the OR. Fusion strategies for RGB-D data are being explored at length for many applications such as depth estimation [16], 6DoF pose estimation [26,3], and object classification [29].…”
Section: Tools
“…The analysis of surgical videos is no longer limited to medical devices such as endoscopic cameras -in the past years, several works have explored the use of ceiling-mounted cameras in an effort to understand OR workflows from an outside perspective. As the amount of data stemming from OR sensors increases, new questions arise, such as how to best integrate various modalities into automated surgical systems [13] or where to optimally place cameras for specific tasks [11,18]. This study seeks to understand which camera modalities are best suited for surgical action recognition, exploring their relative performance in a unique set of multi-view surgical recordings.…”
Section: Introduction
“…Recently, several research teams have worked on developing datasets at large scale [1,3,31], but most are designed and annotated for only one specific task. In terms of clinical applicability, data from different modalities are needed to better understand the whole scenario, make proper decisions, and enrich perception with a multi-task learning strategy [9,16]. Besides, there are few datasets designed for automation tasks in surgical applications, among which automatic laparoscopic field-of-view (FoV) control is a popular topic, as it can liberate the assistant from such tedious manipulations with help from surgical robots [5].…”
Section: Introduction