2022
DOI: 10.1007/978-3-031-19778-9_38

My View is the Best View: Procedure Learning from Egocentric Videos

Abstract: Given multiple videos of the same task, procedure learning addresses identifying the key-steps and determining their order to perform the task. For this purpose, existing approaches use the signal generated from a pair of videos. This makes key-steps discovery challenging as the algorithms lack inter-videos perspective. Instead, we propose an unsupervised Graph-based Procedure Learning (GPL) framework. GPL consists of the novel UnityGraph that represents all the videos of a task as a graph to obtain both intra…
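The abstract describes the central design choice of GPL: instead of reasoning over pairs of videos, every video of a task is placed into one graph (UnityGraph) so that both intra-video and inter-video relations are available for key-step discovery. The excerpt does not give the actual construction, so the sketch below is only a hypothetical illustration of that idea: it assumes pre-computed frame embeddings, temporal edges between consecutive frames of the same video, and k-nearest-neighbour cosine-similarity edges across videos. The function name, parameters, and edge rules are assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch of "all videos of a task in one graph" (not the paper's
# actual UnityGraph construction): temporal edges within each video plus
# k-NN similarity edges across videos, over pre-computed frame embeddings.
import numpy as np

def build_task_graph(video_embeddings, k=5):
    """video_embeddings: list of (T_i, D) arrays, one per video of the task.

    Returns node features (N, D) and an edge list over all frames, where
    intra-video edges link temporally adjacent frames and inter-video edges
    link each frame to its k most similar frames in the other videos.
    """
    nodes = np.concatenate(video_embeddings, axis=0)           # (N, D)
    offsets = np.cumsum([0] + [v.shape[0] for v in video_embeddings])
    edges = []

    # Intra-video (temporal) edges: frame t <-> frame t+1 within each video.
    for vid, emb in enumerate(video_embeddings):
        start = offsets[vid]
        for t in range(emb.shape[0] - 1):
            edges.append((start + t, start + t + 1))

    # Inter-video (semantic) edges: connect each frame to its k nearest
    # neighbours (cosine similarity) among frames of the *other* videos.
    normed = nodes / (np.linalg.norm(nodes, axis=1, keepdims=True) + 1e-8)
    sim = normed @ normed.T                                     # (N, N)
    for vid in range(len(video_embeddings)):
        lo, hi = offsets[vid], offsets[vid + 1]
        sim[lo:hi, lo:hi] = -np.inf                             # mask own video
    for i in range(nodes.shape[0]):
        for j in np.argsort(-sim[i])[:k]:
            edges.append((i, int(j)))

    return nodes, edges

# Example with two short videos of 512-D frame embeddings (random, for illustration):
# nodes, edges = build_task_graph([np.random.randn(40, 512), np.random.randn(55, 512)])
```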

Cited by 19 publications (8 citation statements)
References 86 publications

Citation statements, ordered by relevance:
“…It has 11 actions and on average 33 segments per video. EgoProceL [3] is an egocentric dataset featuring diverse tasks, such as repairing cars, assembling toys and cooking. It has 1055 videos, 130 actions and on average 21 segments per video.…”
Section: Methods (mentioning)
confidence: 99%
“…The result of UVAST is not reported on EPIC-Kitchen as we found its sequence decoder has difficulty learning the large number of segments in the videos and thus cannot converge well. We include more implementation details in the supplementary materials.…”
Section: Methods (mentioning)
confidence: 99%
“…Another approach to egocentric action recognition is to consider it as a procedural problem and learn the key steps required to perform a task upon observing multiple egocentric videos, as done in Bansal et al. (2022). This work is restricted to procedural tasks but is an avenue for exploration as opposed to recognising isolated actions.…”
Section: Action Recognition (mentioning)
confidence: 99%
“…Procedure Learning from Instructional Videos. Recent works have attempted to learn procedures from instructional videos [2,5,13,19,27]. Most notably, [5] generates a sequence of actions given a start and a goal image.…”
Section: Related Work (mentioning)
confidence: 99%
“…Most notably, [5] generates a sequence of actions given a start and a goal image. [2] finds temporal correspondences between key steps across multiple videos while [19] distinguishes pairs of videos performing the same sequence of actions from negative ones. [13] uses distant supervision from WikiHow to localize steps in instructional videos.…”
Section: Related Work (mentioning)
confidence: 99%