2022 Symposium on Eye Tracking Research and Applications (ETRA 2022)
DOI: 10.1145/3517031.3529628

Can Gaze Inform Egocentric Action Recognition?

Abstract: We investigate the hypothesis that the gaze signal can improve egocentric action recognition on the standard EGTEA Gaze++ benchmark dataset. In contrast to prior work, where the gaze signal was used only during training, we formulate a novel neural fusion approach, Cross-modality Attention Blocks (CMA), to leverage the gaze signal for action recognition during inference as well. CMA combines information from different modalities at different levels of abstraction to achieve state-of-the-art performance for egocentric action recognition…
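For intuition, the sketch below shows one way a cross-modality attention block could fuse a visual token stream with a gaze token stream in PyTorch. The class name, tensor shapes, and residual layout here are illustrative assumptions for a generic cross-attention fusion, not the authors' published CMA architecture.

    import torch
    import torch.nn as nn

    class CrossModalityAttentionBlock(nn.Module):
        # Hypothetical sketch: visual tokens act as queries and gaze tokens
        # as keys/values in a cross-attention layer; all dimensions are
        # assumptions, not values from the paper.
        def __init__(self, dim=256, num_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm_q = nn.LayerNorm(dim)
            self.norm_kv = nn.LayerNorm(dim)
            self.norm_ffn = nn.LayerNorm(dim)
            self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                     nn.Linear(4 * dim, dim))

        def forward(self, visual, gaze):
            # visual: (B, Nv, dim) features from the video backbone
            # gaze:   (B, Ng, dim) embedded gaze samples for the same clip
            q, kv = self.norm_q(visual), self.norm_kv(gaze)
            attended, _ = self.attn(q, kv, kv)  # gaze-conditioned attention
            x = visual + attended               # residual fusion of modalities
            return x + self.ffn(self.norm_ffn(x))

    # Usage: fuse one clip's visual tokens with its gaze trace.
    block = CrossModalityAttentionBlock(dim=256, num_heads=4)
    visual = torch.randn(2, 196, 256)  # e.g. spatio-temporal video tokens
    gaze = torch.randn(2, 32, 256)     # e.g. projected gaze-point embeddings
    fused = block(visual, gaze)        # shape (2, 196, 256)

Because the gaze stream enters the forward pass itself, a block of this kind can exploit gaze at inference time, which is the distinction the abstract draws against training-only uses of gaze.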

Cited by 7 publications (2 citation statements); References 43 publications (72 reference statements).
“…Canedo et al. (2018) presented a model that was theoretically capable of tracking students' attention and gave an overview of the state of the art in computer vision techniques for monitoring classrooms. Zhang et al. (2022) constructed a neural fusion approach, known as Cross-modality Attention Blocks, to leverage the gaze signal for action recognition during inference.…”
Section: Students' Attention in Classroom Teaching (mentioning)
confidence: 99%
“…Outdoor multi-human datasets like 3DPW [110] and MuPoTS [83] have constrained human activities and lack egocentric annotations [5,108], or are limited in diversity [109]. Existing egocentric datasets primarily focus on hand-object interactions and action recognition [3,19,20,27,53,54,56,61,67,85,87,92,101,104,118,128]. Recent datasets like Mo2Cap2 [115], You2Me [86], HPS [35] and EgoBody [123] focus on 3D human pose annotations, but are limited to one or two human subjects and indoor settings.…”
Section: Related Work (mentioning)
confidence: 99%