2015
DOI: 10.1007/s11263-015-0851-8

Recognizing Fine-Grained and Composite Activities Using Hand-Centric Features and Script Data

Abstract: Activity recognition has shown impressive progress in recent years. However, the challenges of detecting fine-grained activities and understanding how they are combined into composite activities have been largely overlooked. In this work we approach both tasks and present a dataset which provides detailed annotations to address them. The first challenge is to detect fine-grained activities, which are defined by low interclass variability and are typically characterized by fine-grained body motions. We explore h…

Cited by 147 publications
(113 citation statements)
References 96 publications
“…In [10,22], human actions are modeled as space-time structures, using deformable part models [2]. In [15,18] discriminative handcentric features are explored for fine grained activity detection in cooking, i.e., relatively short sub-activities such as chop and fill. In [3], the detector is trained on CNN features extracted from the action tubes in space-time; however, evaluation is on relatively short video clips (i.e., several hundred frames) of relatively short actions.…”
Section: Related Work (mentioning)
confidence: 99%
“…Understanding human activities as a fine-grained recognition problem has been explored for some domain specific tasks [20,24]. For example, some works have been proposed for hand-gesture recognition [10,17,22], daily life activity recognition [21] and sports understanding [7,27,3,1]. All these works build ad hoc solutions specific to the action domain they are addressing.…”
Section: Related Work (mentioning)
confidence: 99%
“…Therefore, using only frame wise metrics is insufficient to fully describe performance. Considering this, we also use the segmentation metrics: mean average precision with midpoint hit criterion (mAP@mid) [45,39], Segmental F1 score (F1@k) [23] and segmental edit score (edit) [25].…”
Section: Metrics (mentioning)
confidence: 99%
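
The last citation statement names the segmental edit score without defining it. As a rough illustration only, the sketch below follows the definition commonly used in the action segmentation literature: per-frame labels are collapsed into an ordered list of segment labels, and the score is the Levenshtein distance between the predicted and ground-truth lists, normalized by the longer list and reported on a 0 to 100 scale. The function names (`segment_labels`, `levenshtein`, `segmental_edit_score`) and the example labels are hypothetical, and the citing paper may differ in details such as background handling.

```python
from itertools import groupby


def segment_labels(frame_labels):
    """Collapse per-frame labels into the ordered list of segment labels."""
    return [label for label, _ in groupby(frame_labels)]


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def segmental_edit_score(predicted, ground_truth):
    """Assumed definition: 100 * (1 - normalized edit distance over segments)."""
    p = segment_labels(predicted)
    g = segment_labels(ground_truth)
    longest = max(len(p), len(g))
    if longest == 0:
        return 100.0  # two empty sequences are treated as a perfect match
    return 100.0 * (1.0 - levenshtein(p, g) / longest)


# Segment boundaries are shifted, but the segment order is correct.
shifted = segmental_edit_score(["chop"] * 3 + ["fill"] * 2,
                               ["chop"] * 2 + ["fill"] * 3)

# Over-segmentation: the prediction has two extra segments.
oversegmented = segmental_edit_score(["chop", "fill", "chop", "fill"],
                                     ["chop", "chop", "fill", "fill"])
```

Here `shifted` is unaffected by the boundary shift, while `oversegmented` is penalized for the extra segments. This is the kind of error that frame-wise metrics tend to understate, which is the motivation the quoted statement gives for reporting segmentation metrics.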