Activity recognition from a wearable camera

Zhao, Kai; Ramos, Fábio; Faux, Steven

doi:10.1109/icarcv.2012.6485186

Cited by 26 publications

(63 citation statements)

References 15 publications

(14 reference statements)

“…Therefore, motion in FPV of an ambulatory activity is generally dominated by a global motion on which discriminant features are extracted. Existing motion-features use either raw grid optical flow [8,11] or limited directional and/or magnitude information [12][13][14]. Motion patterns of activities can vary in their magnitude, direction and frequency characteristics [14].…”

Section: Related Work (mentioning)

confidence: 99%

“…Application domains that employ wearable cameras (Fig. 1) include life-logging and video summarization [3][4][5][6][7], activity recognition [8][9][10][11][12][13][14][15][16][17][18][19][20][21], and eye-tracking and gaze detection [22][23][24][25]. Human activities can be categorized as ambulatory (e.g., walk) [8][9][10][11][12][13][14][15]; person-to-object interactions (e.g., cook) [16][17][18][19]; and person-to-person interactions (e.g., handshake) [20,21].…”

Section: Introduction (mentioning)

confidence: 99%

Robust multi-dimensional motion features for first-person vision activity recognition

Abebe, Cavallaro, Parra (2016), Computer Vision and Image Understanding

We propose robust multi-dimensional motion features for human activity recognition from first-person videos. The proposed features encode information about motion magnitude, direction and variation, and combine them with virtual inertial data generated from the video itself. The use of grid flow representation, per-frame normalization and temporal feature accumulation enhances the robustness of our new representation. Results on multiple datasets demonstrate that the proposed feature representation outperforms existing motion features, and importantly it does so independently of the classifier. Moreover, the proposed multi-dimensional motion features are general enough to make them suitable for vision tasks beyond those related to wearable cameras. © 2015 The Authors. Published by Elsevier Inc.

“…Video content-based camera motion analysis methods make use of template matching [1] and optical flow [6]. Methods derived from optical flow are widely used nowadays for human activity and action recognition from third person perspective [8,20] (where a fixed and static camera captures third person activities such that the optical flow is strongly associated with their activity) and first person perspective [30] (where camera wearer activities affect the global camera motion).…”

Section: Focused Interaction Dataset (mentioning)

confidence: 99%

“…Although audio signals provide information about social interactions, the fusion of visual and audio cues for detection of social interactions in egocentric video was rarely explored. Furthermore, the effect of integrating global camera motion analysis methods, nowadays used for human activity recognition in egocentric videos [30], with other visual and audio features for social interaction analysis still needs to be researched.…”

Section: Introduction (mentioning)

confidence: 99%

Finding Time Together: Detection and Classification of Focused Interaction in Egocentric Video

Bano, Zhang, McKenna (2017), 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)

Focused interaction occurs when co-present individuals, having mutual focus of attention, interact by establishing face-to-face engagement and direct conversation. Face-to-face engagement is often not maintained throughout the entirety of a focused interaction. In this paper, we present an online method for automatic classification of unconstrained egocentric (first-person perspective) videos into segments having no focused interaction, focused interaction when the camera wearer is stationary and focused interaction when the camera wearer is moving. We extract features from both audio and video data streams and perform temporal segmentation by using support vector machines with linear and non-linear kernels. We provide empirical evidence that fusion of visual face track scores, camera motion profile and audio voice activity scores is an effective combination for focused interaction classification.

“…These approaches include multistage recognition processes, and hence, recognition errors tend to be stacked. To avoid explicit object recognition, many studies use motion feature such as optical flow with a classifier such as LogitBoost and SVM [2,3,26,27,28,29].…”

Section: First-person Activity Recognition (mentioning)

confidence: 99%

First-person reading activity recognition by deep learning with synthetically generated images

Segawa, Kawamoto, Okamoto (2018), J Image Video Proc.

We propose a vision-based method for recognizing first-person reading activity with deep learning. For the success of deep learning, it is well known that a large amount of training data plays a vital role. Unlike image classification, there are fewer publicly available datasets for reading activity recognition, and the collection of book images might cause copyright trouble. In this paper, we develop a synthetic approach for generating positive training images. Our approach synthesizes computer-generated images and real background images. In experiments, we show that this synthesis is effective in combination with pre-trained deep convolutional neural networks and also our trained neural network outperforms other baselines.
