2015
DOI: 10.1007/978-3-319-16811-1_21

Head Motion Signatures from Egocentric Videos

Cited by 25 publications (25 citation statements)
References 13 publications
“…Table 1 clearly indicates that our approach of learning the shared embedding space for first- and third-person videos significantly outperforms the baselines. Unlike previous work relying on classic hand-crafted features like head trajectories (e.g., [19]), our method learns the optimal embedding representation from training data in an end-to-end fashion, yielding a major increase in accuracy. We also compared our Siamese and semi-Siamese architectures against the model of not sharing any layers between first-person and third-person branches (Not-Siamese in Table 1), showing that semi-Siamese yields better accuracy.…”
Section: Results
Citation type: mentioning; confidence: 99%
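The statement above contrasts three degrees of weight sharing between the first-person and third-person branches (Siamese, semi-Siamese, Not-Siamese). The following sketch illustrates that distinction only; the layer sizes, module names, and PyTorch setup are assumptions for illustration, not the cited paper's actual architecture or code.

```python
# Illustrative (hypothetical) sketch of branch sharing for first-/third-person
# embedding learning. Layer sizes and names are assumptions, not the cited work.
import torch
import torch.nn as nn

def make_encoder():
    # Toy per-view encoder operating on precomputed 512-d video descriptors.
    return nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))

class TwoViewEmbedding(nn.Module):
    """Siamese: both views reuse one encoder (all weights shared).
    Semi-Siamese: separate per-view encoders, shared projection head.
    Not-Siamese would drop the shared head as well."""
    def __init__(self, share_all=False):
        super().__init__()
        self.first_person = make_encoder()
        self.third_person = self.first_person if share_all else make_encoder()
        self.shared_head = nn.Linear(128, 64)  # shared embedding projection

    def forward(self, ego_feat, exo_feat):
        z_ego = self.shared_head(self.first_person(ego_feat))
        z_exo = self.shared_head(self.third_person(exo_feat))
        return z_ego, z_exo

# Training would pull corresponding first-/third-person pairs together in the
# shared space, e.g. with a contrastive or triplet loss on (z_ego, z_exo).
model = TwoViewEmbedding(share_all=False)  # semi-Siamese variant
z1, z2 = model(torch.randn(8, 512), torch.randn(8, 512))
```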
“…We believe this is the first paper to formulate first- and third-person video correspondence as an embedding space learning problem and to present an end-to-end learning approach. Unlike previous work [19,28] which uses hand-coded trajectory features to match videos without any embedding learning, our method is applicable in more complex environments (e.g. with arbitrarily placed first- and third-person cameras and arbitrary numbers of people).…”
Section: Related Work
Citation type: mentioning; confidence: 99%
“…To the best of our knowledge, most of the first-person video datasets comprise scenes where only a limited number of people are observed, e.g., CMU Social Interaction Dataset [27], JPL Interaction Dataset [34], HUJI EgoSeg Dataset [29]. In this work, we introduce a new dataset which we call First-Person Locomotion (FPL) Dataset.…”
Section: First-Person Locomotion Dataset
Citation type: mentioning; confidence: 99%