This paper presents a method for the detection and recognition of social interactions in a day-long first-person video of a social event, like a trip to an amusement park. The location and orientation of faces are estimated and used to compute the line of sight for each face. The context provided by all the faces in a frame is used to convert the lines of sight into locations in space to which individuals attend. Further, individuals are assigned roles based on their patterns of attention. The roles and locations of individuals are analyzed over time to detect and recognize the types of social interactions. In addition to patterns of face locations and attention, the head movements of the first-person can provide additional useful cues as to their attentional focus. We demonstrate encouraging results on detection and recognition of social interactions in first-person videos captured from multiple days of experience in amusement parks.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.