We present a fully unsupervised approach for the discovery of i) task relevant objects and ii) how these objects have been used. A Task Relevant Object (TRO) is an object, or part of an object, with which a person interacts during task performance. Given egocentric video from multiple operators, the approach can discover objects with which the users interact, both static objects such as a coffee machine and movable ones such as a cup. Importantly, we also introduce the term Mode of Interaction (MOI) to refer to the different ways in which TROs are used; a cup, say, can be lifted, washed, or poured into. When harvesting interactions with the same object from multiple operators, common MOIs can be found.

Setup and Dataset: Using a wearable camera and gaze tracker (Mobile Eye-XG from ASL), egocentric video is collected of users performing tasks, along with their gaze in pixel coordinates. Six locations were chosen: kitchen, workspace, laser printer, corridor with a locked door, cardiac gym and weight-lifting machine. The Bristol Egocentric Object Interactions Dataset is publicly available.

Discovering TROs: Given a sequence of images {I_1, ..., I_T} collected from multiple operators around a common environment, we aim to extract K TROs, where each object TRO_k is represented by the images from the sequence that feature the object of interest. We investigate using appearance, position and attention, and present results using each feature as well as a combination of the relevant features. For attention, we exploit the high quality and predictive nature of eye gaze fixations. Results compare k-means clustering to spectral clustering, and we propose estimating the optimal number of clusters using the standard Davies-Bouldin (DB) index.
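The model-selection step just described can be sketched as follows: run spectral clustering for a range of candidate K and keep the K with the lowest Davies-Bouldin index. This is a minimal illustration using scikit-learn on synthetic two-dimensional features standing in for the appearance-and-position features; the data, feature dimensionality and parameter values are illustrative, not the paper's.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import davies_bouldin_score

rng = np.random.default_rng(0)
# Synthetic stand-in for per-window appearance + position features:
# three well-separated blobs, i.e. three "objects".
features = np.vstack(
    [rng.normal(c, 0.3, size=(40, 2)) for c in ((0, 0), (4, 0), (2, 4))]
)

def estimate_k(features, k_range=range(2, 8)):
    """Cluster for each candidate K with spectral clustering and return
    the K minimising the Davies-Bouldin index (lower = better separated)."""
    best_k, best_db, best_labels = None, np.inf, None
    for k in k_range:
        labels = SpectralClustering(n_clusters=k, random_state=0).fit_predict(features)
        db = davies_bouldin_score(features, labels)
        if db < best_db:
            best_k, best_db, best_labels = k, db, labels
    return best_k, best_labels

k, labels = estimate_k(features)
```

On this toy data the DB index is minimised at the true number of blobs; in the paper the same criterion is applied to the combined feature space to choose the number of TROs.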
Figure 2 shows the best performance for discovering TROs, obtained by combining position (relative to a map of the scene) and appearance (HOG features within a BoW representation) over a sliding window w = 25, using gaze fixations for attention, spectral clustering, and the Davies-Bouldin (DB) index to estimate the number of clusters.

Finding MOIs: Given consecutive images (I_t, I_{t+1},
We present a method for the learning and detection of multiple rigid texture-less 3D objects, intended to operate at frame-rate speeds on video input. The method is geared for fast and scalable learning and detection by combining tractable extraction of edgelet constellations with library lookup based on rotation- and scale-invariant descriptors. The approach learns object views in real time and is generative, enabling more objects to be learnt without the need for re-training. During testing, a random sample of edgelet constellations is tested for the presence of known objects. We evaluate single- and multi-object detection on a 30-object dataset, showing detection of any of the objects within milliseconds of the object becoming visible. The results show the scalability of the approach and its frame-rate performance.
We describe a particle filtering method for vision-based tracking of a hand-held calibrated camera in real time. The ability of the particle filter to deal with non-linearities and non-Gaussian statistics suggests the potential to provide improved robustness over existing approaches, such as those based on the Kalman filter. In our approach, the particle filter provides recursive approximations to the posterior density of the 3-D motion parameters. The measurements are inlier/outlier counts of likely correspondence matches for a set of salient points in the scene. The algorithm is simple to implement, and we present results illustrating good tracking performance using a 'live' camera. We also demonstrate the potential robustness of the method, including the ability to recover from loss of track and to deal with severe occlusion.
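The weight-by-inlier-count idea can be illustrated with a bootstrap particle filter on a toy one-dimensional state standing in for the 3-D motion parameters. The measurement model below (a fixed threshold on predicted-vs-measured point positions) is a stand-in for the paper's correspondence matching, and all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

true_x = 2.0                                    # hidden camera-motion parameter
scene = rng.uniform(-5.0, 5.0, size=50)         # salient scene points

def inlier_count(x):
    """Count correspondences that hypothesis x explains to within a
    pixel-like threshold -- the inlier/outlier measurement."""
    predicted = scene + x
    measured = scene + true_x + rng.normal(0.0, 0.1, size=scene.size)
    return np.sum(np.abs(predicted - measured) < 0.3)

n = 500
particles = rng.uniform(-5.0, 5.0, size=n)      # diffuse prior over the state
for _ in range(20):
    particles += rng.normal(0.0, 0.2, size=n)   # motion model: random walk
    weights = np.array([inlier_count(p) for p in particles], dtype=float) + 1e-6
    weights /= weights.sum()
    # Resample: particles explaining more correspondences survive.
    particles = rng.choice(particles, size=n, p=weights)

estimate = particles.mean()
```

Because the diffuse prior keeps a few particles far from the current mode, and the inlier-count likelihood is non-Gaussian and multi-peaked, the same recursion that tracks the state can also re-acquire it after loss of track, which is the robustness property noted above.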