We show how to outsource data annotation to Amazon Mechanical Turk. Doing so yields large numbers of annotations at relatively low cost. The quality is good, and it can be checked and controlled; annotations are also produced quickly. We describe results for several different annotation problems, along with strategies for determining when a task is well specified and properly priced.
Abstract. This paper proposes a metric-learning-based approach for human activity recognition with two main objectives: (1) reject unfamiliar activities and (2) learn from few examples. We show that our approach outperforms state-of-the-art methods on numerous standard datasets for the traditional action classification problem. Furthermore, we demonstrate that our method not only labels activities accurately but also rejects unseen activities and learns from few examples with high accuracy. Finally, we show that our approach works well on noisy YouTube videos.
Abstract. For successful deployment, personal robots must adapt to ever-changing indoor environments. While dealing with novel objects is a largely unsolved challenge in AI, it is easy for people. In this paper we present a framework for robot supervision through Amazon Mechanical Turk. Unlike traditional models of teleoperation, people provide semantic information about the world and subjective judgments. The robot then autonomously uses this additional information to enhance its capabilities. The information can be collected on demand, in large volumes, and at low cost. We demonstrate our approach on the task of grasping unknown objects.
The technology available to building designers now makes it possible to monitor buildings on a very large scale. Video cameras and motion sensors are commonplace in practically every office space, and are slowly making their way into living spaces. The application of such technologies, in particular video cameras, while improving security, also violates privacy. On the other hand, motion sensors, while being privacy-conscious, typically do not provide enough information for a human operator to maintain the same degree of awareness about the space that can be achieved by using video cameras. We propose a novel approach in which we use a large number of simple motion sensors and a small set of video cameras to monitor a large office space. In our system we deployed 215 motion sensors and six video cameras to monitor the 3,000-square-meter office space occupied by 80 people for a period of about one year. The main problem in operating such systems is finding a way to present this highly multidimensional data, which includes both spatial and temporal components, to a human operator to allow browsing and searching recorded data in an efficient and intuitive way. In this paper we present our experiences and the solutions that we have developed in the course of our work on the system. We consider this work to be the first step in helping designers and managers of building systems gain access to information about occupants' behavior in the context of an entire building in a way that is only minimally intrusive to the occupants' privacy.
In traditional surveillance systems, tracking of objects is achieved by means of image and video processing. The disadvantage of such systems is that any object to be tracked must be observed by a video camera. However, the geometry of indoor spaces typically requires a large number of video cameras to provide the coverage necessary for robust operation of video-based tracking algorithms, and a larger number of video streams increases the computational burden of obtaining robust tracking results. In this paper we present an approach to tracking in mixed-modality systems with a variety of sensors. The system described here includes over 200 motion sensors as well as six cameras. We track individuals in the entire space and across cameras using contextual information available from the motion sensors. The motion sensors allow us to almost instantaneously find plausible tracks in a very large volume of data, spanning months, which would be virtually impossible for traditional video-search approaches. We describe a method that allows us to determine when the tracking system is unreliable and to present the data to a human operator for disambiguation.