We describe a method for using crowd-sourced labor to track motion and, ultimately, annotate human gestures in video. Our chosen deployment platform, Amazon Mechanical Turk, divides labor into HITs (Human Intelligence Tasks). Given the informational density of video, our task is potentially larger than a traditional HIT, which involves processing a block of text or a single image. We exploit redundancies in video data so that workers' efforts are effectively multiplied: only a fraction of frames needs to be annotated by hand, yet we still achieve complete coverage of all video frames. This is accomplished by combining HITs built on a novel user interface with automatic techniques such as template tracking and affinity propagation clustering. In a case study, we show how to annotate a video database of political speeches with 2D positions and 3D hand pose configurations; this data is then used for some preliminary analytical tasks.
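The core idea, clustering frames so that only exemplars need hand annotation, can be sketched with off-the-shelf affinity propagation. This is an illustrative stand-in, not the paper's implementation: the synthetic `make_blobs` features substitute for real per-frame descriptors, and all variable names here are hypothetical.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Stand-in for per-frame feature vectors (a real pipeline might use
# downsampled pixels or pose descriptors): 300 "frames", 5 natural groups.
frame_features, _ = make_blobs(n_samples=300, centers=5, random_state=0)

# Affinity propagation picks exemplar frames automatically; only these
# exemplars would be sent to workers for hand annotation.
ap = AffinityPropagation(random_state=0).fit(frame_features)
exemplar_indices = ap.cluster_centers_indices_  # frames to annotate by hand
labels = ap.labels_                             # each frame -> its exemplar

print(len(exemplar_indices), "exemplars cover", len(frame_features), "frames")
```

Annotations made on the exemplars can then be propagated to every frame in the same cluster, which is how a fraction of hand-labeled frames can yield full coverage.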
This paper describes the design and preliminary development of a game with a purpose that aims to build a corpus of useful and original videos of human motion. This content is intended for use in machine learning and computer vision applications. The game, Motion Chain, encourages users to respond to text and video prompts by recording videos with a web camera. It seeks to entertain not through an explicit achievement or point system but through the fun of performance and the discovery inherent in observing other players. This paper describes two specific forms of the game, Chains and Charades, and proposes future possibilities. It covers the phases of game design as well as implementation details, then discusses an approach for evaluating the game's effectiveness.
This paper demonstrates how 3D skeletal reconstruction can be performed by applying a pose-sensitive embedding technique to multi-view video recordings. We apply our approach to challenging low-resolution video sequences. Skeletal reconstruction usually requires many calibrated high-resolution cameras; with such low-resolution imagery, typically only blob detection is feasible. We show that with this embedding technique (a metric-learning method built on a deep convolutional architecture), we achieve accurate 3D skeletal reconstruction on low-resolution outdoor scenes with many challenges.
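The metric-learning idea behind a pose-sensitive embedding can be illustrated with a minimal triplet loss: pull same-pose pairs together in embedding space and push different-pose pairs apart. This is a hedged sketch only; the paper's actual model is a deep convolutional encoder, for which a random linear map stands in here, and `embed`, `triplet_loss`, and the dimensions are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 32))  # stand-in "encoder" weights (not learned)

def embed(x):
    # Hypothetical embedding function: image features -> embedding vector.
    return W @ x

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet objective: the anchor should sit closer to the
    # positive (same pose) than to the negative (different pose), by at
    # least `margin`; otherwise the hinge term is positive.
    d_pos = np.linalg.norm(embed(anchor) - embed(positive))
    d_neg = np.linalg.norm(embed(anchor) - embed(negative))
    return max(0.0, d_pos - d_neg + margin)

a, p, n = rng.normal(size=(3, 32))
print("loss:", triplet_loss(a, p, n))
```

Training a convolutional encoder to minimize such a loss over many pose triplets is what makes the resulting embedding "pose-sensitive": nearby points in embedding space correspond to similar body configurations, even at low resolution.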