M. Harville scite author profile

We present an approach to real-time person tracking in crowded and/or unknown environments using multi-modal integration. We combine stereo, color, and face detection modules into a single robust system, and show an initial application in an interactive, face-responsive display. Dense, real-time stereo processing is used to isolate users from other objects and people in the background. Skin-hue classification identifies and tracks likely body parts within the silhouette of a user. Face pattern detection discriminates and localizes the face within the identified body parts. Faces and bodies of users are tracked over several temporal scales: short-term (user stays within the field of view), medium-term (user exits/reenters within minutes), and long term (user returns after hours or days). Short-term tracking is performed using simple region position and size correspondences, while medium and long-term tracking are based on statistics of user appearance. We discuss the failure modes of each individual module, describe our integration method, and report results with the complete system in trials with thousands of users.

show abstract

A Framework for High-Level Feedback to Adaptive, Per-Pixel, Mixture-of-Gaussian Background Models

Harville

2002

155

131

View full text Add to dashboard Cite

Abstract. Copyright 2002 Springer-Verlag. Published in the 7th European Conference on Computer Vision (ECCV-2002), May 28-31, 2002, Copenhagen, Denmark. Personal Time-Adaptive, Per-Pixel Mixtures Of Gaussians (TAPPMOGs) have recently become a popular choice for robust modeling and removal of complex and changing backgrounds at the pixel level. However, TAPPMOG-based methods cannot easily be made to model dynamic backgrounds with highly complex appearance, or to adapt promptly to sudden "uninteresting" scene changes such as the re-positioning of a static object or the turning on of a light, without further undermining their ability to segment foreground objects, such as people, where they occlude the background for too long. To alleviate tradeoffs such as these, and, more broadly, to allow TAPPMOG segmentation results to be tailored to the specific needs of an application, we introduce a general framework for guiding pixel-level TAPPMOG evolution with feedback from "high-level" modules. Each such module can use pixel-wise maps of positive and negative feedback to attempt to impress upon the TAPPMOG some definition of foreground that is best expressed through "higher-level" primitives such as image region properties or semantics of objects and events. By pooling the foreground error corrections of many high-level modules into a shared, pixel-level TAPPMOG model in this way, we improve the quality of the foreground segmentation and the performance of all modules that make use of it. We show an example of using this framework with a TAPPMOG method and high-level modules that all rely on dense depth data from a stereo camera.

show abstract

Stereo person tracking with adaptive plan-view templates of height and occupancy statistics

Harville

2004

Image and Vision Computing

118

View full text Add to dashboard Cite

Foreground segmentation using adaptive mixture models in color and depth

View full text Add to dashboard Cite

Segmentation of novel or dynamic objects in a scene, often referred to as "background subtraction" or "joreground segmentation", is a critical early in step in most computer vision applications in domains such as surveillance and human-computer interaction. All previously described, real-time methods fail to handle properly one or more common phenomena, such as global illumination changes, shadows, inter-rejections, similarity of foreground color to background, and non-static backgrounds (e.g. active video displays or trees waving in the wind). The recent advent of hardware and software for real-time computation of depth imagery makes better approaches possible. We propose a method for modeling the background that uses per-pixel, time-adaptive, Gaussian mixtures in the combined input space of depth and luminance-invariant colol: This combination in itself is novel, but we further improve it by introducing the ideas of I) modulating the background model learning rate based on scene activity, and 2 ) making colorbased segmentation criteria dependent on depth observations. Our experiments show that the method possesses much greater robustness to problematic phenomena than the prior state-of-the-art, without sacrijicing real-time performance, making it well-suited for a wide range of practical applications in video event detection and recognition.

show abstract

Background estimation and removal based on range and color

Gordon¹,

Darrell²,

Harville³

et al.

114

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

M. Harville

Integrated person tracking using stereo, color, and pattern detection

A Framework for High-Level Feedback to Adaptive, Per-Pixel, Mixture-of-Gaussian Background Models

Stereo person tracking with adaptive plan-view templates of height and occupancy statistics

Foreground segmentation using adaptive mixture models in color and depth

Background estimation and removal based on range and color

Contact Info

Product

Resources

About