Abstract. We propose a method to find candidate 2D articulated model configurations by searching for locally optimal configurations under a weak but computationally manageable fitness function. This is accomplished by first parameterizing a tree structure by its joints. Candidate configurations can then be assembled efficiently and exhaustively in a bottom-up manner. Working from the leaves of the tree to its root, we maintain a list of locally optimal, yet sufficiently distinct, candidate configurations for the body pose. We then adapt this algorithm for use on a sequence of images by considering configurations that are either near their position in the previous frame or overlap areas of interest in subsequent frames. This way, the number of partial configurations generated and evaluated is significantly reduced, while both smooth and abrupt motions can be accommodated. This approach is validated on test and standard datasets.
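The bottom-up assembly described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `assemble`, `keep_distinct_best`, `config_distance`, the `part_score` callable, and the parameters `k` and `min_dist` are all assumptions introduced for illustration, and the fitness function is left to the caller. Each joint has a list of candidate 2D positions; partial configurations are built from the leaves toward the root, and at each internal joint only the best-scoring, mutually distinct candidates are kept.

```python
import math
from itertools import product


def config_distance(a, b):
    # Largest displacement of any joint between two configurations
    # of the same subtree (both dicts hold the same joint names).
    return max(math.dist(a[j], b[j]) for j in a)


def keep_distinct_best(candidates, k, min_dist):
    # Greedily keep up to k highest-scoring candidates that are at
    # least min_dist away from every candidate already kept.
    kept = []
    for score, config in sorted(candidates, key=lambda c: c[0], reverse=True):
        if all(config_distance(config, other) >= min_dist for _, other in kept):
            kept.append((score, config))
            if len(kept) == k:
                break
    return kept


def assemble(node, children, positions, part_score, k=5, min_dist=1.0):
    # Returns a list of (score, {joint: (x, y)}) for the subtree rooted at node.
    # part_score(parent_pos, child_pos) is a hypothetical limb fitness function.
    kids = children.get(node, [])
    if not kids:
        # Leaves: every candidate position is a partial configuration.
        return [(0.0, {node: p}) for p in positions[node]]
    child_lists = [
        assemble(c, children, positions, part_score, k, min_dist) for c in kids
    ]
    candidates = []
    for p in positions[node]:
        # Exhaustively combine the kept candidates of each child subtree.
        for combo in product(*child_lists):
            score = 0.0
            config = {node: p}
            for child, (child_score, child_config) in zip(kids, combo):
                score += child_score + part_score(p, child_config[child])
                config.update(child_config)
            candidates.append((score, config))
    return keep_distinct_best(candidates, k, min_dist)
```

Pruning at every internal joint keeps the number of partial configurations bounded by `k` per subtree, at the cost of possibly discarding a configuration that would have scored well higher up the tree.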
We describe an efficient and robust system to detect and track the limbs of a human. Real-time performance and robustness are of special consideration in the design of this system. We thus use a detection/tracking scheme in which we detect the face and limbs of a user and then track the forearms of the found limbs. Detection begins by finding the face of a user. The location and color information from the face can then be used to find limbs. As skin color is a key visual feature in this system, we continuously search for faces and use them to update skin color information. Along with edge information, this is used in the subsequent forearm tracking. Robustness is implicit in this design, as the system automatically re-detects a limb when its corresponding forearm is lost. This design is also conducive to real-time processing because, while detection of the limbs can take seconds, tracking takes on the order of milliseconds. Thus reasonable frame rates can be achieved with a short latency. In addition, the system uses multiple 2D limb tracking models to enhance tracking of the underlying 3D structure. These include models for lateral forearm views (waving) as well as for pointing gestures. Experiments on test sequences demonstrate the efficacy of this approach.
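The detection/tracking scheme with automatic re-detection can be pictured as a small control loop. The sketch below is only an illustration under stated assumptions: `detect` and `track` are hypothetical callables supplied by the caller (they stand in for the face/limb detector and the forearm tracker), and `track` is assumed to return `None` when the forearm is lost.

```python
def run_detect_track(frames, detect, track):
    # detect(frame) -> state or None        (slow path, hypothetical)
    # track(frame, state) -> state or None  (fast path, None means "lost")
    state = None
    log = []
    for frame in frames:
        if state is None:
            # Nothing is being tracked: fall back to full detection.
            mode = "detect"
            state = detect(frame)
        else:
            # A limb is known: use the cheap tracker on this frame.
            mode = "track"
            state = track(frame, state)
        log.append((mode, state))
    return log
```

Because the expensive detector runs only when no limb is currently tracked, most frames take the fast path, and a lost forearm triggers re-detection on the following frame.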