We present an approach for image retrieval using a very large number of highly selective features and efficient online learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that
Abstract. In this paper, we describe an unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene. First, we introduce two novel similarity measures for comparing trajectories in far-field visual surveillance. The measures simultaneously compare the spatial distribution of trajectories and other attributes, such as velocity and object size, along the trajectories. They also provide a comparison confidence measure which indicates how well the measured image-based similarity approximates true physical similarity. We also introduce novel clustering algorithms which use both similarity and comparison confidence. Based on the proposed similarity measures and clustering methods, a framework to learn semantic scene models by trajectory analysis is developed. Trajectories are first clustered into vehicles and pedestrians, and then further grouped based on spatial and velocity distributions. Different trajectory clusters represent different activities. The geometric and statistical models of structures in the scene, such as roads, walk paths, sources and sinks, are automatically learned from the trajectory clusters. Abnormal activities are detected using the semantic scene models. The system is robust to low-level tracking errors.
An ideal approach to the problem of pose-invariant face recognition would handle continuous pose variations, would not be database specific, and would achieve high accuracy without any manual intervention. Most of the existing approaches fail to match one or more of these goals. In this paper, we present a fully automatic system for pose-invariant face recognition that not only meets these requirements but also outperforms other comparable methods. We propose a 3D pose normalization method that is completely automatic and leverages the accurate 2D facial feature points found by the system. The current system can handle 3D pose variation up to ±45° in yaw and ±30° in pitch angles. Recognition experiments were conducted on the USF 3D, Multi-PIE, CMU-PIE, FERET, and FacePix databases. Our system not only shows excellent generalization by achieving high accuracy on all 5 databases but also outperforms other methods convincingly.
International Conference on Computer Vision (ICCV).