We propose a Vision-based Robust Calibration (ViRC) method for OSTHMDs equipped with a camera. In the ViRC method, calibration parameters are decomposed into off-line parameters, which remain constant because they relate to the positional relationship between the camera and the virtual screen, and on-line parameters, which relate to the user's eye. Calculating the off-line parameters beforehand reduces the number of unknown parameters in the on-line phase, which makes the calibration robust to the user's misalignments. In the off-line phase, the approximate position of the user's eye is calculated using the PnP algorithm. In the on-line phase, the actual position of the user's eye is estimated from the approximate one by non-linear minimization. In our experiments, we show that the ViRC method can decrease reprojection error by as much as 83% compared with the conventional method based on the DLT algorithm.
Augmented reality using optical see-through head-mounted displays (OSTHMDs) provides the user with a more realistic experience than augmented reality using smartphones or tablet devices. The positional relationship between the user's eye and a virtual screen must be calibrated using input from the user. However, conventional calibration methods are highly sensitive to input errors. In this paper, we propose a vision-based robust calibration (ViRC) method using a fiducial marker, which can be used for any OSTHMD equipped with a camera. The ViRC method decomposes the 11-DoF calibration parameters into device-dependent parameters and user-dependent parameters. Once the device-dependent parameters are calculated, the user only has to perform a calibration phase for estimating the 4-DoF user-dependent parameters. Experiments show that the ViRC method can decrease reprojection error by 83% compared with the conventional method. Consequently, users can observe correctly aligned superimpositions of computer graphics with little distortion.
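As a rough illustration of the conventional baseline and the error metric mentioned above, the following sketch estimates a 3×4 projection matrix (11 degrees of freedom, since it is defined only up to scale) with a plain DLT and measures the mean reprojection error. It is not the ViRC method itself. The camera intrinsics, the pose, the point cloud, and the 2-pixel noise level are all made-up values, and the function names `project`, `dlt`, and `reprojection_error` are hypothetical.

```python
import numpy as np

def project(P, X):
    """Project N x 3 world points with a 3 x 4 matrix P; returns N x 2 image points."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

def dlt(X, x):
    """Estimate a 3 x 4 projection matrix from >= 6 non-coplanar 3D-2D pairs."""
    rows = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        Xh = [Xw, Yw, Zw, 1.0]
        # u = (p1 . Xh) / (p3 . Xh)  ->  p1 . Xh - u * (p3 . Xh) = 0
        rows.append([*Xh, 0, 0, 0, 0, *[-u * c for c in Xh]])
        # v = (p2 . Xh) / (p3 . Xh)  ->  p2 . Xh - v * (p3 . Xh) = 0
        rows.append([0, 0, 0, 0, *Xh, *[-v * c for c in Xh]])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

def reprojection_error(P, X, x):
    """Mean Euclidean distance between projected and reference image points."""
    return float(np.mean(np.linalg.norm(project(P, X) - x, axis=1)))

# Synthetic setup (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.1], [-0.05], [0.2]])])
P_true = K @ Rt

X = rng.uniform([-1, -1, 2], [1, 1, 5], size=(12, 3))  # points in front of the camera
x_clean = project(P_true, X)
x_noisy = x_clean + rng.normal(scale=2.0, size=x_clean.shape)  # simulated input error

err_clean = reprojection_error(dlt(X, x_clean), X, x_clean)
err_noisy = reprojection_error(dlt(X, x_noisy), X, x_clean)
print(f"clean input: {err_clean:.2e} px, noisy input: {err_noisy:.2f} px")
```

Because the DLT solves for all 11 parameters from the user's input at once, any alignment noise propagates directly into the estimate. Fixing the device-dependent part beforehand, as the abstract describes, leaves fewer unknowns to be disturbed.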
Abstract. Human detection technologies are very useful tools to understand human activity for various purposes, such as surveillance. Recently, tracking-by-detection methods have also become popular for analyzing human activity, but their performance is greatly affected by the accuracy of detected human areas because they use online learning based on the detected results. In order to improve the performance of such tracking methods, the inclination of human bodies in the image is considered as a way to refine the detected human bounding boxes. Based on background subtraction and a novel scheme of estimating human foot position, a refinement scheme is proposed to estimate a bounding box more accurately, which can better fit the contours of inclined human bodies than the conventional method. Experimental results, evaluated against the ground truth, show that the bounding boxes refined by the proposed algorithm achieved a higher cover rate of 92.7% and a smaller mean angle error of 0.7°, compared with a cover rate of 83.7% and a mean angle error of 3.8° for the conventional method. A real-time detection speed of 32.3 fps on a 640 × 480 video was also achieved. Thus, tracking performance is significantly enhanced by refining the human areas, with a mean improvement of 42.4% in the F-measure when compared with the conventional method.
... [3], and the CENTRIST feature [4]. The effectiveness of these methods has been proven in practice for the detection of upright complete humans. With the development of human detection technologies, an approach called tracking-by-detection [5] has become popular recently. This approach treats the tracking problem as a detection task applied over time.
Such a method learns classifiers for tracking online using detected human bounding boxes (b-boxes) instead of using offline labeled data for training, and thus the quality of the classifiers is greatly affected by the accuracy of the detected human areas, which contributes to the final tracking performance. Although most detection methods can provide a high detection rate, accurate depiction of human postures and regions still cannot be achieved, i.e., all existing methods can only detect approximate human locations denoted by upright b-boxes, and cannot deal with the contour of an inclined human body very well. In order to improve the accuracy of the detected human areas, in this paper we propose a refinement algorithm for the detected human b-box to fit the contour of the inclined human body based on background subtraction, human detection, and a novel scheme of estimating human head and foot position using a predefined human height.
The rest of this paper is organized as follows. Section 2 briefly introduces related work. Section 3 describes the details of the proposed approach. Section 4 presents the experimental results and discussion, and Section 5 concludes the paper. In addition, real-time detection has attracted more and more attention,...
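For readers unfamiliar with the F-measure quoted in the abstract above, a minimal sketch of the standard definition (the harmonic mean of precision and recall) follows. The excerpt does not say how the paper counts a tracked box as correct, so the true positive, false positive, and false negative counts here are taken as given. The function name `f_measure` and the sample counts are hypothetical.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts.

    tp: correctly tracked boxes, fp: spurious boxes, fn: missed targets.
    How a box is judged "correct" is an assumption left to the caller.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Made-up counts, for illustration only.
score = f_measure(tp=8, fp=2, fn=2)
print(f"F-measure: {score:.3f}")
```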