Abstract-Tracking is an important research subject in real-time augmented reality. The main requirements for trackers are high accuracy and low latency at a reasonable cost. To address these issues, a real-time, robust, and efficient 3D model-based tracking algorithm is proposed for a "video see-through" monocular vision system. Tracking the objects in the scene amounts to computing the pose between the camera and the objects; virtual objects can then be projected into the scene using this pose. Here, nonlinear pose estimation is formulated by means of a virtual visual servoing approach. In this context, the derivation of point-to-curve interaction matrices is given for different 3D geometrical primitives including straight lines, circles, cylinders, and spheres. A local moving-edges tracker is used to provide real-time tracking of points along the normals to the object contours. Robustness is obtained by integrating an M-estimator into the visual control law via an iteratively re-weighted least squares implementation. This approach is then extended to address the 3D model-free augmented reality problem. The method presented in this paper has been validated on several complex image sequences, including outdoor environments. Results show the method to be robust to occlusion, changes in illumination, and mistracking.
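The moving-edges step can be illustrated with a minimal sketch: for a sample point on the projected model contour, search along the contour normal for the location of the strongest intensity gradient. This is a simplified illustration, not the actual tracker; the function name and the plain finite-difference edge score are assumptions made for the example.

```python
import numpy as np

def search_along_normal(image, point, normal, half_range=5):
    """1D search for the strongest intensity gradient along the
    contour normal -- a simplified sketch of one moving-edges step.
    `image` is a 2D grayscale array; `point` is (x, y); `normal`
    is a 2D direction vector (normalized internally)."""
    normal = np.asarray(normal, dtype=float)
    normal /= np.linalg.norm(normal)
    best_offset, best_score = 0, -np.inf
    for k in range(-half_range, half_range + 1):
        p = np.asarray(point, dtype=float) + k * normal
        q = p + normal  # one step further along the normal
        y0, x0 = int(round(p[1])), int(round(p[0]))
        y1, x1 = int(round(q[1])), int(round(q[0]))
        if not (0 <= y0 < image.shape[0] and 0 <= x0 < image.shape[1]
                and 0 <= y1 < image.shape[0] and 0 <= x1 < image.shape[1]):
            continue  # skip samples that fall outside the image
        # finite-difference gradient magnitude along the normal
        score = abs(float(image[y1, x1]) - float(image[y0, x0]))
        if score > best_score:
            best_offset, best_score = k, score
    return best_offset  # signed displacement of the edge along the normal
```

Running this for every sample point of the projected contour yields the normal displacements that feed the pose-estimation stage.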
This paper proposes a real-time, robust, and effective tracking framework for visual servoing applications. The algorithm is based on the fusion of visual cues and on the estimation of a transformation (either a homography or a 3D pose). The parameters of this transformation are estimated using a nonlinear minimization of a unique criterion that integrates information on both the texture and the edges of the tracked object. The proposed tracker is more robust and performs well in conditions where methods based on a single cue fail. The framework has been tested for 2D object motion estimation and pose computation. The method presented in this paper has been validated on several video sequences as well as in visual servoing experiments considering various objects. Results show the method to be robust to occlusions and textured backgrounds and suitable for visual servoing applications.
Muriel Pressigout, E. Marchand. Abstract-This paper proposes a real-time, robust, and efficient 3D model-based tracking algorithm. A nonlinear minimization approach is used to register 2D and 3D cues for monocular 3D tracking. Integrating texture information into a classical nonlinear edge-based pose computation greatly increases the reliability of a conventional edge-based 3D tracker. Robustness is enforced by integrating an M-estimator into the minimization process via an iteratively re-weighted least squares implementation. The method presented in this paper has been validated on several video sequences as well as in visual servoing experiments considering various objects. Results show the method to be robust to large motions and textured environments.
Abstract-A fundamental step towards broadening the use of real-world image-based visual servoing is to deal with the important issues of reliability and robustness. To address this issue, a closed-loop control law is proposed that simultaneously accomplishes a visual servoing task and is robust to a general class of external errors. This generality allows concurrent consideration of a wide range of errors, including noise from image feature extraction, small-scale errors in tracking, and even large-scale errors in the matching between current and desired features. This is achieved through the widely accepted statistical techniques of robust M-estimation. The M-estimator is integrated via an iteratively re-weighted method. The Median Absolute Deviation is used as an estimate of the standard deviation of the inlier data and is compared with other methods. This combination is advantageous because of its high efficiency, high breakdown point, and desirable influence function. The robustness and stability of the control law are shown to depend on a subsequent measure of position uncertainty. Furthermore, the convergence criteria of the control law are investigated. Experimental results are presented that demonstrate visual servoing tasks resisting severe outlier contamination.
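The robust estimation scheme described above can be sketched as follows: compute residuals, estimate the inlier scale with the Median Absolute Deviation, derive M-estimator weights, and re-solve a weighted least-squares problem until convergence. The Huber weight function is used here as one common choice; the specific influence function in this sketch is an assumption, not necessarily the one used in the paper.

```python
import numpy as np

def mad_scale(residuals):
    """Robust scale estimate: 1.4826 * median absolute deviation
    (the factor 1.4826 makes it consistent with the standard
    deviation for Gaussian-distributed inliers)."""
    med = np.median(residuals)
    return 1.4826 * np.median(np.abs(residuals - med))

def irls(A, b, n_iter=30, k=1.345):
    """Iteratively re-weighted least squares with Huber weights:
    a sketch of robust M-estimation, solving A x ~ b while
    down-weighting outlying residuals."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(n_iter):
        r = b - A @ x
        s = mad_scale(r)
        if s == 0:  # inliers fit exactly; nothing left to re-weight
            break
        u = np.abs(r) / s
        w = np.where(u <= k, 1.0, k / u)  # Huber weight function
        sw = np.sqrt(w)
        x = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]
    return x
```

In the visual servoing context, the same weights enter the control law in place of the plain least-squares solution, which is what makes the loop resist outlier contamination.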
The efficient compression of multi-view-video-plus-depth (MVD) data raises the issue of bit-rate allocation between texture and depth data. This question has not been settled yet because existing studies do not rely on a shared framework. This paper studies the impact of bit-rate allocation between texture and depth data based on the quality of an intermediate synthesized view. The results show that, depending on the acquisition configuration, the synthesized views require a different ratio between the depth and texture bit-rates: between 40% and 60% of the total bit-rate should be allocated to depth.

Index Terms-3DTV, MVD, 3D video, MVC, quality assessment.

Virtual views need to be generated in the contexts of 3DTV (for rendering on autostereoscopic displays) or of FVV (for rendering viewpoints different from those captured by the cameras). The use of this objective metric is justified by its simplicity and mathematical tractability for such purposes, although previous studies [3] highlighted the need for new metrics for 3D video. The appropriate rate ratio is not clearly stated in the literature, as most studies do not rely on the same framework. [4] indicates that, being a gray-scale signal, the depth video can be compressed more efficiently than the texture video, using less than 20% of the texture bit-rate, in the context of the Advanced Three-Dimensional Television System Technologies (ATTEST) project and for the video-plus-depth data format. In [5], the authors proposed an efficient joint texture/depth rate allocation method based on a view-synthesis distortion model for the compression of MVD data. Given the bandwidth constraints, the method delivers the depth/texture quantization parameter combination that maximizes the rendering quality of a synthesized view in terms of MSE. Our experiments quantify the appropriate rate ratio between depth and texture data and analyze its relationship with the encoded sequence.
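The kind of allocation experiment described above can be sketched as an exhaustive sweep over candidate depth/texture splits of a fixed total bit-rate, keeping the split whose synthesized view has the lowest distortion. The `render_mse` callback here is hypothetical: in the real experiments it stands for encoding both streams at the given rates, synthesizing the intermediate view, and measuring its MSE against the reference.

```python
def best_depth_ratio(total_rate, candidate_ratios, render_mse):
    """Exhaustive sweep over depth/texture bit-rate splits.
    `render_mse(texture_rate, depth_rate)` is a hypothetical callback:
    encode both streams at the given rates, synthesize the intermediate
    view, and return its MSE against the reference view."""
    return min(candidate_ratios,
               key=lambda rho: render_mse(total_rate * (1.0 - rho),
                                          total_rate * rho))
```

With a toy distortion model in which both streams benefit from more rate but depth errors cost more, the sweep selects a depth share in the range the paper reports.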
Section 2 is devoted to virtual view synthesis. Section 3 states the experimental protocol used to assess the influence of the texture/depth compression ratio, while Section 4 discusses the results. Finally, Section 5 concludes the paper.

VIRTUAL VIEW SYNTHESIS

For 3DTV or FVV, the transmitted texture and depth sequences are used to generate virtual views with the help of depth-image-based rendering techniques. The generated views can then be rendered on a conventional, stereoscopic, or autostereoscopic display. Generating a "virtual" view consists in synthesizing a novel view of the scene, from a viewpoint which differs from those captured by the cameras, relying on the available texture and depth data. The texture, that is, the conventional 2D color sequences, gives the color information. The depth data are gray-scale images and are considered a monochromatic signal. Each pixel of a depth image, also called a depth map, indicates the distance of the corresponding 3D point from the camera.
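A minimal sketch of the depth-image-based rendering idea, under strong simplifying assumptions that are not taken from the paper: rectified cameras with a purely horizontal baseline, the usual 8-bit inverse-depth quantization between a near and a far plane, and no handling of z-ordering or hole filling.

```python
import numpy as np

def warp_to_virtual_view(texture, depth, focal, baseline, z_near, z_far):
    """Sketch of depth-image-based rendering for rectified cameras:
    every pixel shifts horizontally by its disparity
    d = focal * baseline / Z, where Z is recovered from the 8-bit
    depth map via inverse-depth quantization between z_near and z_far.
    Disocclusions stay as 0-valued holes and z-ordering conflicts are
    ignored (both are simplifications)."""
    h, w = depth.shape
    out = np.zeros_like(texture)
    # inverse depth 1/Z, linear in the 8-bit depth value
    inv_z = depth / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    disparity = np.round(focal * baseline * inv_z).astype(int)
    for y in range(h):
        for x in range(w):
            xv = x - disparity[y, x]
            if 0 <= xv < w:
                out[y, xv] = texture[y, x]
    return out
```

Real synthesizers add z-buffering, sub-pixel warping, and inpainting of the disocclusion holes, which is precisely where the new artifact types come from.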
This paper addresses the problem of evaluating virtual-view synthesized images in the multi-view video context. Indeed, view synthesis introduces new types of distortion. The question concerns the ability of the traditionally used objective metrics to assess the quality of synthesized views, considering these new types of artifacts. The experiments conducted to determine their reliability consist in assessing seven different view synthesis algorithms. Subjective and objective measurements have been performed. Results show that the most commonly used objective metrics can be far from human judgment, depending on the artifact involved.
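PSNR is a typical example of such a commonly used objective metric; the sketch below shows the generic computation, not the exact evaluation protocol of the study.

```python
import numpy as np

def psnr(reference, synthesized, peak=255.0):
    """Peak signal-to-noise ratio between a reference view and a
    synthesized view, in dB. Higher is better; identical images
    give infinity."""
    mse = np.mean((reference.astype(float) - synthesized.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because PSNR penalizes all pixel differences equally, it cannot distinguish a visually benign global shift from a salient synthesis artifact along a depth discontinuity, which is one reason such metrics can diverge from human judgment.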