This paper describes a new algorithm for recovering the 3D shape and motion of deformable and articulated objects purely from uncalibrated 2D image measurements using an iterative factorization approach. Most solutions to nonrigid and articulated structure from motion require metric constraints to be enforced on the motion matrix to solve for the transformation that upgrades the solution to metric space. While in the case of rigid structure the metric upgrade step is simple since the motion constraints are linear, deformability in the shape introduces non-linearities. In this paper we propose an alternating least-squares approach associated with a globally optimal projection step onto the manifold of metric constraints. An important advantage of this new algorithm is its ability to handle missing data, which becomes crucial when dealing with real video sequences with self-occlusions. We show successful results of our algorithm on synthetic and real sequences of both deformable and articulated data.
This paper describes novel algorithms for recovering the 3D shape and motion of deformable and articulated objects purely from uncalibrated 2D image measurements using a factorisation approach. Most approaches to deformable and articulated structure from motion require upgrading an initial affine solution to Euclidean space by imposing metric constraints on the motion matrix. While in the case of rigid structure the metric upgrade step is simple since the constraints can be formulated as linear, deformability in the shape introduces non-linearities. In this paper we propose an alternating bilinear approach to solve for non-rigid 3D shape and motion, associated with a globally optimal projection step of the motion matrices onto the manifold of metric constraints. Our novel optimal projection step combines into a single optimisation the computation of the orthographic projection matrix and the configuration weights that give the closest motion matrix that satisfies the correct block structure, with the additional constraint that the projection matrix is guaranteed to have orthonormal rows (i.e. its transpose lies on the Stiefel manifold). This constraint turns out to be non-convex. The key contribution of this work is to introduce an efficient convex relaxation for the non-convex projection step. The relaxation is efficient in the sense that, for both deformable and articulated motion, it turned out to be exact (i.e. tight) in all our numerical experiments. The convex relaxations are semi-definite (SDP) or second-order cone (SOCP) programs which can be readily tackled by popular solvers. An important advantage of these new algorithms is their ability to handle missing data, which becomes crucial when dealing with real video sequences with self-occlusions. We show successful results of our algorithms on synthetic and real sequences of both deformable and articulated data. We also show comparative results with state-of-the-art algorithms which reveal that our new methods outperform existing ones.
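To make the orthonormal-rows constraint concrete, the sketch below shows only the simplest building block: projecting a single 2x3 matrix onto the set of matrices with orthonormal rows, using the standard SVD (polar factor) solution. It is not the joint projection with configuration weights or the SDP/SOCP relaxation described in the abstract; the function name and the random test matrix are assumptions made for illustration.

```python
import numpy as np

def project_to_orthonormal_rows(A):
    """Return the matrix with orthonormal rows closest to A in Frobenius norm.

    For a full-rank 2x3 input with thin SVD A = U diag(s) Vt, the closest
    such matrix is U @ Vt (the singular values are replaced by ones).
    """
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt

# Hypothetical noisy estimate of one frame's orthographic projection matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))

R = project_to_orthonormal_rows(A)
print(R @ R.T)  # close to the 2x2 identity
```

Applying the projection a second time leaves `R` unchanged, which is a quick sanity check that the output already satisfies the constraint.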
Tracking non-rigid objects from video is useful in robotic systems such as HMIs or robotic manipulator arms which interact with deformable objects. This paper proposes a method for sequential model-based 3D reconstruction of deformable objects and camera localization in real time. Nonrigid SFM methods commonly process a video sequence offline in batch. While there are real-time methods for rigid models, reconstruction of deformable 3D shapes for real-time applications is still unsolved. Dense approaches offer promising results, but they process all frames offline in batch. We propose a real-time non-rigid reconstruction method based on a known deformable model. Object shape and pose are tracked by real-time estimation of camera pose and deformation coefficients. An extensive evaluation of the algorithm on several data sets and a comparison with state-of-the-art techniques are performed. The tests include different outlier rates, noise levels and occlusion handling.
This paper presents a unified approach to solve different bilinear factorization problems in computer vision in the presence of missing data in the measurements. The problem is formulated as a constrained optimization where one of the factors must lie on a specific manifold. To achieve this, we introduce an equivalent reformulation of the bilinear factorization problem that decouples the core bilinear aspect from the manifold specificity. We then tackle the resulting constrained optimization problem via Augmented Lagrange Multipliers. The strength and the novelty of our approach is that this framework can seamlessly handle different computer vision problems. The algorithm is such that only a projector onto the manifold constraint is needed. We present experiments and results for some popular factorization problems in computer vision such as rigid, non-rigid, and articulated Structure from Motion, photometric stereo, and 2D-3D non-rigid registration.
This paper presents a unified approach to solve different bilinear factorization problems in Computer Vision in the presence of missing data in the measurements. The problem is formulated as a constrained optimization problem where one of the factors is constrained to lie on a specific manifold. To achieve this, we introduce an equivalent reformulation of the bilinear factorization problem. This reformulation decouples the core bilinear aspect from the manifold specificity. We then tackle the resulting constrained optimization problem with Bilinear factorization via Augmented Lagrange Multipliers (BALM). The mechanics of our algorithm are such that only a projector onto the manifold constraint is needed. That is the strength and the novelty of our approach: it can seamlessly handle different Computer Vision problems. We present experiments and results for two popular factorization problems: Nonrigid Structure from Motion and Photometric Stereo.
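The idea of decoupling the bilinear fit from the manifold constraint can be sketched with a generic augmented Lagrangian loop. The code below is an illustration under stated assumptions, not the authors' BALM implementation: it assumes complete data (no missing entries), a rigid orthographic model, and a hypothetical projector `project_frames`; the names `alm_factorize`, `rho` and `growth` are likewise made up for this example. It minimizes ||W - M S||^2 subject to M = N, with N restricted to the manifold through the projector only.

```python
import numpy as np

def project_frames(M):
    """Project each 2x3 block of a (2F x 3) matrix onto orthonormal rows."""
    N = np.empty_like(M)
    for f in range(M.shape[0] // 2):
        U, _, Vt = np.linalg.svd(M[2 * f:2 * f + 2], full_matrices=False)
        N[2 * f:2 * f + 2] = U @ Vt
    return N

def alm_factorize(W, project, rank=3, iters=200, rho=1.0, growth=1.05):
    """Augmented Lagrangian sketch for W ~ M S with M constrained via `project`."""
    rng = np.random.default_rng(0)
    M = rng.standard_normal((W.shape[0], rank))  # unconstrained factor
    N = project(M)                               # copy of M kept on the manifold
    Y = np.zeros_like(M)                         # Lagrange multipliers for M = N
    I = np.eye(rank)
    for _ in range(iters):
        # Least-squares update of S with M fixed.
        S = np.linalg.lstsq(M, W, rcond=None)[0]
        # Closed-form update of M from the stationarity condition of
        # ||W - M S||^2 + <Y, M - N> + (rho / 2) ||M - N||^2.
        M = (2 * W @ S.T + rho * N - Y) @ np.linalg.inv(2 * S @ S.T + rho * I)
        # The manifold enters only through the projector.
        N = project(M + Y / rho)
        # Multiplier and penalty updates.
        Y = Y + rho * (M - N)
        rho *= growth
    return M, N, S

# Synthetic rigid example: F frames, P points (values are arbitrary).
rng = np.random.default_rng(1)
F, P = 6, 20
M_true = np.vstack([np.linalg.qr(rng.standard_normal((3, 3)))[0][:2] for _ in range(F)])
S_true = rng.standard_normal((3, P))
W = M_true @ S_true

M, N, S = alm_factorize(W, project_frames)
print(np.linalg.norm(M - N))  # gap between the free and the constrained factor
```

Swapping `project_frames` for a projector onto a different constraint set is the only change needed to target another problem, which is the design point the abstract emphasizes. Handling missing data would require masked least-squares updates, which are omitted here for brevity.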
This paper proposes a new algorithm to automatically estimate the number of deformation modes needed to describe a non-rigid object with the well-known low-rank shape model, focusing on the missing data case. The 3D shape is assumed to deform as a linear combination of K rigid shape bases according to time-varying coefficients. One of the requirements of this formulation is that the number of bases must be known in advance. Most non-rigid structure from motion (NRSfM) approaches based on this model determine the value of K empirically. Our proposed approach is based on the analysis of the frequency spectra of the x and y coordinates corresponding to the individual image trajectories, which are seen as 1D signals. The frequency content of the 2D trajectories is encoded using the modulus of the Discrete Cosine Transform (DCT) of the signals. Our hypothesis is that the value of K that gives the best prediction of the missing data also provides the best 3D reconstruction. Our proposed approach does not assume any prior knowledge and is independent of the 3D reconstruction algorithm used. We validate our approach with experiments on synthetic and real sequences.
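As a small illustration of treating a trajectory coordinate as a 1D signal, the sketch below computes the modulus of its DCT coefficients. The abstract does not say which DCT variant or normalisation is used, so the unnormalised DCT-II here is an assumption, and the function name and sample coordinates are hypothetical.

```python
import math

def dct_modulus(signal):
    """Modulus of the unnormalised DCT-II coefficients of a 1D signal."""
    n = len(signal)
    return [
        abs(sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                for i, x in enumerate(signal)))
        for k in range(n)
    ]

# Hypothetical x-coordinates of one tracked image point over 8 frames.
x_track = [10.0, 10.5, 11.2, 11.8, 12.1, 11.9, 11.3, 10.6]

spectrum = dct_modulus(x_track)
for k, value in enumerate(spectrum):
    print(k, round(value, 3))
```

The coefficient at index 0 equals the sum of the samples, and a smooth trajectory concentrates most of the remaining energy in the first few indices. The y-coordinate of the same track would be processed in the same way.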