To reveal and leverage the correlated and complementary information between different views, a large number of multi-view learning algorithms have been proposed in recent years. However, unsupervised feature selection in multi-view learning remains a challenge due to the lack of data labels that could be used to select discriminative features. Moreover, most traditional feature selection methods are developed for single-view data and are not directly applicable to multi-view data. Therefore, in this paper we propose an unsupervised learning method called Adaptive Unsupervised Multi-view Feature Selection (AUMFS). AUMFS attempts to jointly exploit three kinds of vital information contained in the original data, i.e., the data cluster structure, the data similarity, and the correlations between different views, for feature selection. To achieve this goal, a robust sparse regression model with the ℓ2,1-norm penalty is introduced to predict data cluster labels, and at the same time, multiple view-dependent visual similarity graphs are constructed to flexibly model the visual similarity in each view. AUMFS then integrates data cluster label prediction and adaptive multi-view visual similarity graph learning into a unified framework. To solve the objective function of AUMFS, a simple yet efficient iterative method is proposed. We apply AUMFS to three visual concept recognition applications (i.e., social image concept recognition, object recognition, and video-based human action recognition) on four benchmark datasets. Experimental results show that the proposed method significantly outperforms several state-of-the-art feature selection methods. More importantly, our method is not very sensitive to its parameters, and the optimization method converges very fast.
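The ℓ2,1-norm penalty mentioned above zeroes out entire rows of the regression matrix, so that each discarded row corresponds to a discarded feature. A minimal numpy sketch of such a regression (proximal gradient on a generic ℓ2,1-penalized least-squares problem, not the full AUMFS model, which also couples in the graph terms) might look like this:

```python
import numpy as np

def l21_prox(W, t):
    """Row-wise proximal operator of the l2,1-norm: shrinks whole rows toward zero."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))

def l21_regression(X, Y, lam=0.1, n_iter=500):
    """Minimize ||X W - Y||_F^2 + lam * ||W||_{2,1} by proximal gradient descent.
    Rows of W driven to zero mark features that the model discards."""
    d = X.shape[1]
    W = np.zeros((d, Y.shape[1]))
    L = 2 * np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        grad = 2 * X.T @ (X @ W - Y)
        W = l21_prox(W - grad / L, lam / L)
    return W
```

Larger values of `lam` zero out more rows, i.e., select fewer features.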
Human motion denoising is an indispensable data preprocessing step for many motion-data-based applications. In this paper, we propose a data-driven human motion denoising method that sparsely selects the most correlated subset of motion bases for clean motion reconstruction. Meanwhile, it takes the statistical properties of two common noise types, i.e., Gaussian noise and outliers, into account in deriving the objective functions. In particular, our method first divides each human pose into five partitions, termed poselets, to obtain a more fine-grained pose representation. These poselets are then reorganized into multiple overlapping poselet groups using a lagged window moving across the entire motion sequence, so as to preserve the embedded spatial-temporal motion patterns. Afterward, five compact and representative motion dictionaries are constructed in parallel by means of fast K-SVD in the training phase; they are used to remove noise and outliers from noisy motion sequences in the testing phase by solving ℓ1-minimization problems. Extensive experiments show that our method outperforms its competitors. More importantly, compared with other data-driven methods, ours does not require carefully chosen training data, so it can be more easily applied to real-world applications.
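The testing-phase step, reconstructing a clean pose from a learned dictionary by ℓ1-minimization, can be sketched with a plain ISTA solver (a generic illustration of the sparse-coding step only; the dictionary `D` is assumed to have been learned beforehand, e.g. by K-SVD):

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise shrinkage: the proximal operator of the l1-norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def denoise_frame(D, y, lam=0.01, n_iter=3000):
    """Sparse-code a noisy pose vector y over dictionary D via ISTA, i.e.
    solve min_a ||D a - y||^2 + lam * ||a||_1, then reconstruct the clean
    pose from the few selected atoms."""
    a = np.zeros(D.shape[1])
    step = 1.0 / (2 * np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        a = soft_threshold(a - step * 2 * D.T @ (D @ a - y), lam * step)
    return D @ a
```

Because the reconstruction uses only a few dictionary atoms, noise components that do not lie in the span of those atoms are suppressed.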
Human motion retrieval plays an important role in many motion-data-based applications. In the past, many researchers tended to use a single type of visual feature as the data representation. Because different visual features describe different aspects of motion data and have dissimilar discriminative power with respect to a particular class of human motion, this practice leads to poor retrieval performance. It would therefore be beneficial to combine multiple visual features for motion data representation. In this article, we present an Adaptive Multi-view Feature Selection (AMFS) method for human motion retrieval. Specifically, we first use a local linear regression model to automatically learn multiple view-based Laplacian graphs that preserve the local geometric structure of the motion data. These graphs are then combined with a non-negative view-weight vector to exploit the complementary information between different features. Finally, to discard redundant and irrelevant feature components from the original high-dimensional feature representation, we formulate the objective function of AMFS as a general trace ratio optimization problem and design an effective algorithm to solve it. Extensive experiments on two public human motion databases, i.e., HDM05 and MSR Action3D, demonstrate the effectiveness of the proposed AMFS over state-of-the-art methods for motion data retrieval. Its scalability to large motion datasets and its insensitivity to the algorithm parameters make our method widely applicable in real-world applications.
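A trace ratio problem of the kind mentioned above, maximizing Tr(WᵀAW)/Tr(WᵀBW) over orthonormal W, is commonly solved by the classical iterative eigen-decomposition scheme. A minimal sketch (generic solver, not the specific A and B that AMFS builds from its graphs):

```python
import numpy as np

def trace_ratio(A, B, k, n_iter=50):
    """Iterative scheme for max_W Tr(W^T A W) / Tr(W^T B W) with W^T W = I:
    alternate between updating the current ratio lam and taking the top-k
    eigenvectors of A - lam * B. The ratio is non-decreasing across iterations."""
    d = A.shape[0]
    W = np.eye(d)[:, :k]  # any orthonormal initialization works
    for _ in range(n_iter):
        lam = np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)
        _, vecs = np.linalg.eigh(A - lam * B)
        W = vecs[:, -k:]  # eigenvectors of the k largest eigenvalues
    return W
```

Here `B` is assumed positive definite so the ratio is well defined.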
The missing marker problem is very common in human motion capture. In contrast to most current methods, which handle this problem by trying to learn a reliable predictor from the observations, we consider it from the perspective of sparse representation and propose a novel method named ℓ1-sparse representation of missing markers prediction (L1-SRMMP). We assume that an incomplete pose can be represented by a sparse linear combination of a few poses from the training set. We therefore cast the prediction of missing markers as finding a sparse representation of the observable data of the incomplete pose, which we then use to predict the missing data. To obtain a sparse representation, we employ the ℓ1-norm in our objective function. Moreover, we propose a presentation coefficient weighted update (PCWU) algorithm to mitigate the limited capacity of the training set. Experimental results demonstrate the effectiveness and efficiency of our method in predicting missing markers in human motion capture.
In computer vision and multimedia analysis, it is common to use multiple features (or multimodal features) to represent an object. For example, to well characterize a natural scene image, we typically extract a set of visual features to represent its color, texture, and shape. However, it is challenging to integrate multimodal features optimally, since they are usually high-order correlated; e.g., the histogram of oriented gradients (HOG), a bag of scale-invariant feature transform descriptors, and wavelets are closely related because they collaboratively reflect the image texture. Nevertheless, existing algorithms fail to capture this high-order correlation among multimodal features. To solve this problem, we present a new multimodal feature integration framework. In particular, we first define a new measure to capture the high-order correlation among multimodal features, which can be regarded as a direct extension of the previous binary (pairwise) correlation. Based on this measure, we construct a feature correlation hypergraph (FCH) to model the high-order relations among multimodal features. A clustering algorithm is then performed on the FCH to group the original multimodal features into a set of partitions. Finally, a multiclass boosting strategy is developed to obtain a strong classifier by combining the weak classifiers learned from each partition. Experimental results on seven popular datasets show the effectiveness of our approach.
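The final stage, one weak classifier per feature partition combined into a strong classifier, can be sketched as follows. This is a deliberately simplified stand-in: nearest-class-centroid weak learners combined by accuracy-weighted voting, whereas actual multiclass boosting would also reweight training samples between rounds.

```python
import numpy as np

def centroid_learner(Xg, y, classes):
    """Weak learner on one feature partition: nearest class centroid."""
    cents = np.array([Xg[y == c].mean(axis=0) for c in classes])
    return lambda Z: classes[((Z[:, None, :] - cents[None]) ** 2).sum(2).argmin(1)]

def partition_ensemble(X, y, partitions):
    """Train one weak classifier per feature partition (list of column indices)
    and combine them by training-accuracy-weighted voting."""
    classes = np.unique(y)
    members = []
    for idx in partitions:
        f = centroid_learner(X[:, idx], y, classes)
        members.append((idx, f, (f(X[:, idx]) == y).mean()))
    def predict(Z):
        votes = np.zeros((Z.shape[0], len(classes)))
        for idx, f, w in members:
            pred = f(Z[:, idx])
            for j, c in enumerate(classes):
                votes[:, j] += w * (pred == c)
        return classes[votes.argmax(1)]
    return predict
```

The partitions would, in the paper's framework, come from clustering the feature correlation hypergraph rather than being fixed by hand as here.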
Motion capture is an important technique with a wide range of applications in areas such as computer vision, computer animation, film production, and medical rehabilitation. Even with professional motion capture systems, the acquired raw data mostly contain inevitable noise and outliers. Numerous methods have been developed to denoise the data, yet the problem remains a challenge due to the high complexity of human motion and the diversity of real-life situations. In this paper, we propose a data-driven robust human motion denoising approach that mines the spatial-temporal patterns and the structural sparsity embedded in motion data. We first replace the commonly used entire-pose model with a more fine-grained partlet model as the feature representation, to exploit the abundant local similarities in body part posture and movement. Then, a robust dictionary learning algorithm is proposed to learn multiple compact and representative motion dictionaries from the training data in parallel. Finally, we reformulate the human motion denoising problem as a robust structured sparse coding problem in which both the noise distribution and the temporal smoothness of human motion are jointly taken into account. Compared with several state-of-the-art motion denoising methods on both synthetic and real noisy motion data, our method consistently yields better performance than its counterparts, and its outputs are much more stable. In addition, the training dataset of our method is much easier to set up than those of other data-driven methods.
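The robust-coding idea, keeping the code sparse while capping the influence of outlier coordinates, can be sketched with a Huber data term plus an ℓ1 penalty, solved by proximal gradient. This is a simplified stand-in for the paper's robust structured sparse coding model, which additionally enforces temporal smoothness across frames:

```python
import numpy as np

def robust_sparse_code(D, y, lam=0.05, delta=0.5, n_iter=2000):
    """Proximal gradient for min_a sum_i huber(D a - y)_i + lam * ||a||_1.
    The Huber loss grows only linearly beyond delta, so outlier coordinates
    of y cannot dominate the fit; the l1 term keeps the code a sparse."""
    a = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # huber'' <= 1, so this step is safe
    for _ in range(n_iter):
        r = D @ a - y
        g = D.T @ np.clip(r, -delta, delta)  # gradient of the Huber data term
        z = a - step * g
        a = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)
    return a
```

With a squared-error data term instead, a single large outlier coordinate would pull the whole reconstruction toward it; clipping the residual at `delta` prevents that.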
With the explosive growth of motion capture data, an efficient search engine that retrieves motions from a large motion repository has become imperative in animation production. However, because of the high dimensionality of the data space and the complexity of matching methods, most existing approaches cannot return results in real time. This paper proposes a high-level semantic feature in a low-dimensional space to represent the essential characteristics of different motion classes. Based on the statistical training of a Gaussian Mixture Model, this feature can effectively support motion matching at both the global clip level and the local frame level. Experimental results show that our approach can retrieve similar motions with rankings from a large motion database in real time, and can also annotate motions automatically on the fly.
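One plausible reading of such a GMM-based semantic feature is to describe each clip by its average posterior (responsibility) over the mixture components, giving a fixed, low-dimensional vector that can be compared in real time. A minimal sketch under that assumption, with the GMM parameters taken as already trained:

```python
import numpy as np

def gmm_posteriors(X, means, variances, weights):
    """Responsibilities p(component | frame) for a diagonal-covariance GMM."""
    logp = np.stack([
        np.log(w) - 0.5 * np.sum((X - m) ** 2 / v + np.log(2 * np.pi * v), axis=1)
        for m, v, w in zip(means, variances, weights)
    ], axis=1)
    logp -= logp.max(axis=1, keepdims=True)   # for numerical stability
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

def clip_signature(frames, means, variances, weights):
    """Clip-level semantic feature: the average component responsibility over
    all frames. Clips are then matched by distance between signatures; the
    per-frame posteriors support frame-level matching."""
    return gmm_posteriors(frames, means, variances, weights).mean(axis=0)
```

The signature has as many dimensions as the mixture has components, regardless of the dimensionality of the raw pose features.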