This paper introduces an L1-norm-based probabilistic principal component analysis model on 2D data (L1-2DPPCA) based on the assumption of the Laplacian noise model. The Laplacian or L1 density function can be expressed as a superposition of an infinite number of Gaussian distributions. Under this expression, a Bayesian inference can be established based on the variational expectation maximization approach. All the key parameters in the probabilistic model can be learned by the proposed variational algorithm. It has experimentally been demonstrated that the newly introduced hidden variables in the superposition can serve as an effective indicator for data outliers. Experiments on some publicly available databases show that the performance of L1-2DPPCA has largely been improved after identifying and removing sample outliers, resulting in more accurate image reconstruction than the existing PCA-based methods. The performance of feature extraction of the proposed method generally outperforms other existing algorithms in terms of reconstruction errors and classification accuracy.
Abstract-Dimensionality reduction for high-order tensors is a challenging problem. In conventional approaches, higher order tensors are "vectorized" via Tucker decomposition to obtain lower order tensors. This will destroy the inherent high-order structures or resulting in undesired tensors, respectively. This paper introduces a probabilistic vectorial dimensionality reduction model for tensorial data. The model represents a tensor by employing a linear combination of same order basis tensors, thus it offers a mechanism to directly reduce a tensor to a vector. Under this expression, the projection base of the model is based on the tensor CandeComp/PARAFAC (CP) decomposition and the number of free parameters in the model only grows linearly with the number of modes rather than exponentially. A Bayesian inference has been established via the variational EM approach. A criterion to set the parameters (factor number of CP decomposition and the number of extracted features) is empirically given. The model outperforms several existing PCA-based methods and CP decomposition on several publicly available databases in terms of classification and clustering accuracy.
Restricted Boltzmann machine (RBM) is a famous model for feature extraction and can be used as an initializer for neural networks. When applying the classic RBM to multidimensional data such as 2D/3D tensors, one needs to vectorize such as high-order data. Vectorizing will result in dimensional disaster and valuable spatial information loss. As RBM is a model with fully connected layers, it requires a large amount of memory. Therefore, it is difficult to use RBM with high-order data on low-end devices. In this article, to utilize classic RBM on tensorial data directly, we propose a new tensorial RBM model parameterized by the tensor train format (TTRBM). In this model, both visible and hidden variables are in tensorial form, which are connected by a parameter matrix in tensor train format. The biggest advantage of the proposed model is that TTRBM can obtain comparable performance compared with the classic RBM with much fewer model parameters and faster training process. To demonstrate the advantages of TTRBM, we conduct three real-world applications, face reconstruction, handwritten digit recognition, and image super-resolution in the experiments.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.