Echo state networks (ESNs) constitute a novel approach to recurrent neural network (RNN) training, in which an RNN (the reservoir) is generated randomly and only a readout is trained using a simple, computationally efficient algorithm. ESNs have greatly facilitated the practical application of RNNs, outperforming classical approaches on a number of benchmark tasks. In this paper, we introduce a novel Bayesian approach toward ESNs, the echo state Gaussian process (ESGP). The ESGP combines the merits of ESNs and Gaussian processes to provide a more robust alternative to conventional reservoir computing networks, while also offering a measure of confidence in the generated predictions (in the form of a predictive distribution). We exhibit the merits of our approach in a number of applications, considering both benchmark datasets and real-world problems, and show that our method offers a significant enhancement in the dynamical data modeling capabilities of ESNs. We also show that our method is orders of magnitude more computationally efficient than existing Gaussian process-based methods for dynamical data modeling, without compromising predictive performance.
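As a rough illustration of the reservoir-computing recipe this abstract builds on, the following sketch trains a vanilla ESN on a toy one-step-ahead sine prediction task. The reservoir size, spectral radius, input scaling, and ridge penalty are arbitrary assumptions; the ESGP described above would replace the ridge readout with a Gaussian process readout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: one-step-ahead prediction of a sine wave.
T = 500
u = np.sin(0.2 * np.arange(T + 1))           # signal
X, Y = u[:-1], u[1:]                         # inputs and one-step targets

# Randomly generated reservoir (fixed; only the readout is trained).
N = 100                                      # reservoir size (assumed)
W_in = rng.uniform(-0.5, 0.5, size=N)        # input weights
W = rng.uniform(-0.5, 0.5, size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

# Drive the reservoir and collect its states.
states = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W_in * X[t] + W @ x)
    states[t] = x

# Train the linear readout by ridge regression -- the cheap part of ESN training.
washout = 100                                # discard the initial transient
S, y = states[washout:], Y[washout:]
ridge = 1e-6
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(N), S.T @ y)

pred = states @ W_out
mse = np.mean((pred[washout:] - Y[washout:]) ** 2)
```

The key design point is that only `W_out` is learned; the random reservoir stays fixed, which is what makes ESN training so cheap compared to backpropagation through time.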
Hidden Markov (chain) models using finite Gaussian mixture models as their hidden state distributions have been successfully applied in sequential data modeling and classification applications. Nevertheless, Gaussian mixture models are well known to be highly intolerant of atypical observations within the data sets used for their estimation. Finite Student's-t mixture models have recently emerged as a heavier-tailed, robust alternative to Gaussian mixture models, overcoming these hurdles. To exploit these merits of Student's-t mixture models in a sequential data modeling setting, we introduce, in this paper, a novel hidden Markov model whose hidden state distributions are finite mixtures of multivariate Student's-t densities. We derive an algorithm for estimating the model parameters under a maximum likelihood framework, assuming full, diagonal, and factor-analyzed covariance matrices. The advantages of the proposed model over conventional approaches are experimentally demonstrated through a series of sequential data modeling applications.
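The robustness argument here rests on the Student's-t density having heavier tails than the Gaussian. A quick numerical check of that property (not of the paper's HMM itself) uses the standard representation of a Student's-t draw as a Gaussian draw whose precision is rescaled by a Gamma-distributed latent variable; the degrees-of-freedom value below is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
n, nu = 200_000, 3.0                         # nu: degrees of freedom (assumed)

# Scale-mixture representation: x | u ~ N(0, 1/u), u ~ Gamma(nu/2, rate nu/2),
# which marginalizes to a Student's-t with nu degrees of freedom.
u = rng.gamma(nu / 2.0, 2.0 / nu, size=n)    # numpy uses shape/scale parameters
t_samples = rng.normal(size=n) / np.sqrt(u)
g_samples = rng.normal(size=n)

# Heavier tails: far more mass beyond 4 units than under the Gaussian.
t_tail = np.mean(np.abs(t_samples) > 4.0)
g_tail = np.mean(np.abs(g_samples) > 4.0)
```

This latent-precision construction is also what makes maximum-likelihood estimation of t-mixtures tractable: conditioned on `u`, everything is Gaussian again.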
Factor analysis is a statistical covariance modeling technique based on the assumption of normally distributed data. A mixture of factor analyzers can hence be viewed as a special case of Gaussian (normal) mixture models, providing a mathematically sound framework for attribute space dimensionality reduction. A significant shortcoming of mixtures of factor analyzers is the vulnerability of normal distributions to outliers. Recently, the replacement of normal distributions with the heavier-tailed Student's-t distributions has been proposed as a way to mitigate these shortcomings, and the resulting model has been treated under an expectation-maximization (EM) algorithm framework. In this paper, we develop a Bayesian approach to factor analysis modeling based on Student's-t distributions. We derive a tractable variational inference algorithm for this model by expressing the Student's-t distributed factor analyzers as a marginalization over additional latent variables. Our approach provides an efficient and more robust alternative to EM-based methods, resolving their proneness to singularities and overfitting, while allowing for the automatic determination of the optimal model size. We demonstrate the superiority of the proposed model over well-known covariance modeling techniques in a wide range of signal processing applications.
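For orientation, here is a minimal numpy sketch of the covariance structure a single factor analyzer imposes (not the paper's Bayesian treatment): Sigma = Lambda Lambda^T + Psi, with a d x q loading matrix Lambda and diagonal noise Psi. The dimensions are arbitrary assumptions; the point is that this structure is positive definite by construction and uses far fewer free parameters than a full covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
d, q = 10, 2                                 # observed dim, latent factors (assumed)

# Factor-analyzed covariance: Sigma = Lambda @ Lambda.T + Psi.
Lambda = rng.normal(size=(d, q))             # factor loading matrix
Psi = np.diag(rng.uniform(0.1, 0.5, size=d)) # diagonal noise covariance
Sigma = Lambda @ Lambda.T + Psi

# Positive definite by construction (Psi has strictly positive diagonal).
assert np.all(np.linalg.eigvalsh(Sigma) > 0)

# Parameter count: loadings plus diagonal noise, versus a full covariance.
n_fa = d * q + d                             # 30 free parameters here
n_full = d * (d + 1) // 2                    # 55 for an unconstrained covariance
```

The savings grow with d, which is why mixtures of factor analyzers are attractive for high-dimensional attribute spaces.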
In this work we consider the problem of fault localization in transparent optical networks. We attempt to localize single-link failures by utilizing statistical machine learning techniques trained on data that describe the network state during current and past failure incidents. In particular, a Gaussian process classifier is trained on historical data extracted from the examined network, with the goal of modeling and predicting the failure probability of each link therein. To limit the set of suspect links for every failure incident, the proposed approach is complemented by a graph-based correlation heuristic. The proposed approach is tested on a number of datasets generated for an orthogonal frequency-division multiplexing-based optical network, and it is shown to achieve a high localization accuracy (91%-99%) that is only marginally affected as the size of the historical dataset is reduced. The approach is also compared to a conventional fault localization method based on the utilization of monitoring information. It is shown that the conventional method significantly increases the network cost, as measured by the number of monitoring nodes required to achieve the same accuracy as the proposed approach. The proposed scheme can be used by service providers to reduce the network cost related to the fault localization procedure. As the approach is generic and does not depend on specific network technologies, it can be applied to different network types, e.g., fixed-grid or space-division multiplexing elastic optical networks.
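As a hedged sketch of the core idea (not the paper's actual classifier, features, or data), the following scores candidate links with GP regression over synthetic 0/1 failure labels under an RBF kernel. The data, kernel settings, and the regression-instead-of-classification simplification are all assumptions made for illustration; a full GP classifier would squash a latent GP through a link function instead.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical historical data: each row is the state observed for one link
# during a past incident (e.g. quality metrics); the label is "link failed".
X = rng.normal(size=(60, 4))
w_true = np.array([1.5, -2.0, 0.5, 0.0])     # assumed ground-truth effect
y = (X @ w_true + 0.1 * rng.normal(size=60) > 0).astype(float)

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

# GP regression on the centered 0/1 labels: a simplification that still yields
# a per-link failure score clipped to [0, 1].
K = rbf(X, X) + 1e-2 * np.eye(len(X))        # kernel matrix plus noise jitter
alpha = np.linalg.solve(K, y - 0.5)

X_new = rng.normal(size=(5, 4))              # state of candidate suspect links
scores = np.clip(rbf(X_new, X) @ alpha + 0.5, 0.0, 1.0)
```

In the paper's setting, such per-link scores would then be intersected with the graph-based correlation heuristic to narrow the suspect set.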
Hidden Markov random field (HMRF) models are widely used for image segmentation, as they arise naturally in problems where a spatially constrained clustering scheme is required. A major limitation of HMRF models concerns the automatic selection of the proper number of their states, i.e., the number of region clusters derived by the image segmentation procedure. Existing methods, including likelihood- or entropy-based criteria and reversible-jump Markov chain Monte Carlo methods, usually tend to yield noisy model size estimates while imposing heavy computational requirements. Recently, Dirichlet process (DP, infinite) mixture models have emerged at the forefront of nonparametric Bayesian statistics as promising candidates for clustering applications where the number of clusters is unknown a priori; infinite mixture models based on the original DP, or spatially constrained variants of it, have been applied to unsupervised image segmentation with promising results. Motivated by these advances, and to resolve the aforementioned issues of HMRF models, in this paper we introduce a nonparametric Bayesian formulation of the HMRF model, the infinite HMRF model, built on a joint Dirichlet process mixture (DPM) and Markov random field (MRF) construction. We derive an efficient variational Bayesian inference algorithm for the proposed model, and we experimentally demonstrate its advantages over competing methodologies.
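For readers unfamiliar with DP mixtures, a minimal sketch of the stick-breaking construction that underlies such "infinite" models follows; the concentration parameter and truncation level are arbitrary assumptions. A unit-length stick is broken with Beta-distributed fractions, and the resulting weights concentrate on a few components, so the effective number of clusters is inferred rather than fixed in advance.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stick-breaking construction of Dirichlet process mixing weights:
# v_k ~ Beta(1, alpha), weight_k = v_k * prod_{j<k} (1 - v_j).
alpha, K = 1.0, 50                           # concentration, truncation (assumed)
v = rng.beta(1.0, alpha, size=K)
remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
weights = v * remaining

# At a finite truncation the weights sum to just under one, and most of the
# mass sits on a handful of components.
n_effective = int(np.sum(weights > 1e-3))
```

Variational inference for DP-based models, as in the infinite HMRF above, typically works with exactly such a truncated stick-breaking representation.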