In modern multivariate statistics, where high-dimensional datasets are ubiquitous, learning large (inverse) covariance matrices is imperative for data analysis. A popular approach to estimating a large inverse-covariance matrix is to regularize the Gaussian log-likelihood function by imposing a convex penalty function. In a seminal article, Friedman, Hastie, and Tibshirani (2008, Biostatistics 9: 432–441) proposed the graphical lasso (Glasso) algorithm to efficiently estimate sparse inverse-covariance matrices from the convex regularized log-likelihood function. In this article, I first explore the Glasso algorithm and then introduce a new graphiclasso command for large inverse-covariance matrix estimation. Moreover, I provide a command for tuning-parameter selection in the Glasso algorithm using the extended Bayesian information criterion, the Akaike information criterion, and cross-validation. I demonstrate the use of Glasso using simulation results and real-world data analysis.
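For orientation, the regularized log-likelihood and the extended Bayesian information criterion mentioned above are usually written as follows. This is a sketch in standard notation rather than the exact form used by the graphiclasso command. The symbols are assumptions introduced here: S is the sample covariance matrix, \rho \ge 0 is the tuning parameter, n is the sample size, p is the number of variables, E_\rho is the set of nonzero off-diagonal pairs in the estimate, and \gamma \in [0, 1] is the EBIC weight (\gamma = 0 gives the ordinary BIC). Some implementations leave the diagonal of \Theta unpenalized.

```latex
\hat{\Theta}_\rho
  = \arg\max_{\Theta \succ 0}
    \left\{ \log\det\Theta - \operatorname{tr}(S\Theta) - \rho \lVert \Theta \rVert_1 \right\}

\mathrm{EBIC}_\gamma(\rho)
  = -2\,\ell_n(\hat{\Theta}_\rho) + |E_\rho| \log n + 4\,\gamma\,|E_\rho| \log p,
\qquad
\ell_n(\Theta) = \frac{n}{2} \left\{ \log\det\Theta - \operatorname{tr}(S\Theta) \right\}
```

Under this formulation, tuning-parameter selection amounts to computing the estimate over a grid of \rho values and keeping the one with the smallest criterion value.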
We establish a novel framework for learning a directed acyclic graph (DAG) when data are generated from a Gaussian, linear structural equation model. It consists of two parts: (1) introducing a permutation matrix as a new parameter within a regularized Gaussian log-likelihood to represent variable ordering; and (2) given the ordering, estimating the DAG structure through a sparse Cholesky factor of the inverse covariance matrix. For permutation matrix estimation, we propose a relaxation technique that avoids the NP-hard combinatorial problem of order estimation. Given an ordering, a sparse Cholesky factor is estimated using a cyclic coordinatewise descent algorithm that decouples row-wise. Our framework recovers DAGs without the need for an expensive verification of the acyclicity constraint or enumeration of possible parent sets. We establish numerical convergence of the algorithm, and consistency of the Cholesky factor estimator when the order of variables is known. Through several simulated and macroeconomic datasets, we study the scope and performance of the proposed methodology.
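The link between a known variable ordering, the Cholesky factor of the inverse covariance matrix, and the DAG can be illustrated with a small population-level sketch. It is not the estimator described above: there is no penalty, no data, and no permutation step. It assumes the convention Theta = L^T L with L lower triangular, and all names (`b_true`, `omega`, `reverse_cholesky`) are hypothetical. In the model X_j = sum_{k<j} B[j][k] X_k + eps_j with noise standard deviations omega_j, the factor is L = diag(1/omega) (I - B), so the zero pattern of L below the diagonal is the parent structure.

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def cholesky(m):
    """Standard Cholesky: lower-triangular c with m = c c^T."""
    p = len(m)
    c = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = m[i][j] - sum(c[i][k] * c[j][k] for k in range(j))
            c[i][j] = math.sqrt(s) if i == j else s / c[j][j]
    return c

def reverse_cholesky(theta):
    """Lower-triangular l with theta = l^T l (assumed convention)."""
    p = len(theta)
    # Reverse the variable order, factor, then reverse back.
    flipped = [[theta[p - 1 - i][p - 1 - j] for j in range(p)] for i in range(p)]
    c = cholesky(flipped)
    upper = [[c[p - 1 - i][p - 1 - j] for j in range(p)] for i in range(p)]
    return transpose(upper)  # theta = upper upper^T = l^T l

# Hypothetical 3-node chain 0 -> 1 -> 2, variables already in causal order.
b_true = [[0.0, 0.0, 0.0],
          [0.8, 0.0, 0.0],
          [0.0, -0.5, 0.0]]
omega = [1.0, 0.5, 2.0]  # noise standard deviations (illustrative values)
p = len(omega)

# Build L = diag(1/omega) (I - B) and the implied precision matrix Theta = L^T L.
l_true = [[((1.0 if j == k else 0.0) - b_true[j][k]) / omega[j] for k in range(p)]
          for j in range(p)]
theta = [[sum(l_true[k][i] * l_true[k][j] for k in range(p)) for j in range(p)]
         for i in range(p)]

# Recover the factor from Theta alone, then read off coefficients and edges.
l_hat = reverse_cholesky(theta)
b_hat = [[-l_hat[j][k] / l_hat[j][j] if k < j else 0.0 for k in range(p)]
         for j in range(p)]
edges = [(k, j) for j in range(p) for k in range(j) if abs(b_hat[j][k]) > 1e-9]
print(edges)  # (parent, child) pairs
```

Because the factor is triangular by construction, any sparsity pattern in it corresponds to an acyclic graph, which is why no separate acyclicity check is needed once an ordering is fixed.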