Method
This repository provides an implementation of the MGLasso
(Multiscale Graphical Lasso) algorithm: a novel approach for estimating sparse Gaussian Graphical Models with the addition of a group-fused Lasso penalty.
MGLasso
is described in the paper Inference of Multiscale Gaussian Graphical Model. MGLasso
has three major contributions:
We simultaneously infer a network and estimate a clustering structure by combining the neighborhood selection approach (Meinshausen and Bühlman, 2006) and convex clustering (Hocking et al. 2011).
We use a continuation with Nesterov smoothing in a shrinkage-thresholding algorithm (
CONESTA
, Hadj-Selem et al. 2018) to solve the optimization problem .We show numerically that
MGLasso
performs better thanGLasso
(Meinshausen and Bühlman, 2006; Friedman et al. 2007) in terms of support recovery in a block diagonal model with few sparsity inside blocks. The clustering performances ofMGLasso
can be improved by the addition of a weight term to calibrate the variable fusion regularizer.
To solve the MGLasso
problem, we seek the regression vectors that minimize
MGLasso
package available on CRAN is based on the python implementation of the solver CONESTA
available in pylearn-parsimony library.
Package requirements
- Install the
reticulate
package.
install.packages('reticulate')
- Check if Python engine is available on the system
reticulate::py_available()
- If not available
reticulate::install_miniconda()
- Install
MGLasso
install.packages('mglasso')
- Install
MGLasso
python dependencies
mglasso::install_conesta()
An example of use is given below.
Illustration on a simple model
Installation
library(mglasso)
#> Welcome to MGLasso.
#> Run install_conesta() to finalize the package python dependencies installation.
install_conesta()
#> mglasso requires the r-reticulate virtual environment. Attempting to create...
#> virtualenv: r-reticulate
#> Using virtual environment 'r-reticulate' ...
#> + '/Users/doedmond.sanou/.virtualenvs/r-reticulate/bin/python' -m pip install --upgrade 'scipy == 1.7.1' 'scikit-learn'
#> Installing pylearn-parsimony
#> pylearn-parsimony is installed.
Simulate a block diagonal model
We simulate a -block diagonal model where each block contains variables. The intra-block correlation level is set to while the correlations outside the blocks are kept to .
library(Matrix)
n = 50
K = 3
p = 9
rho = 0.85
blocs <- list()
for (j in 1:K) {
bloc <- matrix(rho, nrow = p/K, ncol = p/K)
for(i in 1:(p/K)) { bloc[i,i] <- 1 }
blocs[[j]] <- bloc
}
mat.correlation <- Matrix::bdiag(blocs)
corrplot::corrplot(as.matrix(mat.correlation), method = "color", tl.col="black")
Run mglasso()
We set the sparsity level to and rescaled it with the size of the sample.
X <- scale(X)
res <- mglasso(X, lambda1 = 0.2*n, lambda2_start = 0.1, fuse_thresh = 1e-3, verbose = FALSE)
To launch a unique run of the objective function call the conesta
function.
temp <- mglasso::conesta(X, lam1 = 0.2*n, lam2 = 0.1)
Estimated clustering path
We plot the clustering path of mglasso
method on the 2 principal components axis of . The path is drawn on the predicted ’s.
Estimated adjacency matrices along the clustering path
As the the fusion penalty increases from level9
to level1
we observe a progressive fusion of adjacent edges.
plot_mglasso(res)
Reference
Edmond, Sanou; Christophe, Ambroise; Geneviève, Robin; (2022): Inference of Multiscale Gaussian Graphical Model. ArXiv. Preprint. https://doi.org/10.48550/arXiv.2202.05775