2010
DOI: 10.1007/978-3-642-15880-3_39

Expectation Propagation for Bayesian Multi-task Feature Selection

Abstract: In this paper we propose a Bayesian model for multi-task feature selection. This model is based on a generalized spike and slab sparse prior distribution that enforces the selection of a common subset of features across several tasks. Since exact Bayesian inference in this model is intractable, approximate inference is performed through expectation propagation (EP). EP approximates the posterior distribution of the model using a parametric probability distribution. This posterior approximation is par…
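As context for the abstract, a minimal sketch of the standard multi-task spike-and-slab construction it refers to (the paper's exact parameterization may differ): each feature j has a single binary indicator z_j shared by all K tasks, so the feature is selected for every task or for none,

\[
p(w_{kj} \mid z_j) = z_j\,\mathcal{N}(w_{kj} \mid 0, \sigma_w^2) + (1 - z_j)\,\delta(w_{kj}),
\qquad z_j \sim \mathrm{Bernoulli}(p_0), \quad k = 1, \dots, K,
\]

where \(\delta\) is a point mass at zero. Exact inference requires summing over all \(2^d\) configurations of the indicators z, which is why the posterior is instead approximated with EP.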

Cited by 32 publications (43 citation statements)
References 14 publications
“…Also we propose two hierarchical sparse models using Spike and Slab priors and relate them to HiLasso and C-HiLasso. Prior works using Spike and Slab priors for multi-task learning problems [15,16] only consider the block sparsity among different tasks, while our work considers both the block sparsity across tasks and the group sparsity inside each task.…”
Section: Relation to Prior Work
Citation type: mentioning; confidence: 99%
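To make the contrast in this excerpt concrete, here is a hedged NumPy sketch of the two sparsity patterns; variable names and inclusion probabilities are illustrative, not taken from [15,16] or the citing paper.

```python
# Illustrative sketch only: contrasts the two sparsity patterns named in
# the excerpt above; names and probabilities are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_features = 4, 10

# Block sparsity across tasks (as in spike-and-slab multi-task priors):
# one shared indicator per feature, so a feature is active in all tasks or none.
z_shared = rng.random(n_features) < 0.3
W_block = rng.normal(size=(n_tasks, n_features)) * z_shared

# Block sparsity across tasks plus group sparsity inside each task:
# each task keeps only a subset of the shared active features.
z_task = rng.random((n_tasks, n_features)) < 0.7
W_hier = rng.normal(size=(n_tasks, n_features)) * z_shared * z_task

print("features active in every task:", np.flatnonzero(z_shared))
print("nonzeros per task (hierarchical):", (W_hier != 0).sum(axis=1))
```

In the first pattern, the zero columns are shared by every task; in the second, each task additionally drops some of the shared active features, giving group sparsity inside each task.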
“…Different techniques can be used, such as sampling methods or approximation methods. We choose expectation propagation (EP) because of its efficiency and demonstrated success for multi-task learning problems [16].…”
Section: Inference
Citation type: mentioning; confidence: 99%
“…Since γ and β are given, the VG algorithm reduces to iterating eqs. (8) and (9) starting from a random m. Similarly, the PMF reduces to performing an E-step given the fixed hyperparameter values.…”
Section: Boston-Housing Dataset: VG vs. PMF
Citation type: mentioning; confidence: 99%
“…Traditional studies are based on a strict assumption that selected variables are shared among all tasks [22,27]. Recent studies have suggested a more flexible approach that selects variables by decomposing a coefficient into a shared part and an individual part [12,15], or by factorizing a coefficient into a variable-specific part and a task-variable part [32]. Although the variable selection approach provides better interpretability than the other approaches, it has limited ability to share common information among related tasks.…”
Section: Introduction
Citation type: mentioning; confidence: 99%
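A brief NumPy sketch of the two coefficient constructions this excerpt contrasts; the names and shapes are hypothetical, not taken from [12,15] or [32].

```python
# Hypothetical sketch of the two coefficient constructions mentioned above;
# names are illustrative, not from the cited papers.
import numpy as np

rng = np.random.default_rng(1)
n_tasks, n_features = 3, 6

# Decomposition [12,15]: coefficients split into a shared part and an
# individual part, w_t = s + v_t.
s = rng.normal(size=n_features)                   # shared across all tasks
V = 0.1 * rng.normal(size=(n_tasks, n_features))  # task-individual deviations
W_decomposed = s + V

# Factorization [32]: a variable-specific factor times a task-variable
# factor, w_{tj} = a_j * b_{tj}.
a = rng.normal(size=n_features)                   # variable-specific part
B = rng.normal(size=(n_tasks, n_features))        # task-variable part
W_factorized = a * B
```

In the decomposition, the shared part s carries the common information while each v_t captures task-specific deviations; the factorization instead ties tasks together through the per-variable factor a.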