2018
DOI: 10.1007/s10107-018-1236-x

Minimization of transformed $L_1$ penalty: theory, difference of convex function algorithm, and robust application in compressed sensing

Abstract: We study the minimization problem of a non-convex sparsity promoting penalty function, the transformed $\ell_1$ (TL1), and its application in compressed sensing (CS). The TL1 penalty interpolates $\ell_0$ and $\ell_1$ norms through a nonnegative parameter $a \in (0, +\infty)$, similar to $\ell_p$ with $p \in (0, 1]$, and is known to satisfy unbiasedness, sparsity and Lipschitz continuity properties. We first consider the constrained minimization problem, and discuss the exact recovery of $\ell_0$ norm minimal solution based on the null space proper…
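
For reference, the TL1 penalty described in the abstract (written as $\rho_a$ in the last excerpt below) takes the standard form used in the TL1 literature; the coordinate-wise sum $P_a$ is notation assumed here for illustration:

\[
  \rho_a(t) = \frac{(a+1)\,|t|}{a + |t|}, \qquad
  P_a(x) = \sum_{i} \rho_a(x_i), \quad a > 0,
\]
\[
  \lim_{a \to 0^{+}} \rho_a(t) = \mathbf{1}_{\{t \neq 0\}}, \qquad
  \lim_{a \to +\infty} \rho_a(t) = |t|,
\]

so that $P_a$ tends to the $\ell_0$ count as $a \to 0^{+}$ and to the $\ell_1$ norm as $a \to +\infty$, which is the interpolation the abstract refers to.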

Cited by 106 publications (78 citation statements)
References 47 publications
“…The inequality (1.6) implies that $L_1$ may not perform well for highly coherent matrices, i.e., $\mu(A) \sim 1$, as $\|x\|_0$ is then at most one, which seldom occurs simultaneously with $Ax^* = b$. Other than the popular $L_1$ norm, there are a variety of regularization functionals to promote sparsity, such as $L_p$ [9,43,23], $L_1$-$L_2$ [44,26], capped $L_1$ (CL1) [48,37], and transformed $L_1$ (TL1) [29,46,47]. Most of these models are nonconvex, leading to difficulties in proving exact recovery guarantees and algorithmic convergence, but they tend to give better empirical results compared to the convex $L_1$ approach.…”
mentioning
confidence: 99%
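
A minimal sketch of the mutual coherence $\mu(A)$ referenced in the excerpt above, in Python/NumPy; the function name and the random test matrix are illustrative assumptions, not taken from the cited works:

import numpy as np

def mutual_coherence(A):
    # Largest absolute inner product between distinct, unit-normalized columns of A.
    cols = A / np.linalg.norm(A, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)   # exclude each column's product with itself
    return gram.max()

# Illustrative usage on a random Gaussian sensing matrix (sizes are arbitrary):
A = np.random.randn(64, 256)
print(mutual_coherence(A))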
“…For example, it was reported in [44,26] that $L_p$ gives superior results for incoherent matrices (i.e., $\mu(A)$ is small), while $L_1$-$L_2$ is the best for the coherent scenario. In addition, TL1 is always the second best no matter whether the matrix is coherent or not [46,47].…”
mentioning
confidence: 99%
“…Obviously, when $a \to 0$, $\sum_i \min(|x_i|, a)/a \to \|x\|_0$. Transformed $\ell_1$, which is a smooth version of capped $\ell_1$, is discussed in the works [39-41]. Some other non-convex metrics with concise form are also considered as alternatives to improve $\ell_1$, including $\ell_p$ with $p \in (0, 1)$ [33-35], whose formula is…”
Section: Non-convex Regularization Function
mentioning
confidence: 99%
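
For context on the quantities in the excerpt above, the capped $\ell_1$ sum and the $\ell_p$ quasi-norm it alludes to are usually written as follows (standard forms from the sparsity literature, not quoted from the citing paper):

\[
  \mathrm{CL1}_a(x) = \sum_i \min(|x_i|, a), \qquad
  \|x\|_p^p = \sum_i |x_i|^p, \quad p \in (0, 1),
\]

and dividing the capped sum by $a$ recovers the limit quoted above: $\sum_i \min(|x_i|, a)/a \to \|x\|_0$ as $a \to 0$.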
“…Although $\ell_1$ enjoys several good properties, it is sensitive to outliers and may cause serious bias in estimation [26,27]. To overcome this defect, many non-convex surrogates are proposed and analyzed, including smoothly clipped absolute deviation (SCAD) [26], log penalty [28,29], capped $\ell_1$ [30,31], minimax concave penalty (MCP) [32], $\ell_p$ penalty with $p \in (0, 1)$ [33-35], the difference of $\ell_1$ and $\ell_2$ norms [36-38], and transformed $\ell_1$ [39-41]. More and more works have shown the good performance of non-convex regularizers in both theoretical analyses and practical applications.…”
mentioning
confidence: 99%
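
Among the surrogates listed in the excerpt above, the difference of $\ell_1$ and $\ell_2$ norms has a particularly compact sparsity characterization (a standard fact from the $\ell_1$-$\ell_2$ literature, stated here only for orientation):

\[
  \|x\|_1 - \|x\|_2 \ge 0,
\]

with equality exactly when $x$ has at most one nonzero entry, which is why minimizing the difference drives solutions toward sparsity.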
“…Since $\lim_{a \to 0^{+}} \rho_a(x_i) = \mathbf{1}_{\{x_i \neq 0\}}$ and $\lim_{a \to +\infty} \rho_a(x_i) = |x_i|$ for all $i$, the T$\ell_1$ penalty interpolates $\ell_1$ and $\ell_0$. For its sparsification in compressed sensing and other applications, see [14] and references therein. To sparsify weights in GSRNN training via $\ell_1$ and T$\ell_1$, we add them to the loss function of GSRNN with a multiplicative penalty parameter $\alpha > 0$, and call a stochastic gradient descent optimizer in TensorFlow.…”
Section: Sparsity Promoting Penalties
mentioning
confidence: 99%
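
A minimal sketch of adding a T$\ell_1$ weight penalty to a training loss, assuming a TensorFlow 2 setting; the helper name tl1_penalty, the parameter values, and the loss composition are illustrative assumptions, not the cited GSRNN implementation:

import tensorflow as tf

def tl1_penalty(w, a=1.0):
    # Transformed L1 on a weight tensor: sum of (a + 1)|w| / (a + |w|) over entries.
    abs_w = tf.abs(w)
    return tf.reduce_sum((a + 1.0) * abs_w / (a + abs_w))

# Illustrative composition of the training objective (alpha is the penalty weight):
#   total_loss = task_loss + alpha * sum(tl1_penalty(w) for w in model.trainable_variables)
# which is then minimized with a stochastic gradient descent optimizer, e.g.
#   tf.keras.optimizers.SGD(learning_rate=1e-3)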