Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there is a small number of non-zero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small non-zero parameters. We consider a generalization of these two basic models, termed here a "sparse+dense" model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for traditional dense estimation methods, such as ridge estimation. We propose a new penalization-based method, called lava, which is computationally efficient. With suitable choices of penalty parameters, the proposed method strictly dominates both lasso and ridge. We derive analytic expressions for the finite-sample risk function of the lava estimator in the Gaussian sequence model. We also provide a deviation bound for the prediction risk in the Gaussian regression model with fixed design. In both cases, we provide Stein's unbiased estimator for lava's prediction risk. A simulation example compares the performance of lava to lasso, ridge, and elastic net in a regression example using data-dependent penalty parameters and illustrates lava's improved performance relative to these benchmarks.

1. Introduction. Many recently proposed high-dimensional modeling techniques build upon the fundamental assumption of sparsity. Under sparsity, we can approximate a high-dimensional signal or parameter by a sparse vector that has a relatively small number of non-zero components. Various ℓ1-based penalization methods, such as the lasso and soft-thresholding, have been proposed for signal recovery, prediction, and parameter estimation within a sparse signal framework; see, e.g., [29], among others.
By virtue of being based on ℓ1-penalized optimization problems, these methods produce sparse solutions in which many estimated model parameters are set exactly to zero.

Another commonly used shrinkage method is ridge estimation. Ridge estimation differs from the aforementioned ℓ1-penalized approaches in that it does not produce a sparse solution but instead provides a solution in which all model parameters are estimated to be non-zero. Ridge estimation is thus suitable when the model's parameters or unknown signals contain many very small components, i.e., when the model is dense. See, e.g., [25].
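To make the contrast between the ℓ1 and ℓ2 penalties concrete, the following sketch computes the lava estimate in the Gaussian sequence model, where the sparse part can be profiled out in closed form. This is a minimal illustration, not the paper's implementation: the function names are our own, and the penalty scaling follows the convention stated in the comments, which may differ from the convention used in the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: the lasso solution in one dimension."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lava_sequence(z, lam1, lam2):
    """Lava estimate in the Gaussian sequence model, coordinate by coordinate.

    For each observation z_i, minimizes over (b, d) the criterion
        (z_i - b - d)^2 + lam2 * b^2 + lam1 * |d|,
    where b is the dense (ridge-penalized) part and d the sparse
    (lasso-penalized) part of the signal.

    Profiling out b at b = (z - d) / (1 + lam2) reduces the problem to
        (lam2 / (1 + lam2)) * (z - d)^2 + lam1 * |d|,
    a one-dimensional lasso problem solved by soft-thresholding.
    """
    k = lam2 / (1.0 + lam2)
    d = soft_threshold(z, lam1 / (2.0 * k))   # sparse component
    b = (z - d) / (1.0 + lam2)                # dense component
    return b + d, b, d
```

Note the two limiting cases, which mirror the discussion above: as lam1 grows, the sparse part vanishes and lava reduces to ridge shrinkage, z / (1 + lam2); as lam2 grows, the dense part vanishes and lava reduces to soft-thresholding, i.e., the lasso.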