Mean-variance optimization (MVO) is known to be highly sensitive to estimation error in its inputs. Norm penalization of MVO programs has recently proven to be an effective regularization technique for mitigating the adverse effects of estimation error. In this paper, we augment the standard MVO program with a convex combination of parameterized $L_1$ and $L_2$ norm penalty functions. The resulting program is a parameterized penalized quadratic program (PPQP) whose primal and dual forms are shown to be constrained quadratic programs (QPs). We make use of recent advances in neural-network architectures for differentiable QPs and present a novel, data-driven stochastic optimization framework for optimizing parameterized regularization structures in the context of the final decision-based MVO problem. The framework is highly flexible and capable of jointly optimizing both prediction and regularization model parameters in a fully integrated manner. We provide several historical simulations using global futures data and highlight the benefits and flexibility of the stochastic optimization approach.
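For concreteness, the penalized program described above can be sketched (with assumed notation; the paper's exact parameterization may differ) as
\[
\min_{w}\;\tfrac{1}{2}\, w^\top \Sigma w \;-\; \mu^\top w \;+\; \lambda\bigl(\delta\,\|w\|_1 + (1-\delta)\,\|w\|_2^2\bigr)
\quad \text{s.t.}\;\; \mathbf{1}^\top w = 1,
\]
where $\mu$ and $\Sigma$ are the estimated mean and covariance, $\lambda \ge 0$ sets the overall penalty strength, and $\delta \in [0,1]$ controls the convex combination of the $L_1$ and $L_2$ terms. The sketch below illustrates how such a program can be embedded as a differentiable layer so that penalty parameters receive gradients from a downstream decision loss. It uses the cvxpylayers library rather than the paper's own QP layer, and the separate weights (`lam1`, `lam2`), the epigraph reformulation, and the placeholder loss are illustrative assumptions, not the paper's formulation.

```python
# A minimal, illustrative sketch (not the paper's implementation): an L1/L2
# penalized MVO program embedded as a differentiable layer via cvxpylayers.
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n = 10                                      # number of assets (illustrative)

w = cp.Variable(n)                          # portfolio weights
t1 = cp.Variable()                          # epigraph variable for the L1 term
t2 = cp.Variable()                          # epigraph variable for the L2 term
mu = cp.Parameter(n)                        # predicted mean returns
sqrt_Sigma = cp.Parameter((n, n))           # covariance square root (keeps the problem DPP-compliant)
lam1 = cp.Parameter(nonneg=True)            # L1 penalty weight (assumed name)
lam2 = cp.Parameter(nonneg=True)            # L2 penalty weight (assumed name)

# Penalties enter through epigraph variables so that parameter * expression
# products stay affine, as required by cvxpylayers' DPP ruleset.
objective = cp.Minimize(
    0.5 * cp.sum_squares(sqrt_Sigma @ w) - mu @ w + lam1 * t1 + lam2 * t2
)
constraints = [cp.sum(w) == 1, cp.norm1(w) <= t1, cp.sum_squares(w) <= t2]
layer = CvxpyLayer(cp.Problem(objective, constraints),
                   parameters=[mu, sqrt_Sigma, lam1, lam2], variables=[w])

# Regularization weights are ordinary torch tensors; gradients of a downstream
# decision loss flow back through the QP solve by implicit differentiation.
lam1_t = torch.tensor(0.1, requires_grad=True)
lam2_t = torch.tensor(0.1, requires_grad=True)
mu_t, sqrt_Sigma_t = torch.randn(n), torch.eye(n)

(w_star,) = layer(mu_t, sqrt_Sigma_t, lam1_t, lam2_t)
loss = -mu_t @ w_star                       # placeholder decision loss
loss.backward()                             # yields d(loss)/d(lam1), d(loss)/d(lam2)
```

In a fully integrated pipeline of the kind described above, `mu_t` would be produced by a prediction model whose parameters are trained jointly with `lam1_t` and `lam2_t` against the realized decision loss.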