We study the problem of selecting a subset of k random variables to observe that will yield the best linear prediction of another variable of interest, given the pairwise correlations between the observation variables and the predictor variable. Under approximation-preserving reductions, this problem is also equivalent to the "sparse approximation" problem of approximating signals concisely.

We propose and analyze exact and approximation algorithms for several special cases of practical interest. We give an FPTAS when the covariance matrix has constant bandwidth, and exact algorithms when the associated covariance graph, consisting of edges for pairs of variables with non-zero correlation, forms a tree or has a large (known) independent set. Furthermore, we give an exact algorithm when the variables can be embedded into a line such that the covariance decreases exponentially in the distance, and a constant-factor approximation when the variables have no "conditional suppressor variables".

Much of our reasoning is based on perturbation results for the R² multiple correlation measure, frequently used as a goodness-of-fit statistic. It lies at the core of our FPTAS, and also allows us to extend exact algorithms to approximation algorithms when the covariance matrix "nearly" falls into one of the above classes. We also use perturbation analysis to prove approximation guarantees for the widely used "Forward Regression" heuristic when the observation variables are nearly independent.