We address a regression problem with a new form of data that arises in data privacy applications. Instead of point values, the observed explanatory variables are subsets, each containing an individual's original value. Classical regression methods such as least squares are not applicable because the set-valued predictors carry only partial information about the original values. We propose a computationally efficient subset least squares method to perform regression on such data. We establish upper bounds on the prediction loss and risk in terms of the subset structure, the model structure, and the data dimension. The error rates are shown to be optimal in some common situations. Furthermore, we develop a model selection method to identify the most appropriate model for prediction. Experimental results on both simulated and real-world datasets demonstrate the promising performance of the proposed method.