The rapidly evolving landscape of multicore architectures makes the construction of efficient libraries a daunting task. A family of methods known collectively as "auto-tuning" has emerged to address this challenge. Two major approaches to auto-tuning are empirical and model-based: empirical auto-tuning is a generic but slow approach that works by measuring the runtimes of candidate implementations, while model-based auto-tuning predicts those runtimes using simplified abstractions designed by hand. We show that machine learning methods for non-linear regression can be used to estimate timing models from data, capturing the best of both approaches: a statistically derived model offers the speed of a model-based approach with the generality and simplicity of empirical auto-tuning. We validate our approach on the filterbank correlation kernel described in Pinto and Cox [2012], where we find that 0.1 seconds of hill climbing on the regression model ("predictive auto-tuning") achieves almost the same speed-up as minutes of empirical auto-tuning. Our approach is not specific to filterbank correlation, nor even to GPU kernel auto-tuning; it can be applied to almost any templated-code optimization problem, spanning a wide variety of problem types, kernel types, and platforms.
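The workflow the abstract describes can be sketched in a few lines: encode each candidate configuration as a feature vector, fit a non-linear regressor to measured runtimes, then hill-climb on the model's predictions rather than on fresh benchmarks. The sketch below is a minimal, self-contained illustration of that idea, not the paper's implementation; the parameter grid, the synthetic `measure_runtime` stand-in, and the choice of random-forest regression are assumptions made here for illustration.

```python
# Minimal sketch of predictive auto-tuning, assuming scikit-learn is
# available and that kernel configurations are integer-valued template
# parameters (e.g., block sizes, unroll factors). All names here are
# hypothetical; the paper's feature encoding and regressor may differ.
import random

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical template-parameter space for a GPU kernel.
PARAM_GRID = {
    "block_w": [4, 8, 16, 32],
    "block_h": [4, 8, 16, 32],
    "unroll":  [1, 2, 4, 8],
}
KEYS = list(PARAM_GRID)


def encode(cfg):
    """Map a configuration dict to a feature vector for the regressor."""
    return [cfg[k] for k in KEYS]


def measure_runtime(cfg):
    """Stand-in for compiling and timing one candidate kernel.

    A real tuner would benchmark generated code here; a synthetic
    function keeps the sketch self-contained and runnable.
    """
    return (cfg["block_w"] - 16) ** 2 + (cfg["block_h"] - 8) ** 2 + 4.0 / cfg["unroll"]


# 1) Collect training data: time a random sample of configurations.
rng = random.Random(0)
sample = [{k: rng.choice(v) for k, v in PARAM_GRID.items()} for _ in range(200)]
X = np.array([encode(c) for c in sample])
y = np.array([measure_runtime(c) for c in sample])

# 2) Fit a non-linear regression model of runtime as a function of config.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)


def predict(cfg):
    return model.predict(np.array([encode(cfg)]))[0]


# 3) Hill-climb on the *predicted* runtime -- no further benchmarking.
current = {k: rng.choice(v) for k, v in PARAM_GRID.items()}
improved = True
while improved:
    improved = False
    for k in KEYS:
        for v in PARAM_GRID[k]:
            candidate = dict(current, **{k: v})
            if predict(candidate) < predict(current):
                current, improved = candidate, True

print("predicted-best configuration:", current)
print("measured runtime of that configuration:", measure_runtime(current))
```

Because step 3 queries only the fitted model, it costs fractions of a second regardless of how expensive each real benchmark is, which is the source of the speed advantage the abstract reports over purely empirical search.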