Sparse learning is central to high-dimensional data analysis, and various methods have been developed for it. Ideally, a sparse learning method should be methodologically flexible, computationally efficient, and equipped with theoretical guarantees, yet most existing methods must compromise some of these properties to attain the others. In this article, a three-step sparse learning method is developed, consisting of kernel-based estimation of the regression function and its gradient functions, followed by a hard-thresholding step. Its key advantages are that it imposes no explicit model assumption, admits general predictor effects, allows for efficient computation, and attains desirable asymptotic sparsistency. The proposed method can be adapted to any reproducing kernel Hilbert space (RKHS) with different kernel functions, and its computational cost is only linear in the data dimension. Its asymptotic sparsistency is established for general RKHSs under mild conditions. Numerical experiments on both simulated and real examples also show that the proposed method compares favorably with its competitors.
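To make the three-step structure concrete, the following is a minimal sketch of one possible instantiation, assuming a Gaussian kernel and kernel ridge regression as the function estimator. The names `sparse_gradient_learning` and `gaussian_kernel`, the bandwidth `sigma`, the ridge parameter `lam`, and the threshold `v_n` (including its default) are illustrative assumptions, not the paper's actual estimator or tuning rule.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma):
    # K[i, k] = exp(-||x_i - z_k||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def sparse_gradient_learning(X, y, sigma=1.0, lam=1e-2, v_n=None):
    n, p = X.shape
    K = gaussian_kernel(X, X, sigma)

    # Step 1: kernel ridge estimate of the regression function,
    # f_hat(x) = sum_i alpha_i K(x_i, x) with alpha = (K + n*lam*I)^{-1} y.
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)

    # Step 2: plug-in gradient estimates at the sample points; for the
    # Gaussian kernel, d/dx_j K(x_i, x) = K(x_i, x) * (x_ij - x_j) / sigma^2,
    # so forming all p coordinate gradients adds only O(p) work on top of
    # the kernel matrix.
    diff = X[:, None, :] - X[None, :, :]            # diff[i, k, j] = x_ij - x_kj
    G = np.einsum('i,ik,ikj->kj', alpha, K, diff) / sigma ** 2
    grad_norms = np.sqrt((G ** 2).mean(axis=0))     # empirical norm per predictor

    # Step 3: hard thresholding; this default for v_n is a purely
    # illustrative choice, not the paper's tuning rule.
    if v_n is None:
        v_n = 0.2 * grad_norms.max()
    return np.flatnonzero(grad_norms > v_n), grad_norms

# Toy check: only the first two of ten predictors drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=200)
selected, norms = sparse_gradient_learning(X, y, sigma=1.5, lam=1e-3)
print(selected)  # expected to recover {0, 1}
```

The sketch reflects the intuition behind gradient-based screening: a predictor that does not enter the regression function has an identically zero gradient function, so its empirical gradient norm should fall below the threshold, while informative predictors survive the hard-thresholding step.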