Scattering kernel models of gas–solid interaction are crucial for modelling rarefied gas flows and microscale flows. However, most existing models depend on certain accommodation coefficients (ACs). We propose here to construct a data-based model using molecular dynamics (MD) simulation and machine learning. The gas–solid interaction is first modelled by 100 000 MD simulations of a single gas molecule reflecting from the wall surface, carried out in parallel on GPUs. The results show that the reflection velocity is correlated with the incidence velocity in the same direction, and also reveal possible correlations between different directions, which are neglected by traditional gas–solid interaction models. Inspired by the sophisticated Cercignani–Lampis–Lord (CLL) model, two improved scattering kernels are constructed to better reproduce the probability density of velocity determined from the MD simulations. The first adopts variable ACs that depend on the incidence velocity, and the second combines three CLL-like kernels. All parameters in the improved kernels are chosen automatically by a machine learning method. In numerical experiments with a molecular beam, the reconstructed scattering kernels are largely consistent with the MD results.
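For orientation, the classical CLL kernel that the improved kernels build on can be sampled as in the minimal sketch below. It is not the variable-AC or three-kernel model proposed here. It assumes constant ACs, velocities scaled by sqrt(2*k*T_w/m) with T_w the wall temperature, and the standard sampling procedure usually attributed to Lord; the function name `cll_reflect` and its argument layout are hypothetical.

```python
import math
import random


def cll_reflect(v_in, sigma_t, alpha_n, rng=random):
    """Sample one reflected velocity from the classical CLL kernel.

    v_in    : (v_t1, v_t2, v_n) incident velocity in units of sqrt(2*k*T_w/m);
              v_n is the wall-normal component (only its magnitude is used).
    sigma_t : tangential momentum AC, in [0, 1] (assumed constant).
    alpha_n : normal energy AC, in [0, 1] (assumed constant).
    Returns (v_t1, v_t2, v_n) after reflection, with v_n >= 0.
    """
    v_t1, v_t2, v_n = v_in

    # Tangential components: Gaussian with mean (1 - sigma_t) * v_t'
    # and variance sigma_t * (2 - sigma_t) / 2.
    std_t = math.sqrt(sigma_t * (2.0 - sigma_t) / 2.0)
    out_t1 = rng.gauss((1.0 - sigma_t) * v_t1, std_t)
    out_t2 = rng.gauss((1.0 - sigma_t) * v_t2, std_t)

    # Normal component: magnitude of a 2D vector formed by the retained
    # part sqrt(1 - alpha_n) * |v_n'| plus a random thermal part of
    # length r at a uniformly distributed angle theta.
    r = math.sqrt(-alpha_n * math.log(1.0 - rng.random()))
    theta = 2.0 * math.pi * rng.random()
    v_m = math.sqrt(1.0 - alpha_n) * abs(v_n)
    out_n = math.hypot(v_m + r * math.cos(theta), r * math.sin(theta))

    return out_t1, out_t2, out_n
```

With `sigma_t = alpha_n = 0` the sketch reduces to specular reflection, and with `sigma_t = alpha_n = 1` to fully diffuse reflection at the wall temperature. A variable-AC kernel of the kind described above would replace the constants `sigma_t` and `alpha_n` with functions of `v_in`, whose form is not given in this abstract.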