Sophisticated finite element (FE) models are usually critical for the research and simulation of vehicle dynamics, especially in train crash cases. However, factors such as mesh complexity and element distortion under large deformation undermine the computational efficiency of an FE model. The alternative, a multi-body (MB) model, is computationally efficient but offers only limited accuracy when the dynamic process involves highly nonlinear characteristics. To retain the advantages of both methods, this paper proposes a data-driven simulation framework for modelling the dynamic behaviours of railway vehicles. In this framework, machine learning techniques are applied to training data extracted from FE simulations, so that the nonlinear characteristics of a structure are formulated as a surrogate element that replaces the original mechanical element. The dynamics simulation is then carried out by co-simulation, with the surrogate element embedded in the MB model. The framework comprises a series of techniques: data collection and feature extraction, training data sampling, surrogate element building, and model evaluation and selection. To verify the feasibility of the framework, two case studies, one of vertical and one of longitudinal vehicle dynamics, are carried out using Simulink/Simpack co-simulation. A comparison of two data-driven models, Legendre polynomial regression (LPR) and Kriging, shows that building the surrogate elements with the LPR model substantially reduces the simulation time without much loss of accuracy.
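The surrogate element idea can be illustrated with a minimal sketch of a Legendre polynomial regression fitted to a force-displacement curve. This is not the paper's implementation: the function `fe_force_response`, its stiffness coefficients, the displacement range, and the polynomial degree are all hypothetical stand-ins for data that would in practice come from FE simulation, and the fit uses NumPy's `Legendre.fit` least-squares routine as one possible way to build such a model.

```python
import numpy as np
from numpy.polynomial import Legendre


def fe_force_response(displacement):
    """Hypothetical stand-in for FE output: a stiffening spring force in N.

    The coefficients are illustrative assumptions, not values from the paper.
    """
    return 2.0e5 * displacement + 5.0e7 * displacement**3


def build_lpr_surrogate(x_train, y_train, degree):
    """Least-squares fit of a Legendre series to the training samples.

    Legendre.fit rescales x_train onto [-1, 1] internally, which keeps the
    regression well conditioned.
    """
    return Legendre.fit(x_train, y_train, deg=degree)


# Training data sampled over an assumed displacement range of +/- 0.05 m.
x_train = np.linspace(-0.05, 0.05, 41)
y_train = fe_force_response(x_train)

# The fitted series is callable, so it can stand in for the force law
# of the original mechanical element.
surrogate = build_lpr_surrogate(x_train, y_train, degree=5)

# Evaluate the surrogate on points that were not used for training.
x_test = np.linspace(-0.05, 0.05, 200)
max_abs_error = np.max(np.abs(surrogate(x_test) - fe_force_response(x_test)))
print(f"maximum absolute error: {max_abs_error:.3e} N")
```

In a co-simulation setting, the callable `surrogate` would be evaluated at each time step in place of the original element's force law. A Kriging model could be substituted behind the same interface for the accuracy and run-time comparison.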