Machine learning and computational intelligence technologies are becoming increasingly popular as possible solutions for issues related to the power grid. One of these issues, the power flow calculation, is an iterative method to compute the voltage magnitudes of the power grid's buses from power values. Machine learning and, in particular, artificial neural networks have been used successfully as surrogates for the power flow calculation. Artificial neural networks rely heavily on the quality and size of the training data, but this aspect of the process is often neglected in the works we found. Since the availability of high-quality historical data for power grids is limited, we propose the Correlation Sampling algorithm. We show that this approach covers a larger area of the sampling space than several random sampling algorithms from the literature and a copula-based approach, while at the same time taking the inter-dependencies of the inputs into account, which, among the other algorithms, only the copula-based approach does.

The knowledge about the current state of the power grid is usually limited to information about the power generation or consumption of the grid's participants, obtained either through forecasts or through estimates based on default load profiles. However, stable grid operation requires a certain frequency level (50 Hz in Europe) and certain voltage levels. Since only the power values are known, the voltage information needs to be calculated, which is done with Power Flow (PF) analysis. The PF analysis is performed many times during the operation of power grids, and the results can be used, e.g., for market analysis or short-term operational planning.

Since the PF analysis often requires matrix inversion, a task with a high computational burden, there are many approaches to reduce the computation time. Active research on improving the more traditional methods can be found, e.g., in [1,2,3,4].
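To illustrate why the PF analysis is iterative and where the linear-algebra cost arises, the following minimal sketch applies a Newton-Raphson iteration to a hypothetical two-bus grid (one slack bus, one load bus). It is not the method of any of the cited works; the function name `two_bus_power_flow`, the per-unit line impedance, and the load values are assumptions chosen for illustration only. Each iteration solves a linear system with the Jacobian, which is the step that becomes expensive for large grids.

```python
import numpy as np


def two_bus_power_flow(z_line, p_load, q_load, v_slack=1.0, tol=1e-10, max_iter=20):
    """Newton-Raphson power flow for a slack bus feeding one load bus.

    All quantities are in per unit (illustrative assumption).
    Returns voltage magnitude, voltage angle (rad), and the number of updates.
    """
    y = 1.0 / z_line                   # series admittance of the line
    g22, b22 = y.real, y.imag          # diagonal entry Y22 of the admittance matrix
    g21, b21 = -y.real, -y.imag        # off-diagonal entry Y21
    p_spec, q_spec = -p_load, -q_load  # a load is a negative power injection

    theta, v = 0.0, 1.0                # flat start: 1.0 p.u., zero angle
    for n_updates in range(max_iter):
        c, s = np.cos(theta), np.sin(theta)
        # Power injected at the load bus for the current voltage estimate
        p_calc = v**2 * g22 + v * v_slack * (g21 * c + b21 * s)
        q_calc = -v**2 * b22 + v * v_slack * (g21 * s - b21 * c)
        mismatch = np.array([p_spec - p_calc, q_spec - q_calc])
        if np.max(np.abs(mismatch)) < tol:
            return v, theta, n_updates

        # Jacobian of (P, Q) with respect to (theta, v)
        jac = np.array([
            [v * v_slack * (-g21 * s + b21 * c),
             2 * v * g22 + v_slack * (g21 * c + b21 * s)],
            [v * v_slack * (g21 * c + b21 * s),
             -2 * v * b22 + v_slack * (g21 * s - b21 * c)],
        ])
        # Solve the linear system instead of forming an explicit inverse
        d_theta, d_v = np.linalg.solve(jac, mismatch)
        theta += d_theta
        v += d_v
    raise RuntimeError("power flow did not converge")


# Hypothetical example: line impedance 0.01 + 0.05j p.u., load 0.5 + 0.2j p.u.
v2, theta2, n_iter = two_bus_power_flow(0.01 + 0.05j, 0.5, 0.2)
print(f"|V2| = {v2:.4f} p.u., angle = {np.degrees(theta2):.3f} deg, updates = {n_iter}")
```

For a grid with many buses, the Jacobian grows with the number of buses and has to be rebuilt and solved in every iteration, which is the kind of computational burden a surrogate model is meant to avoid.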
On the other hand, the development and application of Machine Learning (ML) models for energy systems have also increased in the past two decades. Mosavi et al. [5] reviewed a broad range of such applications, but PF analysis is barely mentioned, possibly due to their selection of relevant papers. The authors of [6] gave an overview of works on the closely related Optimal Power Flow (OPF) problem. OPF, however, is only a specific use case for PF. Nevertheless, ML for PF analysis is a field of active research and, in particular, Artificial Neural Networks (ANNs) are often used with great performance, e.g., in [7], [8], and [9]. However, ANNs require a large amount of data, especially when the ANN becomes a Deep Neural Network (DNN). In our recent works [10] and [11], we built different ML models, including a DNN, as surrogate models for a Low Voltage (LV) power grid to avoid the costly PF analysis. One issue we identified concerns the availability and generation