In this work, we conduct an analytical review of contemporary international approaches to forecasting the volume of electricity generated by renewable energy sources and investigate current problems and prospective solutions in this field. Existing forecasting methods were classified following an analysis of published literature on the development of forecasting models, including those based on physical, statistical and machine learning principles. The practical application of these methods was investigated to determine the advantages and disadvantages of each. In the majority of cases, particularly in short-term forecasting of renewable electricity generation, machine learning methods outperform physical and statistical methods. An analysis of current problems in weather data collection systems identified the major obstacles to the wide application of machine learning algorithms: incompleteness and uncertainty of input data, as well as the high computational complexity of such algorithms. The efficiency of machine learning models in forecasting renewable energy generation can be increased using data preprocessing methods such as normalization, anomaly detection, missing value recovery, augmentation, clustering and correlation analysis. The need to develop data preprocessing methods aimed at optimizing and improving the overall efficiency of machine learning models for forecasting renewable energy generation is substantiated. Research in this direction, taking the above problems into account, is highly relevant for the implementation of programs for integrating renewable energy sources into power systems and for the development of carbon-free energy.
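As an illustration of two of the preprocessing steps named above (missing value recovery and normalization), a minimal sketch is given here. It is not taken from the reviewed literature: the function names `fill_missing` and `min_max_normalize` are hypothetical, linear interpolation and min-max scaling are only one possible choice for each step, and the sample readings are invented.

```python
from typing import List, Optional


def fill_missing(values: List[Optional[float]]) -> List[float]:
    """Recover missing readings (None) by linear interpolation.

    Gaps at the start or end of the series take the nearest known value.
    This is an assumed, simple recovery strategy, not a prescribed one.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series contains no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to the range [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Invented hourly readings with gaps, for illustration only.
raw = [2.0, None, 4.0, None, None, 10.0]
clean = fill_missing(raw)
scaled = min_max_normalize(clean)
print(clean)
print(scaled)
```

In practice the recovery step would normally run before normalization, as here, so that the scaling range is computed on a complete series.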
This study aims to improve the accuracy of forecasting the electricity consumption of an enterprise through analysis and preliminary processing of input data, and to evaluate the effect of feature selection on the results of various forecast models. A woodworking enterprise located in Nizhniy Novgorod was selected as the forecast object. Two types of machine learning methods, neural network and ensemble models, were compared. An approach to selecting the most significant parameters (features) from a time series was considered in order to improve the results of the following ensemble models based on decision trees: adaptive boosting (AdaBoost), Gradient Boosting and Random Forest. The most significant features of the initial time series were determined by calculating correlation coefficients between the values of electricity consumption in the forecasted hour and in previous hours. For the considered forecast object, the most significant features were found to be the energy consumed in hours that lag behind the forecasted hour by a whole number of days. The schedule of repair works for woodworking machines was used as an additional feature. According to the obtained results, decision tree ensembles can surpass artificial neural networks provided that significant features are selected correctly. Thus, the smallest average error of a neural network model on a test sample was 7.0%, while an error of 5.5% was obtained for a Gradient Boosting ensemble model. The use of a repair schedule was demonstrated to further increase the forecast accuracy: for the considered ensemble models, the error decreased by 20 to 30%.
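The lag-based feature selection described above can be sketched roughly as follows. This is an assumed illustration, not the authors' implementation: the helper names (`synthetic_load`, `pearson`, `lag_correlations`, `top_lags`) are hypothetical, and the hourly load is synthetic (a fixed daytime shift plus Gaussian noise), not the enterprise's data. The sketch computes the Pearson correlation between the series and its copy shifted by each candidate lag, then keeps the lags with the largest absolute correlation.

```python
import math
import random
from typing import Dict, List


def synthetic_load(days: int, seed: int = 0) -> List[float]:
    """Invented hourly load: higher during an 08:00-17:59 shift, plus noise."""
    rng = random.Random(seed)
    series = []
    for hour in range(days * 24):
        base = 100.0 if 8 <= hour % 24 < 18 else 40.0
        series.append(base + rng.gauss(0.0, 5.0))
    return series


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def lag_correlations(series: List[float], max_lag: int) -> Dict[int, float]:
    """Correlation between consumption at hour t and at hour t - lag."""
    return {
        lag: pearson(series[lag:], series[:-lag])
        for lag in range(1, max_lag + 1)
    }


def top_lags(series: List[float], max_lag: int, k: int) -> List[int]:
    """Return the k lags with the largest absolute correlation."""
    corr = lag_correlations(series, max_lag)
    return sorted(corr, key=lambda lag: abs(corr[lag]), reverse=True)[:k]


series = synthetic_load(days=60)
selected = sorted(top_lags(series, max_lag=72, k=3))
print(selected)  # lags (in hours) chosen as candidate features
```

On this synthetic series the selected lags are multiples of 24 hours, which mirrors the finding reported for the enterprise. The selected lagged values would then serve as input features for a tree ensemble or neural network model.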