This paper presents the application of several statistical methods and machine learning techniques to model the relationship between rice yield and climate variables in a major region of Sri Lanka that contributes significantly to the country’s paddy harvest. Rainfall, temperature (minimum and maximum), evaporation, average wind speed (morning and evening), and sunshine hours are the climatic factors considered for modeling. Rice harvest and yield data over the last three decades, together with monthly climatic data, were used to develop the prediction models by applying artificial neural networks (ANNs), support vector machine regression (SVMR), multiple linear regression (MLR), Gaussian process regression (GPR), power regression (PR), and robust regression (RR). The performance of each model was assessed in terms of the mean squared error (MSE), correlation coefficient (R), mean absolute percentage error (MAPE), root mean squared error ratio (RSR), BIAS value, and the Nash number, and the GPR-based model was found to be the most accurate. Climate data collected until early 2019 (Maha season of 2018) were used to develop the model, and an independent validation was performed using data from the Yala season of 2019. The developed model can be used to forecast the future rice yield with very high accuracy.
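The abstract names the evaluation measures without defining them. The following is a minimal sketch of how they are commonly computed, not the authors' implementation. It assumes that the "Nash number" is the Nash-Sutcliffe efficiency, that RSR is the RMSE divided by the standard deviation of the observations, and that BIAS is the mean of predicted minus observed; the function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(observed, predicted):
    """Return common goodness-of-fit measures for paired series (assumed definitions)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)                      # sum of squared errors
    sso = sum((o - mean_o) ** 2 for o in observed)        # spread of observations
    ssp = sum((p - mean_p) ** 2 for p in predicted)       # spread of predictions
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "MSE": sse / n,
        "R": cov / math.sqrt(sso * ssp),                  # Pearson correlation
        "MAPE": 100.0 / n * sum(abs(e / o) for e, o in zip(errors, observed)),
        "RSR": math.sqrt(sse) / math.sqrt(sso),           # RMSE / std of observations
        "BIAS": sum(errors) / n,                          # assumed: mean(pred - obs)
        "NSE": 1.0 - sse / sso,                           # Nash-Sutcliffe efficiency
    }

# Illustrative numbers only; these are not data from the paper.
observed = [4.1, 4.4, 3.9, 4.8, 4.6]
predicted = [4.0, 4.5, 4.0, 4.7, 4.6]
scores = evaluate(observed, predicted)
```

Under these definitions the Nash-Sutcliffe efficiency equals 1 minus the square of RSR, so the two measures rank models identically; reporting both mainly aids comparison with other studies.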
This paper presents the development of crop-weather models for the paddy yield in Sri Lanka based on nine weather indices, namely, rainfall, relative humidity (minimum and maximum), temperature (minimum and maximum), wind speed (morning and evening), evaporation, and sunshine hours. The statistics of seven geographical regions, which together contribute about two-thirds of the country’s total paddy production, were used for this study. The significance of the weather indices for the paddy yield was explored by employing Random Forest (RF), and the variable importance of each index was determined. Pearson’s and Spearman’s correlation coefficients were used to identify whether each index is positively or negatively correlated with the yield. Further, the pairwise correlation among the weather indices was examined. The results indicate that the minimum relative humidity and the maximum temperature during the paddy cultivation period are the most influential weather indices. Moreover, RF was used to develop a paddy yield prediction model, and four more techniques, namely, Power Regression (PR), Multiple Linear Regression (MLR) with stepwise selection, forward (step-up) selection, and backward (step-down) elimination, were used to benchmark the performance of the machine learning technique. Their performances were compared in terms of the Root Mean Squared Error (RMSE), Correlation Coefficient (R), Mean Absolute Error (MAE), and the Mean Absolute Percentage Error (MAPE). As per the results, RF is a reliable and accurate model for the prediction of paddy yield in Sri Lanka, demonstrating a very high R of 0.99 and the lowest MAPE of 1.4%.
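To illustrate how the two correlation coefficients differ, here is a minimal sketch in plain Python. It assumes the standard definitions (Spearman's coefficient as Pearson's coefficient applied to average ranks); the helper names and the sample values are hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(ranks(x), ranks(y))

# Illustrative numbers only: yield falls as maximum temperature rises.
max_temp = [30.1, 31.0, 31.8, 32.5, 33.9]
yield_t_ha = [4.9, 4.7, 4.6, 4.1, 3.2]
r_pearson = pearson(max_temp, yield_t_ha)
r_spearman = spearman(max_temp, yield_t_ha)
```

In this example the relationship is monotonic but not linear, so Spearman's coefficient reaches -1 while Pearson's stays above -1. The sign of either coefficient gives the direction of the association.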
This paper presents the development of models for the prediction of power generation at the Samanalawewa hydropower plant, which is one of the major power stations in Sri Lanka. Four regression-based machine learning and statistical techniques were applied to develop the prediction models. Rainfall data at six locations in the catchment area of the Samanalawewa reservoir from 1993 to 2019 were used as the main input variables. The minimum and maximum temperature and evaporation at the reservoir site were also incorporated. The collinearities between the variables were investigated in terms of Pearson’s and Spearman’s correlation coefficients. It was found that rainfall at one location has less impact on power generation, while the rainfall values at the other locations are highly correlated with each other. Prediction models based on monthly and quarterly data were developed, and their performance was evaluated in terms of the correlation coefficient (R), mean absolute percentage error (MAPE), ratio of the root mean square error (RMSE) to the standard deviation of measured data (RSR), BIAS, and the Nash number. Of the Gaussian process regression (GPR), support vector regression (SVR), multiple linear regression (MLR), and power regression (PR) models, the two based on machine learning techniques (GPR and SVR) were comparably accurate. The GPR model was the most accurate, producing a correlation coefficient close to 1 with very low error. This model could be used to predict the hydropower generation at the Samanalawewa power station from the rainfall forecast.
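The abstract does not spell out the form of the power regression (PR) model. A common reading, assumed here, is the multiplicative form y = a·x^b, which becomes linear after taking logarithms and can then be fitted by ordinary least squares. The sketch below uses a single predictor for brevity, whereas the study uses several inputs; `fit_power` and the synthetic data are hypothetical.

```python
import math

def fit_power(x, y):
    """Fit y = a * x**b by least squares on log(y) = log(a) + b*log(x).

    Assumes all x and y are strictly positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(x)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the log-log line is the exponent b.
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
    # Intercept of the log-log line is log(a).
    a = math.exp(my - b * mx)
    return a, b

# Synthetic data generated from y = 2 * x**0.5; not data from the paper.
rain = [50.0, 100.0, 200.0, 400.0]
energy = [2.0 * r ** 0.5 for r in rain]
a, b = fit_power(rain, energy)
```

Because the fit minimizes error in log space, it weights relative rather than absolute deviations, and it cannot handle zero values such as months without rainfall unless those are treated separately.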
This paper presents the development of wind power prediction models for a wind farm in Sri Lanka using artificial neural network (ANN), multiple linear regression (MLR), and power regression (PR) techniques. Power generation data over five years since 2015 were used as the dependent variable in modeling, while the corresponding wind speed and ambient temperature values were used as independent variables. Variation of these three variables over time was analyzed to identify monthly, seasonal, and annual patterns. The monthly patterns are coherent with the seasonal monsoon winds and exhibit little annual variation, in the absence of extreme meteorological changes during the period of 2015–2020. The correlation within each pair of variables was also examined by applying statistical techniques, and the results are presented in terms of Pearson’s and Spearman’s correlation coefficients. The impact of a unit increase (or decrease) in the wind speed and ambient temperature around their mean values on the output power was also quantified. Finally, the accuracy of each model was evaluated by means of the correlation coefficient, root mean squared error (RMSE), bias, and the Nash number. All the models demonstrated acceptable accuracy, with correlation coefficient and Nash number close to 1, very low RMSE, and bias close to 0. Although the ANN-based model is the most accurate due to advanced features in machine learning, it does not express the generated power output in terms of the independent variables. In contrast, the regression-based statistical models of MLR and PR are advantageous, providing an insight into modeling the power generated by the other wind farms in the same region, which are influenced by similar climate conditions.
Highlights
• This paper focuses on developing wind energy prediction models.
• Machine learning and statistical techniques are applied in the study.
• High accuracy of the proposed models is shown in terms of statistical measures.