With the growing availability of data and computational technologies, machine learning (ML) has emerged as a preferred methodology for data analysis and prediction. While ML holds great promise, the results from such models are not fully reliable because of the challenges introduced by uncertainty. An ML model generates an optimal solution based on its training data. However, if the uncertainty in the data and in the model parameters is not considered, such optimal solutions have a high risk of failure in real-world deployment. This paper surveys the different approaches used in ML to quantify uncertainty. It also demonstrates the implications of quantifying uncertainty in ML through two case studies focused on space physics. The first case study classifies auroral images into predefined labels. In the second case study, the horizontal component of the perturbed magnetic field measured at the Earth’s surface is predicted from time series data for the study of Geomagnetically Induced Currents (GICs). In both cases, a Bayesian Neural Network (BNN) was trained to generate predictions along with epistemic and aleatoric uncertainties. Finally, the pros and cons of Gaussian Process Regression (GPR) models and Bayesian Deep Learning (DL) are weighed. The paper also provides recommendations for models that need further exploration, focusing on space weather prediction.
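As a minimal sketch of how epistemic and aleatoric uncertainty are commonly separated (not necessarily the procedure used in the paper), assume the BNN is sampled several times for one input and each stochastic forward pass returns a predictive mean and a predictive variance. By the law of total variance, the average of the predicted variances estimates the aleatoric part and the variance of the predicted means estimates the epistemic part. The function name and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance into aleatoric and epistemic parts.

    means[i], variances[i]: predictive mean and variance returned by the
    i-th stochastic forward pass (hypothetical BNN samples for one input).
    """
    if not means or len(means) != len(variances):
        raise ValueError("need the same non-zero number of means and variances")
    aleatoric = fmean(variances)  # average data noise predicted by the model
    epistemic = pvariance(means)  # spread of the means across weight samples
    return {
        "mean": fmean(means),
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


# Illustrative values only, not results from the paper.
sample_means = [1.0, 2.0, 3.0]
sample_variances = [0.5, 0.5, 0.5]
result = decompose_uncertainty(sample_means, sample_variances)
print(result)
```

In this toy input the three passes disagree on the mean, so the epistemic term is non-zero even though every pass reports the same noise level.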
Farming in Bangladesh is mostly done manually, and automated farming has not yet been introduced. This research takes a first step toward automating the farming process in the country. It presents an automated farming system, implemented as an Android application, that selects the best crop before cultivation begins according to the area of the land to be cultivated. Here, the best crop means the crop that will be the most cost effective for that particular land. Six major crops of Bangladesh are considered: Aus, Aman, Boro, Potato, Wheat and Jute. The system can also prepare a schedule for the whole cultivation process, e.g. the correct times for fertilization and irrigation for each crop type. The system takes into account the climate and geographical conditions of different areas of Bangladesh. It predicts the most cost-effective crop using a prediction-based algorithm. The algorithm used is multiple linear regression on several independent variables, i.e. rainfall, average maximum temperature and average minimum temperature at a given location, and it predicts the yield rate per unit area. The k-nearest neighbors regression (KNNR) algorithm was then used to compare the accuracy and error rate of the predicted yield rate. The system works as follows. First, a farmer who wants a crop suggestion enters the perimeter of the land in the input field and selects the district from a dropdown menu. The name of the best crop is then shown on the screen. If the farmer chooses the suggested crop, all the steps of cultivation are displayed. Notifications for irrigation and fertilization are then shown at the appropriate times or in calendar form. The crop zones are divided according to divisions and districts. Crop data for the regions Bogra, Comilla, Dinajpur, Sylhet, Dhaka, Barisal, Faridpur, Khulna, Rajshahi and Rangpur will be stored in the database.
The dataset contains information on six major crops of Bangladesh: yield rate, maximum temperature, minimum temperature, year range, region and rainfall. Twelve years of data (2000-2011) were used to train the algorithm and improve prediction accuracy, and three years (2012-2014) were used for testing and computing accuracy.
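The approach described above can be illustrated with a minimal sketch: a multiple linear regression on rainfall and average maximum and minimum temperature, compared against a k-nearest-neighbour regressor, with 2000-2011 as training years and 2012-2014 as test years. The data here are synthetic placeholders rather than the study's dataset, and the function names, the choice of k = 3, the feature standardisation and the use of RMSE as the error measure are all assumptions made for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the fitted intercept and slopes to new feature rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def predict_knn(X_train, y_train, X_test, k=3):
    """k-nearest-neighbour regression: mean target of the k closest rows."""
    # Standardise so rainfall (mm) does not dominate temperature (deg C).
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    Z_train, Z_test = (X_train - mu) / sigma, (X_test - mu) / sigma
    dists = np.linalg.norm(Z_test[:, None, :] - Z_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic placeholder data (NOT the study's data): one row per year.
rng = np.random.default_rng(0)
years = np.arange(2000, 2015)
rainfall = rng.uniform(1500, 2500, size=years.size)
t_max = rng.uniform(29, 34, size=years.size)
t_min = rng.uniform(19, 23, size=years.size)
X = np.column_stack([rainfall, t_max, t_min])
yield_rate = (1.0 + 0.0005 * rainfall - 0.02 * t_max + 0.03 * t_min
              + rng.normal(0, 0.02, size=years.size))

# Train on 2000-2011 and test on 2012-2014, mirroring the split in the text.
train, test = years <= 2011, years >= 2012
coef = fit_linear(X[train], yield_rate[train])
pred_mlr = predict_linear(coef, X[test])
pred_knn = predict_knn(X[train], yield_rate[train], X[test], k=3)
rmse_mlr = rmse(yield_rate[test], pred_mlr)
rmse_knn = rmse(yield_rate[test], pred_knn)
print(f"MLR RMSE: {rmse_mlr:.3f}  KNN RMSE: {rmse_knn:.3f}")
```

In a crop-selection setting, one such model would be fitted per crop and region, and the predicted yield rates would then feed whatever cost-effectiveness comparison the application uses.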
Geomagnetically Induced Currents (GICs) are one of the most hazardous effects of geomagnetic storms. In the past literature, the variation of the ground magnetic field over time, dB/dt, has been used as a proxy for GIC. Machine Learning (ML) techniques have emerged as a preferred methodology for predicting dB/dt. However, space weather data are highly dynamic, and the data distribution is subject to change over time due to environmental variability. ML models are sensitive to the uncertainty in the input data and therefore suffer from high variance. In addition, the performance of an ML architecture depends in part on the variables used to model the system in focus. Therefore, a single algorithm may not deliver the required accuracy for a given dataset. In this work, a Bayesian Ensemble ML model has been developed to predict the variation over time of the local ground horizontal magnetic component, dBH/dt. The Ensemble methodology combines multiple ML models in the prediction of dBH/dt. Bayesian statistics allow the model parameters and output to be estimated as probability distributions, where the variance quantifies the uncertainty. The input data consist of solar-wind data from OmniWeb for the years 2001–2010. The local ground horizontal magnetic components for the corresponding times were calculated using SuperMAG data from the Ottawa ground magnetometer station for the same years. The years 2011–2015 were selected for model testing, as they encompass the 5 August 2011 and 17 March 2015 geomagnetic storms. Five accuracy metrics were considered, namely Root Mean Squared Error (RMSE), Probability of Detection (POD), Probability of False Detection (PFD), Proportion Correct (PC), and Heidke Skill Score (HSS). The parameter uncertainty of the models is quantified, and the mean predicted dBH/dt is generated with a 95% credible interval.
The results show that different models perform better on different datasets, and that the ensemble model achieves accuracy comparable to that of the relatively strong individual models.
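The four categorical metrics named above are conventionally computed from a 2x2 contingency table of threshold exceedances. The following sketch uses the standard forecast-verification definitions. The threshold value, the sample series and the function name are hypothetical and are not taken from the paper.

```python
def contingency_scores(observed, predicted, threshold):
    """POD, PFD, PC and HSS for events defined as value >= threshold."""
    hits = misses = false_alarms = correct_negs = 0
    for obs, pred in zip(observed, predicted):
        obs_event, pred_event = obs >= threshold, pred >= threshold
        if obs_event and pred_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif pred_event:
            false_alarms += 1
        else:
            correct_negs += 1

    def ratio(num, den):
        # Return NaN rather than raising when a score is undefined.
        return num / den if den else float("nan")

    total = hits + misses + false_alarms + correct_negs
    hss_den = ((hits + misses) * (misses + correct_negs)
               + (hits + false_alarms) * (false_alarms + correct_negs))
    return {
        "POD": ratio(hits, hits + misses),
        "PFD": ratio(false_alarms, false_alarms + correct_negs),
        "PC": ratio(hits + correct_negs, total),
        "HSS": ratio(2 * (hits * correct_negs - false_alarms * misses), hss_den),
    }


# Illustrative dBH/dt magnitudes and threshold (hypothetical values).
observed = [5, 20, 30, 2, 18, 1, 25, 3]
predicted = [4, 22, 10, 3, 19, 16, 27, 2]
scores = contingency_scores(observed, predicted, threshold=15)
print(scores)  # {'POD': 0.75, 'PFD': 0.25, 'PC': 0.75, 'HSS': 0.5}
```

With these toy series there are 3 hits, 1 miss, 1 false alarm and 3 correct negatives, which gives the scores shown in the comment.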