Fossil fuels, including oil, are the most important sources of energy and are widely used in commercial and industrial applications. Producing oil is a complex task that requires careful management and planning, and serious problems can arise if an oil well is not operated properly. Oil engineers therefore need accurate knowledge of a well's status to perform their duties. This study proposes a linear regression method to predict the oil production value from several independent variables, such as pressure, downhole temperature, and tubing pressure. According to the study's results, the predictions of the proposed method come very close to the actual production values.
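To make the idea of regressing production on several well measurements concrete, here is a minimal sketch of ordinary least squares in plain Python. It is not the study's implementation: the function names, the readings, and the coefficients used to generate the synthetic production values are all hypothetical and chosen only for illustration.

```python
def fit_linear_regression(X, y):
    """Return [intercept, b1, ..., bk] minimizing the squared error.

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination.
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept term
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(coeffs, row):
    """Apply the fitted model to one row of predictor values."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))


# Synthetic readings (NOT data from the study):
# columns are pressure, downhole temperature, tubing pressure.
well_readings = [
    [210.0, 95.0, 30.0],
    [220.0, 97.0, 35.0],
    [205.0, 99.0, 28.0],
    [230.0, 101.0, 40.0],
    [215.0, 96.0, 33.0],
    [225.0, 103.0, 31.0],
]
# Production generated from made-up coefficients so the fit can be checked.
production = [50.0 + 2.0 * p - 1.5 * t + 0.5 * tp for p, t, tp in well_readings]

coeffs = fit_linear_regression(well_readings, production)
estimate = predict(coeffs, [218.0, 98.0, 32.0])
```

Because the synthetic target is an exact linear function of the inputs, the fitted coefficients should recover the made-up values up to rounding error. Real well data would be noisy, and a library routine with a more stable solver would usually be preferable to hand-written normal equations.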
Sudden strokes have a severe negative impact on society, which has motivated efforts to improve the diagnosis and management of stroke. Technological advances have also affected the medical field: caregivers now have better options for looking after their patients because medical records can be mined and archived for easy retrieval. It is also essential to understand the risk factors that make a patient more susceptible to stroke, since knowing them makes stroke prediction much easier. This research analyzes the factors that improve stroke prediction based on electronic health records. The most important factors are identified using statistical methods and Principal Component Analysis (PCA). The most critical factors affecting stroke prediction were found to be age, average glucose level, heart disease, and hypertension. Because the original stroke dataset is highly imbalanced, a balanced dataset created by sub-sampling is used for model evaluation. Seven machine learning algorithms are trained on the Kaggle dataset to predict the occurrence of stroke in patients: Naïve Bayes (NB), SVM, Random Forest, KNN, Decision Tree, Stacking, and majority voting. After preprocessing and splitting the dataset into training and testing subsets, the algorithms were evaluated in terms of accuracy, F1 score, recall, and precision. The NB classifier achieved the lowest accuracy (86%), whereas the remaining algorithms achieved similar results: accuracy of 96%, F1 score of 0.98, precision of 0.97, and recall of 1.
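The sub-sampling step and the four evaluation metrics mentioned above can be sketched as follows. This is an illustrative outline only: the abstract does not specify how the sub-sampling was done, so random undersampling of the majority class is an assumption, and the function names, the `stroke` label key, and the toy records are hypothetical.

```python
import random


def undersample(records, label_key="stroke", seed=0):
    """Randomly drop records from the larger class until both classes match in size."""
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key] == 1]
    negatives = [r for r in records if r[label_key] == 0]
    k = min(len(positives), len(negatives))
    balanced = rng.sample(positives, k) + rng.sample(negatives, k)
    rng.shuffle(balanced)
    return balanced


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = stroke)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy imbalanced records (hypothetical, not the Kaggle data): 5 positives out of 50.
records = [{"age": 40 + i, "stroke": 1 if i % 10 == 0 else 0} for i in range(50)]
balanced = undersample(records)

# Toy labels and predictions to exercise the metric function.
metrics = classification_metrics(
    [1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 0, 1],
)
```

Undersampling discards majority-class records, so it trades data volume for class balance. Reporting precision, recall, and F1 alongside accuracy matters for stroke data because accuracy alone can look high on an imbalanced set even when positive cases are missed.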