2021
DOI: 10.5815/ijitcs.2021.06.05

Performance of Machine Learning Algorithms with Different K Values in K-fold Cross-Validation

Abstract: The numerical value of k in a k-fold cross-validation training technique of machine learning predictive models is an essential element that impacts the model’s performance. A right choice of k results in better accuracy, while a poorly chosen value for k might affect the model’s performance. In literature, the most commonly used values of k are five (5) or ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor very high variance. However, t…
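The trade-off behind the common k = 5 and k = 10 choices mentioned in the abstract can be illustrated with a minimal sketch, assuming a scikit-learn setup; the dataset and classifier below are illustrative assumptions, not the paper's own experiment.

```python
# Minimal sketch (assumed setup, not the paper's code): estimate accuracy
# with the two most commonly used fold counts, k = 5 and k = 10.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)                      # illustrative dataset
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

for k in (5, 10):
    scores = cross_val_score(model, X, y, cv=k)                 # k-fold cross-validation
    print(f"k={k:>2}: mean accuracy = {scores.mean():.3f}, std = {scores.std():.3f}")
```

Larger k trains each fold on more of the data, which tends to lower the bias of the error estimate at the cost of more computation.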

Cited by 52 publications (34 citation statements)
References: 0 publications
“…We also used a 10 k‐fold cross‐validation method to further reduce instances of overfitting by testing the predictive accuracy of the training dataset (Leathwick et al, 2006). The 10 k‐fold cross‐validation works by splitting the training data into 10 equal parts and then leaving one part out each time the model is run to test the predictive performance, which can indicate selection bias or overfitting in the model (Nti et al, 2021). The BRT models were run using R statistical software (Team, 2014) and the “gbm” package (Ridgeway, 2007).…”
Section: Methods (mentioning)
confidence: 99%
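The hold-one-part-out mechanics described in the statement above can be sketched as follows; this assumes scikit-learn's KFold on toy data, not the cited BRT/"gbm" workflow in R.

```python
# Minimal sketch (assumed setup): 10-fold splitting holds one part out per round.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(50).reshape(-1, 1)                  # 50 illustrative samples
kf = KFold(n_splits=10, shuffle=True, random_state=0)

for fold_no, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    # Each round trains on 9 parts and evaluates on the held-out tenth.
    print(f"fold {fold_no}: train={len(train_idx)} samples, test={len(test_idx)} samples")
```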
“…When performing cross validation, it is typically standard that the chosen number of folds is equal to 10 (k = 10) [45].…”
Section: Proposed Methodology and Implementation (mentioning)
confidence: 99%
“…The hypothesis resulting from such an operation would be the final answer. When performing cross validation, it is typically standard that the chosen number of folds is equal to 10 (k = 10)…”
Section: Proposed Methodology and Implementation (mentioning)
confidence: 99%
“…Most uses of fold = 10 give good results, but in some cases fold = 5 is already sufficient [32]. In the study by Nti [33], a value of fold = 7 gave an improvement in validation accuracy. In addition, in the study by Tempola [34] the data were split into 3 folds for each classification method, and the highest average system accuracy was obtained.…”
Section: K-fold Cross Validation (unclassified)
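The comparison of fold counts reported in the statement above (k = 3, 5, 7, and 10) can be reproduced in outline with a hedged sketch; the dataset and classifier are assumptions for illustration only, not those used in the cited studies.

```python
# Minimal sketch (assumed setup): sweep over candidate fold counts and compare
# the resulting validation-accuracy estimates.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                 # illustrative dataset
model = DecisionTreeClassifier(random_state=0)    # illustrative classifier

for k in (3, 5, 7, 10):
    scores = cross_val_score(model, X, y, cv=k)   # stratified k-fold for classifiers
    print(f"k={k:>2}: mean validation accuracy = {scores.mean():.3f}")
```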