Towards Optimization of Boosting Models for Formation Lithology Identification

Xie, Yunxin; Zhu, Chenyang; Lu, Yue; Zhu, Zhengwei

doi:10.1155/2019/5309852

Cited by 15 publications

(3 citation statements)

References 26 publications

“…Built on their work, Dev and Eden (2018) applied AdaBoost and LogitBoost with random tree-based learners, achieving higher performance metrics. Xie et al (2019) applied regularization on GTB and xgboosting and stacked the classifiers to improve the classification accuracy. Tewari and Dwivedi (2020) also showed that the heterogeneous ensemble methods, namely voting and stacking, could improve the prediction accuracy for mudstone lithofacies in a Kansas oil-field area.…”

Section: Related Work (mentioning)

confidence: 99%

A Coarse-to-Fine Approach for Intelligent Logging Lithology Identification with Extremely Randomized Trees

Xie, Zhu, et al. 2020

Math Geosci

Self Cite

Lithology identification is vital for reservoir exploration and petroleum engineering. Recently, there has been growing interest in intelligent logging approaches to lithology classification, and machine learning has emerged as a powerful tool for inferring lithology types from logging curves. However, well logs are sensitive to manual entry of logging parameters, borehole conditions, and tool calibration. Most machine learning studies of lithology classification have focused only on improving the prediction accuracy of classifiers. Also, a model trained in one location is not reusable in a new location because the data distributions differ. In this paper, a unified coarse-to-fine framework that combines outlier detection with multi-class classification by an extremely randomized tree-based classifier is proposed to address these issues for data sets containing outliers. An unsupervised learning approach is used to detect the outliers in the data set. Then a coarse-to-fine inference procedure is used to infer the lithology class with an extremely randomized tree classifier. Two real-world well-logging data sets are used to demonstrate the effectiveness of the proposed framework. Comparisons are conducted with baseline machine learning classifiers, namely random forest, gradient tree boosting, and XGBoost. Results show that the proposed framework achieves higher prediction accuracy for sandstones than the other approaches.

“…Xie et al (2018) compared naive Bayes, support vector machines, artificial neural networks, RF, and gradient boosting decision tree (GBDT) algorithms when identifying lithology, and found that GBDT and RF had better identification effects than other algorithms. Xie et al (2019) evaluated three boosting models, AdaBoost, Gradient Tree boosting and eXtreme Gradient boosting, using the 5-fold cross-verification method, and combined the optimized three models together using the stacking method to improve the classification accuracy. The results show that the optimized stacked boosting model is superior to the single optimized boosting model.…”

Section: Introduction (mentioning)

confidence: 99%

A Real‐time Lithological Identification Method based on SMOTE‐Tomek and ICSA Optimization

Deng, Pan, et al. 2024

Acta Geologica Sinica (Eng)

In petroleum engineering, real‐time lithology identification is very important for reservoir evaluation, drilling decisions and petroleum geological exploration. This paper studies a method for lithology identification while drilling based on machine learning and mud logging data. The method uses downhole parameters collected in real time during drilling to identify lithology in real time and provide a reference for optimizing drilling parameters. Given the imbalance of lithology samples, the synthetic minority over‐sampling technique (SMOTE) and Tomek links were used to balance the sample counts of five lithologies. The paper also introduces a Tent map, random opposition‐based learning and dynamic perceived probability into the original crow search algorithm (CSA) to establish an improved crow search algorithm (ICSA). ICSA is used to optimize the hyperparameter combinations of random forest (RF), extremely randomized trees (ET), extreme gradient boosting (XGB), and light gradient boosting machine (LGBM) models. In addition, the study combines the recognition advantages of the four models: the weighted average probability model reaches a lithology identification accuracy of 0.877. The study thus provides a high‐precision real‐time lithology identification method that can serve as a lithology reference during drilling.
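
A weighted average probability model of the kind the abstract mentions can be sketched as below. The abstract does not say how the weights are chosen, so the weights, the class probabilities, and the function name `weighted_average_proba` are all hypothetical values picked for illustration.

```python
from typing import Dict, List


def weighted_average_proba(
    probas: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Combine per-model class probabilities for one sample by a weighted mean."""
    total_weight = sum(weights[name] for name in probas)
    n_classes = len(next(iter(probas.values())))
    combined = [0.0] * n_classes
    for name, row in probas.items():
        for k, p in enumerate(row):
            combined[k] += weights[name] * p
    return [c / total_weight for c in combined]


# Hypothetical class probabilities from four models for a single sample.
probas = {
    "RF": [0.6, 0.3, 0.1],
    "ET": [0.5, 0.4, 0.1],
    "XGB": [0.2, 0.7, 0.1],
    "LGBM": [0.3, 0.6, 0.1],
}
# Hypothetical weights, e.g. each model's validation accuracy (assumption).
weights = {"RF": 0.85, "ET": 0.84, "XGB": 0.87, "LGBM": 0.86}

combined = weighted_average_proba(probas, weights)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
print(combined, predicted_class)
```

Dividing by the sum of the weights keeps the combined vector a valid probability distribution whenever each model's row sums to one.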

“…Well-logging has been utilized as an effective remote sensing measurement to predict underground formation lithology from a surface geophysical survey. Well-logging data contains rich geological information, which is a synthesized reflection of formation lithology and physical properties [4].…”

Section: Introduction (mentioning)

confidence: 99%

Well-Logging-Based Lithology Classification Using Machine Learning Methods for High-Quality Reservoir Identification: A Case Study of Baikouquan Formation in Mahu Area of Junggar Basin, NW China

Zhang et al. 2022

Energies

The identification of underground formation lithology is fundamental to reservoir characterization during petroleum exploration. With the increasing availability and diversity of well-logging data, automated interpretation is in great demand to support more efficient and reliable decision making by geologists and geophysicists. This study benchmarked the performance of an array of machine learning models, from linear and nonlinear individual classifiers to ensemble methods, on the task of lithology identification. Cross-validation and Bayesian optimization were used to tune the hyperparameters of the models, and performance was evaluated using accuracy, the area under the receiver operating characteristic curve (AUC), precision, recall, and F1-score. The dataset consists of well-logging data acquired from the Baikouquan formation in the Mahu Sag of the Junggar Basin, China, comprising 4156 labeled data points with 9 well-logging variables. Results show that ensemble methods (XGBoost and RF) outperform the other two categories of machine learning methods by a material margin. Among the ensemble methods, XGBoost performs best, achieving an overall accuracy of 0.882 and AUC of 0.947 in classifying mudstone, sandstone, and sandy conglomerate. Of the three lithology classes, sandy conglomerate, which corresponds to the potential reservoirs in the study area, is distinguished best, with accuracy of 97%, precision of 0.888, and recall of 0.969. This suggests XGBoost is a strong candidate model for more efficient and accurate lithology identification and reservoir quantification.
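
The per-class precision, recall, and F1-score named in the abstract follow directly from counts of true positives, false positives, and false negatives. A minimal sketch, using made-up labels rather than the study's data (the function name `per_class_metrics` and the six toy samples are assumptions):

```python
from collections import Counter
from typing import Dict, List


def per_class_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, Dict[str, float]]:
    """Return precision, recall and F1 for every class label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f1_den = precision + recall
        f1 = 2 * precision * recall / f1_den if f1_den else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy labels for illustration only (not the study's data).
y_true = ["mudstone", "sandstone", "conglomerate", "conglomerate", "sandstone", "mudstone"]
y_pred = ["mudstone", "sandstone", "conglomerate", "sandstone", "sandstone", "conglomerate"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_metrics(y_true, y_pred)
print(accuracy, report)
```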
