Anton Voskresenskiy scite author profile

In this paper, a multipurpose Bayesian-based method for data analysis, causal inference and prediction in oil and gas reservoir development is considered. It allows analysing reservoir parameters, discovering dependencies among parameters (including cause-and-effect relations), checking for anomalies, predicting expected values of missing parameters, finding the closest analogues, and much more. The method is based on the extended MixLearn@BN algorithm for structural learning of Bayesian networks. The key ideas of MixLearn@BN are the following: (1) learning the network structure on homogeneous data subsets, (2) having an expert assign part of the structure, and (3) learning the distribution parameters on mixed data (discrete and continuous). Homogeneous data subsets are identified as groups of reservoirs with similar features (analogues), where the similarity measure may be based on several types of distances. The aim of the described technique of Bayesian network learning is to improve the quality of predictions and causal inference on such networks. Experimental studies show that the suggested method gives a significant advantage in the accuracy of missing-value prediction and anomaly detection. Moreover, the method was applied to a database of more than a thousand petroleum reservoirs across the globe and yielded novel insights into the relationships among geological parameters.
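The analogue search mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which distances are used, so this example assumes a Gower-style distance (range-normalised difference for continuous features, simple mismatch for categorical ones). The reservoir records, feature names, and the functions `feature_ranges`, `gower_distance`, and `closest_analogues` are all hypothetical and are not taken from MixLearn@BN.

```python
# Illustrative only: Gower-style distance on mixed features (an assumption,
# not the paper's stated similarity measure). All data below are made up.
NUMERIC = ("porosity", "depth_m")
CATEGORICAL = ("lithology",)

RESERVOIRS = [
    {"name": "A", "porosity": 0.20, "depth_m": 2000.0, "lithology": "sandstone"},
    {"name": "B", "porosity": 0.22, "depth_m": 2100.0, "lithology": "sandstone"},
    {"name": "C", "porosity": 0.10, "depth_m": 3500.0, "lithology": "carbonate"},
    {"name": "D", "porosity": 0.18, "depth_m": 2600.0, "lithology": "carbonate"},
]
TARGET = {"name": "T", "porosity": 0.215, "depth_m": 2080.0, "lithology": "sandstone"}


def feature_ranges(records):
    """Return max - min of every numeric feature over the given records."""
    return {
        key: max(r[key] for r in records) - min(r[key] for r in records)
        for key in NUMERIC
    }


def gower_distance(a, b, ranges):
    """Average per-feature dissimilarity, each term scaled to [0, 1]."""
    total = 0.0
    for key in NUMERIC:
        span = ranges[key]
        total += abs(a[key] - b[key]) / span if span > 0 else 0.0
    for key in CATEGORICAL:
        total += 0.0 if a[key] == b[key] else 1.0
    return total / (len(NUMERIC) + len(CATEGORICAL))


def closest_analogues(target, candidates, k):
    """Return the k candidates with the smallest distance to the target."""
    ranges = feature_ranges(candidates + [target])
    ranked = sorted(candidates, key=lambda r: gower_distance(target, r, ranges))
    return ranked[:k]


if __name__ == "__main__":
    for reservoir in closest_analogues(TARGET, RESERVOIRS, 2):
        print(reservoir["name"])
```

In this sketch the selected analogues would form the homogeneous subset on which a network structure could then be learned; that learning step is not shown here.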

Rock Type Classification Models Interpretability Using Shapley Values

Voskresenskiy

Bukhanov

Kuntsevich

et al. 2021

We propose a methodology to improve rock type classification using machine learning (ML) techniques and to reveal causal relations between reservoir quality and well log measurements. Rock type classification is an essential step in accurate reservoir modeling and forecasting. Machine learning approaches make it possible to automate rock type classification based on different well logs and core data. To choose the best model, one that does not propagate uncertainty further into the workflow, it is important to interpret machine learning results. Feature importance and feature selection methods are usually employed for that. We propose an extension to existing approaches: a model-agnostic sensitivity algorithm based on Shapley values. The paper describes a full workflow for rock type prediction using well log data: from data preparation, model building, and feature selection to causal inference analysis. We built ML models that classify rock types using well logs (sonic, gamma, density, photoelectric and resistivity) from 21 wells as predictors, and conducted a causal inference analysis between reservoir quality and well log responses using Shapley values (a concept from game theory). As a result of feature selection, we obtained predictors that are statistically significant and at the same time relevant in the context of causal relations. The macro F1-scores of the best models obtained for the two cases are 0.79 and 0.85, respectively. It was found that the ML models can infer domain knowledge, which allows us to confirm the adequacy of the built ML model for rock type prediction. Our insight was to recognize the need to properly account for the underlying causal structure between the features and rock types in order to derive meaningful and relevant predictors that carry a significant amount of information contributing to the final outcome.
We also demonstrate the robustness of the revealed patterns by applying the Shapley values methodology to a number of ML models and show consistency in the order of the most important predictors. Our analysis shows that machine learning classifiers that achieve high accuracy tend to mimic the physical principles behind different logging tools; in particular, the longer the travel time of an acoustic wave, the higher the probability that the medium is reservoir rock, and vice versa. Conversely, lower values of natural radioactivity and rock density indicate the presence of a reservoir. The article presents a causal inference analysis of ML classification models using Shapley values on 2 real-world reservoirs. The rock class labels from core data are used to train a supervised machine learning algorithm to predict classes from well log responses. The aim of supervised learning is to label a small portion of a dataset and allow the algorithm to automate the rest. Such data-driven analysis may optimize well logging, coring, and core analysis programs. This algorithm can be extended to any other reservoir to improve rock type prediction. The novelty of the paper is that such analysis reveals the nature of the decisions made by the ML model and makes it possible to apply robust and reliable petrophysics-consistent ML models for rock type classification.
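To make the Shapley value idea concrete, here is a minimal sketch that computes exact Shapley values by enumerating every feature coalition. It is an illustration under stated assumptions, not the workflow from the paper: `toy_score` is a hypothetical linear "reservoir score" with invented weights, features outside a coalition are replaced by a hypothetical baseline log reading, and exhaustive enumeration is only practical for a handful of features.

```python
# Illustrative only: exact Shapley values by coalition enumeration.
# The model, weights, and log readings are hypothetical.
from itertools import combinations
from math import factorial

FEATURES = ("sonic", "gamma", "density")
WEIGHTS = (0.02, -0.01, -1.0)   # made-up weights for the toy model
X = (90.0, 40.0, 2.3)           # sample to explain
BASELINE = (70.0, 80.0, 2.6)    # reference reading for "absent" features


def toy_score(values):
    """Hypothetical linear score: higher means more reservoir-like."""
    return sum(w * v for w, v in zip(WEIGHTS, values))


def shapley_values(model, x, baseline):
    """Return one exact Shapley value per feature of x."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their real value, the rest the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (value(members | {i}) - value(members))
        phis.append(phi)
    return phis


if __name__ == "__main__":
    for name, phi in zip(FEATURES, shapley_values(toy_score, X, BASELINE)):
        print(name, round(phi, 6))
```

For a linear model each Shapley value reduces to the weight times the deviation from the baseline, and the values sum to the difference between the model output at the sample and at the baseline, which gives a convenient sanity check for the enumeration.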

Oil reservoir recovery factor assessment using Bayesian networks based on advanced approaches to analogues clustering

Andriushchenko

Deeva

Bubnova

et al. 2022

Preprint

The work focuses on the modelling and imputation of oil and gas reservoir parameters, specifically the problem of predicting the oil recovery factor (RF) using Bayesian networks (BNs). Recovery forecasting is critical for the oil and gas industry as it directly affects a company's profit. However, current approaches to forecasting the RF are complex and computationally expensive. In addition, they require vast amounts of data and are difficult to constrain in the early stages of reservoir development. To address this problem, we propose a BN approach and describe ways to improve the accuracy of parameter predictions. Various training hyperparameters for BNs were considered, and the best ones were used. Approaches to structure and parameter learning, data discretization and normalization, subsampling on analogues of the target reservoir, clustering of networks, and data filtering were considered. Finally, a physical model of a synthetic oil reservoir was used to validate the BNs' predictions of the RF. All BN-based modelling approaches provide full coverage of the confidence interval for the RF predicted by the physical model while requiring less time and data for modelling, which demonstrates that they can be used in the early stages of reservoir development. The main result of the work is a methodology for studying reservoir parameters based on Bayesian networks built on small amounts of data and with minimal involvement of expert knowledge. The methodology was tested on the problem of recovery factor imputation.
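The discretization, parameter learning, and imputation steps named above can be sketched on the smallest possible network: a single edge from a drive mechanism to a discretized RF. This is a hedged illustration, not the authors' pipeline. The structure is fixed by hand, the records, bin edges, and smoothing constant are invented, and the names `discretize`, `fit_cpt`, and `impute` are hypothetical.

```python
# Illustrative only: parameter learning and imputation in a two-node
# discrete network (drive -> rf_bin). All records are made up.
from collections import Counter, defaultdict

RF_EDGES = (0.25, 0.40)   # assumed bin edges: low / medium / high RF
N_BINS = len(RF_EDGES) + 1

RECORDS = [
    {"drive": "water", "rf": 0.45},
    {"drive": "water", "rf": 0.50},
    {"drive": "water", "rf": 0.38},
    {"drive": "water", "rf": 0.42},
    {"drive": "solution_gas", "rf": 0.15},
    {"drive": "solution_gas", "rf": 0.22},
    {"drive": "solution_gas", "rf": 0.18},
    {"drive": "solution_gas", "rf": 0.35},
]


def discretize(value, edges):
    """Return the index of the bin that contains value."""
    return sum(value >= edge for edge in edges)


def fit_cpt(records, alpha=1.0):
    """Estimate P(rf_bin | drive) from counts with Laplace smoothing."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["drive"]][discretize(record["rf"], RF_EDGES)] += 1
    cpt = {}
    for drive, bin_counts in counts.items():
        total = sum(bin_counts.values()) + alpha * N_BINS
        cpt[drive] = [(bin_counts[b] + alpha) / total for b in range(N_BINS)]
    return cpt


def impute(cpt, drive):
    """Fill a missing RF bin with the most probable bin given the drive."""
    probs = cpt[drive]
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    table = fit_cpt(RECORDS)
    for drive in table:
        print(drive, impute(table, drive))
```

A real application would learn the structure over many parameters and would typically rely on a dedicated Bayesian network library; the point here is only how a conditional probability table learned from a few analogues yields an imputed value.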
