Anna Bubnova scite author profile

Bayesian networks are a powerful tool for modelling multivariate random variables. However, when applied in practice, for example, for industrial projects, problems arise because the existing learning and inference algorithms are not adapted to real data. This article discusses two learning and inference problems on mixed data in Bayesian networks—learning and inference at nodes of a Bayesian network that have non-Gaussian distributions and learning and inference for networks that require edges from continuous nodes to discrete ones. First, an approach based on the use of mixtures of Gaussian distributions is proposed to solve a problem when the joint normality assumption is not confirmed. Second, classification models are proposed to solve a problem with edges from continuous nodes to discrete nodes. Experiments have been run on both synthetic datasets and real-world data and have shown gains in modelling quality.

show abstract

MIxBN: library for learning Bayesian networks from mixed data

Bubnova

Deeva

Kalyuzhnaya

2021

Procedia Computer Science

View full text Add to dashboard Cite

MIxBN: library for learning Bayesian networks from mixed data

Bubnova¹,

Deeva²,

Kalyuzhnaya³

2021

Preprint

View full text Add to dashboard Cite

This paper describes a new library for learning Bayesian networks from data containing discrete and continuous variables (mixed data). In addition to the classical learning methods on discretized data, this library proposes its algorithm that allows structural learning and parameters learning from mixed data without discretization since data discretization leads to information loss. This algorithm based on mixed MI score function for structural learning, and also linear regression and Gaussian distribution approximation for parameters learning. The library also offers two algorithms for enumerating graph structures -the greedy Hill-Climbing algorithm and the evolutionary algorithm. Thus the key capabilities of the proposed library are as follows: (1) structural and parameters learning of a Bayesian network on discretized data, (2) structural and parameters learning of a Bayesian network on mixed data using the MI mixed score function and Gaussian approximation, (3) launching learning algorithms on one of two algorithms for enumerating graph structures -Hill-Climbing and the evolutionary algorithm. Since the need for mixed data representation comes from practical necessity, the advantages of our implementations are evaluated in the context of solving approximation and gap recovery problems on synthetic data and real datasets.

show abstract

Oil and Gas Reservoirs Parameters Analysis Using Mixed Learning of Bayesian Networks

Deeva¹,

Bubnova²,

Andriushchenko³

et al. 2021

Preprint

View full text Add to dashboard Cite

In this paper, a multipurpose Bayesian-based method for data analysis, causal inference and prediction in the sphere of oil and gas reservoir development is considered. This allows analysing parameters of a reservoir, discovery dependencies among parameters (including cause and effects relations), checking for anomalies, prediction of expected values of missing parameters, looking for the closest analogues, and much more. The method is based on extended algorithm MixLearn@BN for structural learning of Bayesian networks. Key ideas of MixLearn@BN are following: (1) learning the network structure on homogeneous data subsets, (2) assigning a part of the structure by an expert, and (3) learning the distribution parameters on mixed data (discrete and continuous). Homogeneous data subsets are identified as various groups of reservoirs with similar features (analogues), where similarity measure may be based on several types of distances. The aim of the described technique of Bayesian network learning is to improve the quality of predictions and causal inference on such networks. Experimental studies prove that the suggested method gives a significant advantage in missing values prediction and anomalies detection accuracy. Moreover, the method was applied to the database of more than a thousand petroleum reservoirs across the globe and allowed to discover novel insights in geological parameters relationships.

show abstract

Oil reservoir recovery factor assessment using Bayesian networks based on advanced approaches to analogues clustering

Andriushchenko¹,

Deeva²,

Bubnova³

et al. 2022

Preprint

View full text Add to dashboard Cite

The work focuses on the modelling and imputation of oil and gas reservoirs parameters, specifically, the problem of predicting the oil recovery factor (RF) using Bayesian networks (BNs). Recovery forecasting is critical for the oil and gas industry as it directly affects a company's profit. However, current approaches to forecasting the RF are complex and computationally expensive. In addition, they require vast amount of data and are difficult to constrain in the early stages of reservoir development. To address this problem, we propose a BN approach and describe ways to improve parameter predictions' accuracy. Various training hyperparameters for BNs were considered, and the best ones were used. The approaches of structure and parameter learning, data discretization and normalization, subsampling on analogues of the target reservoir, clustering of networks and data filtering were considered. Finally, a physical model of a synthetic oil reservoir was used to validate BNs' predictions of the RF. All approaches to modelling based on BNs provide full coverage of the confidence interval for the RF predicted by the physical model, but at the same time require less time and data for modelling, which demonstrates the possibility of using in the early stages of reservoirs development. The main result of the work can be considered the development of a methodology for studying the parameters of reservoirs based on Bayesian networks built on small amounts of data and with minimal involvement of expert knowledge. The methodology was tested on the example of the problem of the recovery factor imputation.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Anna Bubnova

Oil and Gas Reservoirs Parameters Analysis Using Mixed Learning of Bayesian Networks

Analysis of parameters of oil and gas fields using Bayesian networks

Automatic Determination of Sedimentary Units from Well Data

Advanced Approach for Distributions Parameters Learning in Bayesian Networks with Gaussian Mixture Models and Discriminative Models

MIxBN: library for learning Bayesian networks from mixed data

MIxBN: library for learning Bayesian networks from mixed data

Oil and Gas Reservoirs Parameters Analysis Using Mixed Learning of Bayesian Networks

Oil reservoir recovery factor assessment using Bayesian networks based on advanced approaches to analogues clustering

Contact Info

Product

Resources

About