2017
DOI: 10.1039/c7mb00234c

An integrative machine learning strategy for improved prediction of essential genes in Escherichia coli metabolism using flux-coupled features

Abstract: Prediction of essential genes helps to identify a minimal set of genes that are absolutely required for the appropriate functioning and survival of a cell. The available machine learning techniques for essential gene prediction have inherent problems, like imbalanced provision of training datasets, biased choice of the best model for a given balanced dataset, choice of a complex machine learning algorithm, and data-based automated selection of biologically relevant features for classification. Here, we propose…
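The imbalance problem named in the abstract is commonly handled by random undersampling of the larger class. The sketch below only illustrates that general idea and is not taken from the paper; the `balanced_subsets` helper and all gene names are hypothetical.

```python
import random


def balanced_subsets(minority, majority, n_subsets, seed=0):
    """Pair the full minority class with equally sized random draws
    (without replacement) from the majority class."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    k = len(minority)
    subsets = []
    for _ in range(n_subsets):
        sampled = rng.sample(majority, k)
        subsets.append((list(minority), sampled))
    return subsets


# Hypothetical labels: a few essential genes, many non-essential ones.
essential = ["geneA", "geneB", "geneC"]
non_essential = [f"gene{i}" for i in range(20)]

subsets = balanced_subsets(essential, non_essential, n_subsets=5)
for ess, non_ess in subsets:
    print(len(ess), len(non_ess))
```

Each subset keeps every minority-class example, so several classifiers trained on different subsets see the whole minority class but different slices of the majority class.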

Cited by 37 publications (33 citation statements)
References 62 publications
“…Traditional features were shown to give low precision for the majority of gene interactions, whereas FBA-based features brought significant improvements in predictive precision and recall, indicating that genome-scale CBM captures relevant information that is missed by gene-level traditional features. The approach was tested again in the context of gene essentiality prediction by Nandi and colleagues [81], who instead employed flux coupling analysis (FCA) as feature generator to take gene adaptability into account in varying environmental conditions [82].…”
Section: Supervised Multiomic Analysis | mentioning
confidence: 99%
“…In 2017, Nandi et al [123] extended the hybrid methodology utilising SVM-based implementation for binary classification of E. coli genes based on gene sequencing and expression, network topology and flux-based features. By also accounting for environmental factors, the model was able to capture the minimal set of genes that are essential in any given environment.…”
Section: Hybrid Machine Learning and Constrained-based Modelling Appr… | mentioning
confidence: 99%
“…Machine learning models can be trained to predict and classify genes of an organism as essential or non-essential based on a training set of known, labelled essential or non-essential genes. A number of different machine learning methods has been used to try and determine gene essentiality from metabolic network information, including: SVM, ensemble-based learning, probabilistic Bayesian methods, logistic regression and decision tree-based methods (reviewed in [75]). Plaimas et al [76] have presented a machine learning strategy for the determination of essential enzymes in a metabolic network aiming towards the determination of interesting drug targets.…”
Section: Machine Learning In Metabolism Modeling | mentioning
confidence: 99%
“…An example of the ensemble learning method is random forests, which combine two machine learning techniques: bagging and random feature subset selection for predictions. Nandi et al [75] present a very interesting approach for the selection of essential features using the SVM-RFE machine learning approach. In their example, the authors used a genome-scale metabolic network of Escherichia coli to create reaction-gene combinations and label the essentiality of each combination based on experimental data.…”
Section: Machine Learning In Metabolism Modeling | mentioning
confidence: 99%
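
The SVM-RFE approach mentioned in the last statement can be sketched in a few lines. SVM-RFE repeatedly fits a linear SVM and discards the feature with the smallest squared weight. The code below is a minimal illustration under stated assumptions, not the cited authors' implementation: the linear SVM is a simple bias-free model fitted by subgradient descent on the regularised hinge loss, and the function names, hyperparameters and toy data are all hypothetical.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit weights w (no bias term) by subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            for j in range(n_features):
                # The hinge term contributes only when the margin is violated.
                hinge = yi * xi[j] if margin < 1 else 0.0
                w[j] -= lr * (lam * w[j] - hinge)
    return w


def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: refit on the surviving features
    and drop the one with the smallest squared weight each round."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Hypothetical toy data: feature 0 tracks the label, features 1 and 2 do not.
X = [
    [1.0, 0.5, 0.1],
    [1.2, -0.3, -0.1],
    [0.9, 0.2, 0.1],
    [1.1, -0.4, -0.1],
    [-1.0, 0.4, 0.1],
    [-1.1, -0.2, -0.1],
    [-0.9, 0.3, 0.1],
    [-1.2, -0.5, -0.1],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

selected, eliminated = svm_rfe(X, y, n_keep=1)
print("kept feature indices:", selected)
print("elimination order:", eliminated)
```

In practice a library implementation would normally replace the hand-written trainer; the point here is only the elimination loop, in which the ranking criterion is the squared weight of each feature in the refitted linear model.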