2015
DOI: 10.1007/978-3-319-11680-8_46

A Comparative Study of Classification-Based Machine Learning Methods for Novel Disease Gene Prediction

Abstract: Prediction of novel genes associated with a disease is an important issue in biomedical research. In the early days, annotation-based methods were proposed for this problem. In the next stage, with high-throughput technologies, data on interactions between genes/proteins grew quickly and came to cover almost the whole genome and proteome, and network-based methods for the problem therefore became prominent. Besides those two approaches, the prediction problem can also be approached using machine learning techniques becaus…

Citations: cited by 21 publications (8 citation statements)
References: 47 publications
“…Supervised machine learning methods have been widely used to predict disease genes. A comparison of classification based methods can be found in Le et al [17].…”
Section: Methods (mentioning)
confidence: 99%
“…In which, the problem is considered as a classification one, where a classifier is learned from training data; then the learned classifier is used to predict whether or not a test/candidate gene is a disease gene. Briefly, at the early, machine learning-based studies usually approached disease gene prediction as a binary classification problem [9], where the learning samples are comprised of positive training samples and negative training samples [9] such as decision trees (DT) [10,11], k-nearest neighbor (kNN) [12], naive Bayesian classifier [13,14], binary support vector machine classifier [15-17], artificial neural network (ANN) techniques [18] and random forest (RF) [9]. In these binary classifier-based methods, positive training samples are constructed from known disease genes, whereas negative training samples are the remaining which are not known to be associated with diseases.…”
Section: Introduction (mentioning)
confidence: 99%
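
The binary-classification setup described in the statement above can be illustrated with a minimal sketch. It is not the method of the cited paper or of the citing paper: the gene names, feature values, and the helper functions `build_training_set` and `knn_predict` are hypothetical, and a plain k-nearest-neighbor vote stands in for whichever classifier a study might actually use. It only shows how positive samples come from known disease genes and negative samples from the remaining genes.

```python
from math import dist

# Hypothetical toy data: gene -> feature vector (all values invented for illustration).
features = {
    "G1": (0.9, 0.8),
    "G2": (0.8, 0.9),
    "G3": (0.1, 0.2),
    "G4": (0.2, 0.1),
    "G5": (0.15, 0.15),
}
# Hypothetical set of genes already known to be associated with the disease.
known_disease_genes = {"G1", "G2"}


def build_training_set(features, known_disease_genes):
    """Label known disease genes as 1 (positive) and all remaining genes as 0 (negative)."""
    return [
        (vec, 1 if gene in known_disease_genes else 0)
        for gene, vec in features.items()
    ]


def knn_predict(training_set, candidate, k=3):
    """Return 1 if a strict majority of the k nearest training samples are positive, else 0."""
    nearest = sorted(training_set, key=lambda sample: dist(sample[0], candidate))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


training_set = build_training_set(features, known_disease_genes)

# A candidate close to the known disease genes, and one close to the unlabeled rest.
prediction_near_positives = knn_predict(training_set, (0.85, 0.85))
prediction_near_negatives = knn_predict(training_set, (0.1, 0.1))
```

Note that the negative class here is simply "not known to be associated", so it may contain undiscovered disease genes; the sketch inherits that limitation of the binary formulation.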
“…Additionally, some studies showed integration of the disease phenotypic similarities enhancing their performances [96,125] since the primary premise implies that the genes associated with a disease also underlie the pathogenesis of those phenotypically similar. Different frameworks and methods and their relative performances for integrative data analysis have been covered in previous reviews [35-41]. Despite the significant contribution of a disease similarity framework to uncover the underpinning molecular mechanisms of comorbidity, its current formulation needs the integration of other types of data to explain how the interaction between different cellular domains underpins comorbidity.…”
Section: Discussion (mentioning)
confidence: 99%
“…Several reviews have extensively described and evaluated the performance of varying methods for generating disease similarity metrics in the context of data integration strategies and modeling [35-41]. Here, we focused on five categories of methods based on network, statistics, machine learning, information retrieval and overlap (Fig 2-3, and Table 1).…”
Section: Methodology Used To Generate Disease Similarity Metrics (mentioning)
confidence: 99%