2015
DOI: 10.1007/s12539-014-0257-2

Improved feature-based prediction of SNPs in human cytochrome P450 enzymes

Abstract: Single nucleotide polymorphisms (SNPs) are the most common form of mutation in the human cytochrome P450 enzyme family and can lead to different drug responses or specific diseases in individual patients. Here, using machine learning, we aim to identify an effective set of sequence-based features that improves the prediction of SNPs with support vector machine algorithms. The features are derived from the target residues and flanking protein sequences, such as amino acid ty…

Cited by 9 publications (7 citation statements)
References 30 publications
“…The following four metrics are commonly used in literature to measure the quality of binary classification (Xiong et al, 2012; Li et al, 2015): sensitivity, specificity, accuracy and Matthews' correlation coefficient (MCC), which are expressed as…”
Section: Methods | Citation type: mentioning
confidence: 99%
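
The quoted statement breaks off before giving its formulas, so the sketch below uses the standard textbook definitions of the four metrics rather than the citing paper's exact expressions. The function name `binary_metrics` and the example counts are hypothetical and serve only as illustration.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics for binary classification.

    tp, tn, fp, fn are counts of true positives, true negatives,
    false positives and false negatives. Undefined ratios (zero
    denominator) are reported as 0.0 by convention in this sketch.
    """
    total = tp + tn + fp + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the function.
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity.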
“…It is because imbalanced-class data exist in this study (e.g., 1208 (6%) for UPRA vs. 20,684 (94%) for non-UPRA). High accuracy rates with imbalanced SENS and SPEC are expected in imbalanced-class data using the traditional approaches [18, 19, 20, 21]. Thus, we applied the minimization of average model residuals in both classes (i) to obtain balanced SENS and SPEC and (ii) to overcome the disadvantage of high accuracy rates (i.e., the minimum residuals minimized by the formula of average(residuals in UPRA) + average(residuals in non-UPRA)).…”
Section: Methods | Citation type: mentioning
confidence: 99%
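
To make the quoted argument concrete, the sketch below applies the class counts from the statement (1208 UPRA vs. 20,684 non-UPRA) to a trivial model that labels every case as non-UPRA. The citing paper does not say how a residual is defined, so the absolute difference between label and prediction is an assumption here, and the function names are hypothetical.

```python
def balanced_residual(y_true, y_pred):
    """average(residuals in positives) + average(residuals in negatives).

    Assumption: a residual is |label - prediction|; the quoted text
    does not specify the exact form.
    """
    pos = [abs(y - p) for y, p in zip(y_true, y_pred) if y == 1]
    neg = [abs(y - p) for y, p in zip(y_true, y_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return sum(pos) / len(pos) + sum(neg) / len(neg)


def overall_residual(y_true, y_pred):
    """Plain mean residual over all cases, ignoring class membership."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Class counts taken from the quoted statement.
y_true = [1] * 1208 + [0] * 20684
# Trivial model: predict the majority class (non-UPRA) for everyone.
all_negative = [0] * len(y_true)

accuracy = sum(y == p for y, p in zip(y_true, all_negative)) / len(y_true)
plain = overall_residual(y_true, all_negative)
balanced = balanced_residual(y_true, all_negative)
```

Under these assumptions the trivial model scores about 94% accuracy and a small overall residual while detecting no positive case, whereas the class-averaged objective assigns it the full penalty for the minority class.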
“…For instance, Wang et al [16] developed a real-time model using the time series of vital signs and discrete features, such as laboratory tests. However, this model’s prediction accuracy was not sufficiently high (area under the receiver operating characteristic curve (AUC) = 0.70) [17] to deploy the model in the hospital information system with the proposed forecasting algorithms to support treatment because many false-positive cases appear in these imbalanced-class data [18, 19, 20, 21], increasing the clinicians’ burden.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…The balanced-class data were another important issue that should be considered. Otherwise, the imbalanced-class data [24, 25] lead to an extremely imbalanced ratio (= SENS/SPEC or SPEC/SENS) while the model pursues the ultimate accuracy rate of prediction (i.e., by minimizing the residuals). In this study.…”
Section: Methods | Citation type: mentioning
confidence: 99%