2018
DOI: 10.1093/bib/bby077

Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods

Abstract: The roles of proteolytic cleavage have been intensively investigated and discussed during the past two decades. This irreversible chemical process has been frequently reported to influence a number of crucial biological processes (BPs), such as cell cycle, protein regulation and inflammation. A number of advanced studies have been published aiming at deciphering the mechanisms of proteolytic cleavage. Given its significance and the large number of functionally enriched substrates targeted by specific proteases…

Cited by 75 publications (40 citation statements)
References 115 publications (139 reference statements)
“…To evaluate the performance of the selected classifier models for cleavage site prediction of multiple proteases, we carried out a 5-fold cross validation (5CV) test on each of the selected proteases under investigation in this study. Moreover, the predictive performance on several 19 external datasets was explored. Firstly, we trained and compared the predictive performance of models with two learning methods, LR and SVC, using two sizes of local windows, P1-P1' and P4-P4'. For evaluation of predictive performance during the 5CV test, the following criteria were calculated: accuracy, ROC AUC, sensitivity and specificity.…”
Section: Results (mentioning)
confidence: 99%
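The two local window sizes named in this statement can be illustrated with a short sketch. It assumes the usual Schechter-Berger convention, in which the scissile bond lies between P1 and P1'. The function name `cleavage_window`, its parameters and the `-` padding character are hypothetical choices for illustration and are not taken from the citing paper.

```python
def cleavage_window(sequence: str, p1_index: int,
                    n_left: int = 4, n_right: int = 4,
                    pad: str = "-") -> str:
    """Return the residues P{n_left}..P1 followed by P1'..P{n_right}'.

    p1_index is the 0-based position of the P1 residue; the scissile bond
    is assumed to lie between p1_index and p1_index + 1. Positions that
    fall outside the sequence are filled with `pad` (an assumption, so
    that every window has the same length).
    """
    if not 0 <= p1_index < len(sequence) - 1:
        raise ValueError("P1 must be inside the sequence and followed by P1'")
    start = p1_index - n_left + 1    # index of the leftmost residue
    end = p1_index + n_right + 1     # one past the rightmost residue
    left_pad = pad * max(0, -start)
    right_pad = pad * max(0, end - len(sequence))
    core = sequence[max(0, start):min(len(sequence), end)]
    return left_pad + core + right_pad


# Example with an arbitrary made-up sequence and cleavage position.
example = "MKTAYIAKQRQISFVK"
p4_p4prime = cleavage_window(example, 7)        # eight residues, P4-P4'
p1_p1prime = cleavage_window(example, 7, 1, 1)  # two residues, P1-P1'
```

Windows built this way would then be encoded numerically before being passed to a classifier such as LR or SVC; the encoding step is outside this sketch.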
“…These approaches take as input data extracted from the databases described above. Several reviews have been published that compare the developed models and their predictive performance [2,19,33]. These approaches can be classified into four main groups depending on how the models were developed.…”
Section: Introduction (mentioning)
confidence: 99%
“…To evaluate the prediction performance of XG-m7G we used four metrics, that is, Sn, Sp, Acc, and MCC, which have previously been used to assess the performance of predictors in other studies [33, 34]. We also used ROC curves [35-38], which plot the true-positive rate against the false-positive rate, and AUC to further assess the model performance. Sn, Sp, Acc, and MCC are defined as follows: […] where $N^+$ represents the total number of m7G site-containing sequences, $N^-$ represents the total number of non-m7G sequences, $N^+_-$ represents the number of m7G site-containing sequences incorrectly predicted as non-m7G sequences, and $N^-_+$ represents the number of non-m7G sequences incorrectly predicted as m7G site-containing sequences.…”
Section: Methods (mentioning)
confidence: 99%
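The equations themselves did not survive in the statement above. For orientation, the four metrics are commonly written as follows when expressed through the totals of positive and negative samples and the two kinds of misclassification. This is the standard single-label formulation, given here as an assumption and not copied from the citing paper. $N^+$ and $N^-$ are the numbers of positive and negative samples, $N^+_-$ is the number of positives predicted as negative, and $N^-_+$ is the number of negatives predicted as positive.

```latex
\begin{aligned}
\mathrm{Sn}  &= 1 - \frac{N^+_-}{N^+}, \\
\mathrm{Sp}  &= 1 - \frac{N^-_+}{N^-}, \\
\mathrm{Acc} &= 1 - \frac{N^+_- + N^-_+}{N^+ + N^-}, \\
\mathrm{MCC} &= \frac{1 - \left( \dfrac{N^+_-}{N^+} + \dfrac{N^-_+}{N^-} \right)}
                     {\sqrt{\left( 1 + \dfrac{N^-_+ - N^+_-}{N^+} \right)
                            \left( 1 + \dfrac{N^+_- - N^-_+}{N^-} \right)}}
\end{aligned}
```

With $TP = N^+ - N^+_-$, $FN = N^+_-$, $TN = N^- - N^-_+$ and $FP = N^-_+$, these expressions reduce to the familiar confusion-matrix definitions of sensitivity, specificity, accuracy and the Matthews correlation coefficient.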
“…where $N^+$ represents the total number of positive samples investigated, while $N^+_-$ is the number of positive samples incorrectly predicted to be negative ones; $N^-$ is the total number of negative samples investigated, while $N^-_+$ is the number of negative samples incorrectly predicted to be positive ones. The set of intuitive metrics has been concurred and applauded by a series of recent publications (see, e.g., [14, 16, 57-59, 74, 82, 87-100] [83, 101-114]). It is instructive to point out, however, that either the conventional metrics [84] taken from math books or the intuitive metrics of Eq. 5 are valid only for single-label systems (where each of the constituent samples belongs to one, and only one, attribute or class); for the multi-label systems (where a sample may simultaneously belong to several different attributes or classes) whose existence has become more frequent in system biology [6, 7, 29, 115-134], system medicine [135,…”
Section: A Set Of Intuitive Metrics (mentioning)
confidence: 99%
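As a minimal sketch of how the single-label metrics described in this statement might be computed from the four counts it defines, the snippet below uses the standard formulation. The function name `intuitive_metrics` and its parameter names are hypothetical, and returning 0.0 for MCC when the denominator vanishes is an assumed convention, not something stated in the cited text.

```python
import math


def intuitive_metrics(n_pos: int, n_neg: int,
                      n_pos_missed: int, n_neg_wrong: int) -> dict:
    """Compute Sn, Sp, Acc and MCC for a single-label, two-class problem.

    n_pos        -- total number of positive samples
    n_neg        -- total number of negative samples
    n_pos_missed -- positives incorrectly predicted as negative
    n_neg_wrong  -- negatives incorrectly predicted as positive
    """
    if n_pos <= 0 or n_neg <= 0:
        raise ValueError("both classes must contain at least one sample")
    miss_rate = n_pos_missed / n_pos
    false_alarm_rate = n_neg_wrong / n_neg
    sn = 1 - miss_rate
    sp = 1 - false_alarm_rate
    acc = 1 - (n_pos_missed + n_neg_wrong) / (n_pos + n_neg)
    denom = math.sqrt(
        (1 + (n_neg_wrong - n_pos_missed) / n_pos)
        * (1 + (n_pos_missed - n_neg_wrong) / n_neg)
    )
    # Assumed convention: report MCC as 0.0 when the denominator is zero,
    # which happens when every sample is assigned to the same class.
    mcc = (1 - (miss_rate + false_alarm_rate)) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Example with made-up counts: 100 positives, 100 negatives,
# 10 positives missed and 20 negatives wrongly flagged.
scores = intuitive_metrics(100, 100, 10, 20)
```

The sketch covers only the single-label case; as the statement notes, multi-label systems need a different set of metrics.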