2017
DOI: 10.1038/s41598-017-06219-7

Knowledge-transfer learning for prediction of matrix metalloprotease substrate-cleavage sites

Abstract: Matrix Metalloproteases (MMPs) are an important family of proteases that play crucial roles in key cellular and disease processes. Therefore, MMPs constitute important targets for drug design, development and delivery. Advanced proteomic technologies have identified type-specific target substrates; however, the complete repertoire of MMP substrates remains uncharacterized. Indeed, computational prediction of substrate-cleavage sites associated with MMPs is a challenging problem. This holds especially true when…

Cited by 21 publications (13 citation statements)
References 96 publications
“…Its accuracy is lowest for matrix metalloproteases, including MMP-2 and MMP-3. The current study and several previous studies [29,52,135] confirmed the prediction of cleavage sites for these proteases to be an especially challenging problem and highlighted the need to develop specialized methods for improved MMP cleavage site prediction.…”
Section: Limitations and Future Work
Citation type: supporting
confidence: 74%
“…If not addressed, this can result in models that favor negative predictions over positive [29,51]. To address this data imbalance issue, we used a down-sampling strategy, randomly discarding from the overrepresented negative samples, to impose a ratio of 1 positive to every 3 negatives, as previously suggested [15,25,29,51,52].…”
Section: Positive and Negative Samples
Citation type: mentioning
confidence: 99%
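The down-sampling step described in the statement above can be sketched as follows. This is a minimal illustration, assuming the samples are held in plain Python lists; the function name `downsample_negatives`, the `seed` parameter, and the toy data are hypothetical and do not come from the citing paper, which only states that negatives were randomly discarded to reach a ratio of 1 positive to 3 negatives.

```python
import random


def downsample_negatives(positives, negatives, ratio=3, seed=0):
    """Randomly keep at most `ratio` negatives for every positive.

    Hypothetical helper: the 1:3 ratio follows the quoted statement,
    everything else (names, seeding) is an assumption for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    n_keep = min(len(negatives), ratio * len(positives))
    kept = rng.sample(negatives, n_keep)  # sampling without replacement
    return positives, kept


# Toy data: 10 cleavage sites versus 200 non-cleavage sites.
positives = [f"pos_{i}" for i in range(10)]
negatives = [f"neg_{i}" for i in range(200)]

pos, neg = downsample_negatives(positives, negatives)
```

All positives are retained; only the over-represented negative class is thinned, so no cleavage-site examples are lost.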
“…Second, the large imbalance between cleavage sites vs. non-cleavage sites (i.e., the number of non-cleavage sites vastly outweighs the number of cleavage sites) resulting in unbalanced datasets. The first challenge was addressed by (i) complementing the MEROPS cleavage data with data available in the Eckhard 2016 study [49] and (ii) using transfer learning approaches, where a general protease cleavage model was pretrained using all the available data (separately for MMPs and other proteases) and subsequently used as a starting point to train protease-specific models (using only the cleavage data for a specific protease) [54]. The second issue was addressed by weighting the model prior to model training to “pay more attention” to the minority (cleavage sites) class.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…To overcome this, we assigned larger weights to cleavage sites than non-cleavage sites, enforcing the classifier to "pay more attention" to the underrepresented class. We also utilised transfer learning to overcome the issue with limited amounts of data available for certain proteases (67). We ensured that the general model and protease-specific model contained the same identities for testing, training, and validating.…”
Section: Architecture of the Deep RNN
Citation type: mentioning
confidence: 99%
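The class-weighting idea mentioned in the two statements above can be illustrated with a short sketch. The citing papers do not say which weighting scheme they used, so the inverse-frequency ("balanced") heuristic below is an assumption, as are the function name `balanced_class_weights` and the toy labels.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Assumed heuristic: weight_c = n_total / (n_classes * n_c).
    The quoted papers only state that cleavage sites received larger
    weights than non-cleavage sites, not how the weights were computed.
    """
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return {c: n_total / (n_classes * n) for c, n in counts.items()}


# Toy labels: 1 = cleavage site (minority), 0 = non-cleavage site.
labels = [1] * 10 + [0] * 90
weights = balanced_class_weights(labels)
```

With these weights, each minority-class example contributes more to a weighted loss than each majority-class example, which is one way to make a classifier "pay more attention" to the under-represented cleavage sites.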