Background:
Ubiquitination, as a post-translational modification, is a crucial biological
process in cell signaling, apoptosis, and localization. Identification of ubiquitination proteins is of fundamental
importance for understanding the molecular mechanisms in biological systems and diseases.
Although high-throughput experimental studies using mass spectrometry have identified many ubiquitination
proteins and ubiquitination sites, the vast majority of ubiquitination proteins remain undiscovered,
even in well-studied model organisms.
Objective:
To reduce experimental costs, computational methods have been introduced to predict
ubiquitination sites, but their accuracy remains unsatisfactory. Predicting whether a protein
can be ubiquitinated at all would help in predicting ubiquitination sites. However, all
computational methods to date predict only ubiquitination sites.
Methods:
This study presents the first computational method for predicting ubiquitination proteins
without relying on ubiquitination site prediction. The method extracts features from sequence
conservation information through a grey system model, as well as from functional domain
annotation and subcellular localization.
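As a rough illustration of how a grey system model can turn a numeric series into features, the sketch below fits GM(1,1), a widely used grey model, and returns its two parameters. That the method uses GM(1,1) specifically, and that the input series is a column of conservation scores, are assumptions made here for illustration; the function name `gm11_parameters` is hypothetical and does not come from the published code.

```python
from itertools import accumulate


def gm11_parameters(series):
    """Fit a GM(1,1) grey model and return (a, b).

    a is the development coefficient and b the grey input, estimated by
    least squares from x0(k) + a * z1(k) = b for k = 2..n.
    Assumption: `series` is a numeric sequence such as one column of
    conservation scores; the exact input used by the authors is not known here.
    """
    x0 = list(series)
    if len(x0) < 3:
        raise ValueError("GM(1,1) needs at least three values")

    # Accumulated generating operation: x1(k) = x0(1) + ... + x0(k)
    x1 = list(accumulate(x0))
    # Background values: mean of consecutive accumulated values
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]

    # Ordinary least squares for y = slope * z + b, where slope = -a
    m = len(z)
    sz = sum(z)
    szz = sum(v * v for v in z)
    sy = sum(y)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    denom = m * szz - sz * sz
    if denom == 0:
        raise ValueError("background values are all equal; fit is undefined")

    slope = (m * szy - sz * sy) / denom
    a = -slope
    b = (sy - slope * sz) / m
    return a, b
```

The pair (a, b) fitted to each series could then be concatenated into a fixed-length feature vector, regardless of protein length.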
Results:
With feature analysis and the Relief feature selection algorithm, 5-fold cross-validation
on three datasets achieved an accuracy of 90.13% and a Matthews correlation coefficient
of 80.34%. On an independent test dataset, the method achieved an accuracy of 87.71% and
a Matthews correlation coefficient of 75.43%, better than the prediction from
the best ubiquitination site prediction tool available.
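For readers unfamiliar with the two reported metrics, the following minimal sketch computes accuracy and the Matthews correlation coefficient from a binary confusion matrix. The function name `accuracy_and_mcc` and the counts in the usage note are hypothetical and are not taken from the study's data.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for binary confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. MCC is returned as 0.0 when its denominator is zero,
    which is a common convention, not something stated in the source.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc
```

With the made-up counts `accuracy_and_mcc(90, 85, 15, 10)`, the accuracy is 0.875 and the MCC is about 0.751. MCC lies in the range -1 to 1, so the percentages quoted above correspond to coefficients of roughly 0.80 and 0.75.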
Conclusion:
Our study may guide experimental design and provide useful insights for studying the
mechanisms and modulation of ubiquitination pathways. The code is available at:
https://github.com/Chunhuixu/UBIPredic_QWRCHX.