2019
DOI: 10.1038/s41598-019-52552-4

Characterization and Identification of Lysine Succinylation Sites based on Deep Learning Method

Abstract: Succinylation is a type of protein post-translational modification (PTM), which can play important roles in a variety of cellular processes. Due to an increasing number of site-specific succinylated peptides obtained from high-throughput mass spectrometry (MS), various tools have been developed for computationally identifying succinylated sites on proteins. However, most of these tools predict succinylation sites based on traditional machine learning methods. Hence, this work aimed to carry out the succinylati…

Cited by 33 publications (32 citation statements)
References 66 publications

“…Finally, each element in the feature matrix PSSM x was normalized using a Sigmoid function 30 , and �(x) can be written as: w = 31 amino acid of each sequence in this study, so the sequence length ranges from −15 to +15 in this study.…”
Section: Training Set (mentioning)
confidence: 99%
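
The statement above mentions a 31-residue window (positions −15 to +15 around the site) and Sigmoid normalization of PSSM entries, but the formula itself is missing from the quoted text. The following is a minimal sketch that assumes the standard logistic function 1 / (1 + e^(−x)); the helper names `extract_window` and `sigmoid_normalize`, the `X` padding character, and the random stand-in matrix are illustrative assumptions, not taken from the cited work.

```python
import numpy as np

FLANK = 15                    # residues on each side of the central site
WINDOW = 2 * FLANK + 1        # 31, matching "w = 31" in the quoted statement


def extract_window(sequence, site_index, flank=FLANK, pad="X"):
    """Return the fragment centred on site_index (0-based).

    Positions that fall outside the protein are filled with `pad`.
    Padding with 'X' is an assumption; the quoted text does not say
    how sequence ends are handled.
    """
    left = sequence[max(0, site_index - flank):site_index]
    right = sequence[site_index + 1:site_index + 1 + flank]
    return (pad * (flank - len(left)) + left
            + sequence[site_index]
            + right + pad * (flank - len(right)))


def sigmoid_normalize(pssm):
    """Map every PSSM entry x to 1 / (1 + exp(-x)), i.e. into (0, 1)."""
    pssm = np.asarray(pssm, dtype=float)
    return 1.0 / (1.0 + np.exp(-pssm))


# Stand-in for a real PSSM: 31 positions x 20 amino-acid columns.
rng = np.random.default_rng(0)
raw_pssm = rng.integers(-10, 11, size=(WINDOW, 20))
normalized_pssm = sigmoid_normalize(raw_pssm)
fragment = extract_window("MKT", 1)
```

In practice the raw matrix would come from a profile search tool rather than a random generator; only the element-wise normalization step is shown here.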
“…After the sequence extraction process, we focused on the analysis of sequence-based features, and then each sequence fragment was encoded based on the investigated features. The following sequence-based features are widely employed for analysis and prediction of various types of PTM sites in the enormous amount of research [18,20,21]: amino acid composition (AAC), positively charged amino acid composition (PCAAC), amino acid pair composition (AAPC), BLOSUM62 scoring matrix (B62) and position-specific scoring matrix (PSSM). In this study, the phosphoglycerylated sequences should be transformed into numeric vectors based on the above features to construct a supervised learning model.…”
Section: Features Extraction and Encoding (mentioning)
confidence: 99%
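
Two of the encodings named in the statement above, AAC and AAPC, can be sketched as simple frequency vectors. This is an illustrative sketch only: the function names, the alphabetical residue ordering, and the choice to skip non-standard characters (such as padding) are assumptions, and the citing papers may define the features differently.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aac(fragment):
    """Amino acid composition: a 20-dimensional frequency vector.

    Non-standard characters (for example padding) are ignored, which
    is an assumption made for this sketch.
    """
    residues = [r for r in fragment if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def aapc(fragment):
    """Amino acid pair composition: frequencies of the 400 adjacent pairs."""
    index = {pair: i for i, pair in enumerate(PAIRS)}
    counts = [0] * len(PAIRS)
    total = 0
    for first, second in zip(fragment, fragment[1:]):
        pair = first + second
        if pair in index:          # skip pairs containing non-standard residues
            counts[index[pair]] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    return [c / total for c in counts]


# Concatenating the vectors gives one numeric feature row per fragment.
example = "AAKK"
features = aac(example) + aapc(example)
```

For the fragment `"AAKK"`, the resulting row has 420 entries (20 for AAC plus 400 for AAPC), ready to be stacked into a training matrix.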
“…There is an 842-dimensional feature vector made up of sequential and statistical features, which was composed by three types of features including AAC, AAPC and PSSM. By referring to the CNN-SuccSite method [20], all the features were sorted and ranked according to F-score on training dataset prior to construction of predictive models. Furthermore, the sequential forward selection (SFS) [28] is a type of stepwise regression which involves beginning with an empty model and testing the addition of each variable, then adding the variables one at a time until none improves the model to a statistically significant extent.…”
Section: Selection Of the Best Hybrid Feature Sets (mentioning)
confidence: 99%
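
The ranking and selection steps described above can be sketched roughly as follows. Several things here are assumptions rather than details from the cited work: the F-score uses a common two-class definition (between-class spread over within-class variance), a toy nearest-centroid classifier stands in for the real predictive model, and a fixed `min_gain` threshold replaces the statistical significance test mentioned in the quote.

```python
import numpy as np


def f_score(X, y):
    """Per-feature F-score for binary labels (1 = positive, 0 = negative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    between = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    within = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return between / (within + 1e-12)   # small constant avoids division by zero


def centroid_accuracy(X, y):
    """Toy scorer: training accuracy of a nearest-centroid classifier."""
    c_pos = X[y == 1].mean(axis=0)
    c_neg = X[y == 0].mean(axis=0)
    d_pos = np.linalg.norm(X - c_pos, axis=1)
    d_neg = np.linalg.norm(X - c_neg, axis=1)
    pred = (d_pos < d_neg).astype(int)
    return float((pred == y).mean())


def forward_select(X, y, score_fn=centroid_accuracy, min_gain=1e-3):
    """Sequential forward selection.

    Start with no features, try each remaining feature, add the one that
    helps most, and stop when the best gain falls below min_gain.
    """
    remaining = list(range(X.shape[1]))
    selected, best = [], 0.0
    while remaining:
        trials = [(score_fn(X[:, selected + [j]], y), j) for j in remaining]
        top_score, top_j = max(trials)
        if top_score - best < min_gain:
            break
        selected.append(top_j)
        remaining.remove(top_j)
        best = top_score
    return selected, best


# Synthetic demo: only feature 2 carries class information.
rng = np.random.default_rng(0)
y = np.array([0] * 40 + [1] * 40)
X = rng.normal(size=(80, 5))
X[:, 2] += 6.0 * y

ranking = np.argsort(f_score(X, y))[::-1]   # features ordered best to worst
selected, best = forward_select(X, y)
```

A real pipeline would score candidate subsets with cross-validation on the training set rather than training accuracy; the toy scorer only keeps the sketch self-contained.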
“…As in our previous study, the sequence-based features including amino acid composition (AAC), amino acid pair composition (AAPC), BLOSUM62 scoring matrix (B62) and position-specific scoring matrix (PSSM) that were used for the identification of protein carbonylation sites [14]. Note that all these sequence-based features have widely been employed for analysis and prediction of various types of PTM sites in the enormous amount of research [15][16][17]. In this study, the phosphoglycerylated sequences should be transformed into numeric vectors based on the above features to construct a supervised learning model.…”
Section: Features Extraction and Encoding (mentioning)
confidence: 99%
“…There is an 842-dimensional feature vector made up of sequential and statistical features, which was composed by three types of features including AAC, AAPC and PSSM. By referring to the CNN-SuccSite method [15], all the features were sorted and ranked according to F-score on training dataset prior to construction of predictive models.…”
Section: Features Extraction and Encoding (mentioning)
confidence: 99%