2013
DOI: 10.1371/journal.pone.0056621

On the Relevance of Sophisticated Structural Annotations for Disulfide Connectivity Pattern Prediction

Abstract: Disulfide bridges strongly constrain the native structure of many proteins and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches for this prediction problem adopt the following pipeline: first they enrich the primary sequence with structural annotations, second they apply a binary classifier to each candidate pair of cysteines to predict disulfide bonding probabilities and finally, they use a maximum weight graph matching al…
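
The last pipeline step named in the abstract, maximum weight graph matching, can be illustrated with a small sketch. The abstract is cut off before it describes the algorithm actually used, so the code below is only an assumption-laden illustration: it uses a memoized exhaustive search (not a production matching algorithm), and the function name `best_pattern` and the `prob` dictionary layout are hypothetical. Given pairwise bonding probabilities from a classifier, it returns the set of disjoint cysteine pairs with the largest total weight.

```python
from functools import lru_cache


def best_pattern(n, prob):
    """Return (total_weight, pairs) for a maximum-weight matching.

    n    -- number of cysteines, indexed 0..n-1
    prob -- dict mapping (i, j) with i < j to a predicted bonding
            probability; missing pairs count as 0.0 (an assumption)
    """

    @lru_cache(maxsize=None)
    def solve(free):
        # `free` is a sorted tuple of cysteines that are still unpaired.
        if len(free) < 2:
            return 0.0, ()
        first, rest = free[0], free[1:]
        # Option 1: leave `first` unbonded.
        best = solve(rest)
        # Option 2: bond `first` with each later cysteine in turn.
        for k, partner in enumerate(rest):
            remaining = rest[:k] + rest[k + 1:]
            score, pairs = solve(remaining)
            score += prob.get((first, partner), 0.0)
            if score > best[0]:
                best = (score, ((first, partner),) + pairs)
        return best

    return solve(tuple(range(n)))


if __name__ == "__main__":
    # Made-up probabilities for four cysteines, for illustration only.
    example = {
        (0, 1): 0.9, (2, 3): 0.8,
        (0, 2): 0.6, (1, 3): 0.7,
        (0, 3): 0.1, (1, 2): 0.2,
    }
    score, pairs = best_pattern(4, example)
    print(pairs)
```

Exhaustive search is only practical for a handful of cysteines; it is used here because it is easy to verify by hand, not because the paper is known to use it.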

Cited by 8 publications (17 citation statements)
References 41 publications
“…Indeed, contrarily to the observation made in [1] that suggested a very small number of relevant feature functions in the context of disulfide bridge prediction, the selection algorithm identified here a larger set of interesting feature functions.…”
Section: Results (mentioning)
confidence: 56%
“…For this purpose, we consider various feature encodings and, in addition to the primary structure, three in-sillico annotations: position-specific scoring matrices (PSSM), predicted secondary structures and predicted solvent accessibilities. We apply the feature function selection pipeline in combination with Extremely randomized Trees (ETs), a model which gave excellent results in previous work [1]. In order to avoid any risk of overfitting or over-estimation of our models, we use three distinct datasets: Disorder723 [19], Casp10 (http://www.predictioncenter.org/casp10/) and Pdb30.…”
Section: Introduction (mentioning)
confidence: 99%
“…The method proposed by Becker et al. [6] employs three different classification algorithms for the prediction of disulfide bonding probabilities: k-nearest neighbors, SVMs, and extremely randomized trees. Therefore, they propose a feature function selection, which determines a subset of feature functions and the best setting for associated window sizes.…”
Section: Methods (mentioning)
confidence: 99%
“…Sequence alignment is a standard technique in bioinformatics for visualizing the relationships between residues in a collection of evolutionary or structurally related proteins. Existing DCP algorithms in the literature have used multiple sequence alignment, position-specific scoring matrices (PSSMs) [6, 7] and correlated mutations [4] as input encoding.…”
Section: Preliminary Concepts (mentioning)
confidence: 99%
“…As the position specific scoring matrix encodes the evolutionary information of a protein, the feature derived from PSSM has been widely and successfully applied to disulfide connectivity predictions [23], [34], [43]. In this study, we extract the PSSM feature as follows: the original PSSM of a given protein sequence is obtained by executing PSI-BLAST [44] to search the Swiss-Prot database through three iterations with a default E-value cutoff; then, we transform the original PSSM to a normalized one by applying the logistic function $f(x) = 1/(1 + e^{-x})$ to each element $x$ contained in the original PSSM; finally, each cysteine residue is encoded into a $13 \times 20 = 260$-D feature vector that consists of the normalized PSSM elements corresponding to a sequence segment of length 13 centered on the cysteine residue [23], [34].…”
Section: Position Specific Scoring Matrix Feature (mentioning)
confidence: 99%
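
The PSSM encoding described in the statement above can be sketched in a few lines. This is a minimal illustration, not the cited authors' code: the function names `logistic` and `encode_cysteine` are hypothetical, the PSSM is assumed to be already available as a list of 20-score rows (running PSI-BLAST is out of scope), and zero-padding of windows that run past the sequence ends is an assumption, since the quoted text does not say how termini are handled.

```python
import math


def logistic(x):
    """Logistic function f(x) = 1 / (1 + e^(-x)) used to normalize scores."""
    return 1.0 / (1.0 + math.exp(-x))


def encode_cysteine(pssm, center, window=13):
    """Encode one cysteine as a flat window * 20 feature vector.

    pssm   -- list of rows, one per residue, each holding 20 raw scores
    center -- index of the cysteine residue in the sequence
    window -- segment length centered on the cysteine (13 in the quote)
    """
    half = window // 2
    n_cols = 20
    features = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(pssm):
            # Normalize each raw PSSM score with the logistic function.
            features.extend(logistic(v) for v in pssm[pos])
        else:
            # Assumption: positions outside the sequence are zero-padded.
            features.extend([0.0] * n_cols)
    return features


if __name__ == "__main__":
    # Toy PSSM of 30 residues with all-zero scores, for illustration only.
    toy_pssm = [[0] * 20 for _ in range(30)]
    vector = encode_cysteine(toy_pssm, center=10)
    print(len(vector))
```

With the default window of 13, every cysteine yields 13 × 20 = 260 values regardless of its position, which keeps the classifier input a fixed size.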