2020 Systems and Information Engineering Design Symposium (SIEDS), 2020
DOI: 10.1109/sieds49339.2020.9106642

Deep Learning of Protein Structural Classes: Any Evidence for an ‘Urfold’?

Citations: cited by 5 publications (3 citation statements)
References: 36 publications
“…In our -based dataset (), each atom is attributed with the following seven groups of features, which are one-hot (Boolean) encoded: (i) Atom Type (, , , , , ); (ii) Residue Type (, , , , , , , , , , , , , , , , , , , , ); (iii) Secondary Structure (, , ); (iv) Hydrophobic (or not); (v) Electronegative (or not); (vi) Positively-charged (or not); and (vii) Solvent-exposed (or not). For all of the DeepUrfold final production models reported here, the “residue type” feature was omitted because it was found to be uninformative, at least for this type of representation (see Supp Info §3 and Supp Figs S4-5); interestingly, this finding about the dispensability of a residue-type feature was presaged in early work on this project (e.g., the receiver operating characteristic (ROC) curves in Fig 2 of ref [66]).…”
Section: Results (mentioning)
confidence: 99%
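
The per-atom encoding described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' implementation: the quoted text elides the actual category names, so the labels and category counts below are hypothetical placeholders, and `encode_atom` is a name invented for this sketch. In line with the statement, the residue-type group is left out.

```python
from typing import Dict, List, Sequence

# Hypothetical placeholder categories: the real atom-type and
# secondary-structure labels are elided in the quoted statement.
ATOM_TYPES: List[str] = [f"atom_type_{i}" for i in range(6)]
SECONDARY_STRUCTURES: List[str] = [f"ss_{i}" for i in range(3)]

# The four yes/no properties named in the statement.
BOOLEAN_FLAGS: List[str] = [
    "hydrophobic",
    "electronegative",
    "positively_charged",
    "solvent_exposed",
]


def one_hot(value: str, categories: Sequence[str]) -> List[int]:
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


def encode_atom(atom: Dict[str, object]) -> List[int]:
    """Concatenate the feature groups for one atom (residue type omitted)."""
    vector = one_hot(str(atom["atom_type"]), ATOM_TYPES)
    vector += one_hot(str(atom["secondary_structure"]), SECONDARY_STRUCTURES)
    vector += [int(bool(atom[flag])) for flag in BOOLEAN_FLAGS]
    return vector


example_atom = {
    "atom_type": "atom_type_0",
    "secondary_structure": "ss_1",
    "hydrophobic": True,
    "electronegative": False,
    "positively_charged": False,
    "solvent_exposed": True,
}
example_vector = encode_atom(example_atom)
```

Leaving a feature group out, as the statement reports for residue type, then amounts to not concatenating its one-hot block.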
“…More recently, various ML methods have been applied to learn the statistical laws between feature descriptors of protein sequences in a training dataset and their corresponding structural classes, and to build a probabilistic model for classification purposes, as can be seen in a recent review on protein function prediction [38]. In the following, we focus on the very recent applications of ML methods in the prediction of PSC, which include artificial neural networks [15,22], support vector machine [23,26,50,77], K-nearest neighbor [16,17,46], random forest [4], logistic regression [78,79], and deep learning [80][81][82][83][84]. Table 3 provides a list of machine learning algorithms and their recent variants used as classification models in the prediction of protein structural classes.…”
Section: Classification Models (mentioning)
confidence: 99%
“…While all of the possible features are contained in the Prop3D-20sf dataset and undoubtedly will be somewhat correlated, it is possible for one to select only certain subsets of features of interest. We also create subsets of the Boolean features that we have found to be minimally correlated [46], and those can be selected, for example, in training deep neural networks.…”
Section: Introduction (mentioning)
confidence: 99%
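
One way to picture the selection of a minimally correlated subset of Boolean features, as mentioned in the statement above, is a greedy filter on pairwise Pearson correlation. The statement does not say how the subsets were built, so the greedy rule, the 0.5 threshold, the function names, and the toy columns below are all assumptions made for illustration.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_weakly_correlated(
    columns: Dict[str, Sequence[int]], threshold: float = 0.5
) -> List[str]:
    """Keep a feature only if |r| <= threshold against every kept feature.

    Features are visited in insertion order, so earlier columns win ties.
    """
    selected: List[str] = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[kept])) <= threshold for kept in selected):
            selected.append(name)
    return selected


# Toy Boolean columns (hypothetical values, one entry per atom).
toy_columns = {
    "hydrophobic": [1, 0, 1, 0, 1, 0],
    "solvent_exposed": [0, 1, 0, 1, 0, 1],  # exact complement of the first
    "positively_charged": [1, 1, 0, 0, 1, 0],
}
kept_features = select_weakly_correlated(toy_columns)
```

In this toy case the second column is perfectly anti-correlated with the first and is dropped, while the third is kept. The result depends on visiting order, which is one reason a published subset may have been derived differently.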