2020
DOI: 10.1101/2020.06.24.169011
Preprint

Structure-aware Protein Solubility Prediction From Sequence Through Graph Convolutional Network And Predicted Contact Map

Abstract: Motivation: Protein solubility is significant in producing new soluble proteins that can reduce the cost of biocatalysts or therapeutic agents. Therefore, a computational model is highly desired to accurately predict protein solubility from the amino acid sequence. Many methods have been developed, but they are mostly based on the one-dimensional embedding of amino acids that is limited to catch spatially structural information. Results: In this study, we have developed a new structure-aware method to predict p…

Cited by 5 publications (4 citation statements)
References 34 publications
“…Graph neural networks (GNNs) were used to extract substrate features [40]. Substrates were considered as molecular graphs G = (ν, ϵ). Each atom v_i ∈ ν was represented by a 46-dimensional vector, which was the concatenation of one-hot encodings representing the atom types, degrees of the atom, chirality, hybridization types, number of radical electrons, number of hydrogen atoms attached, explicit valence, implicit valence, and aromaticity of the corresponding atoms.…”
Section: Discussion (mentioning)
confidence: 99%
“…As molecules interact with the target protein, effective embedding will reduce the limit of the small amount of training samples. The protein information could be embedded through sequence, contact map, or 3D structure. Second, experimental costs limit the discussion about direct comparison of meta learning on recently proposed MO models, which can be covered in future work, and the effectiveness of meta learning on many other directions in low-resource drug discovery is yet to be discovered.…”
Section: Discussion (mentioning)
confidence: 99%
“…The protein representation was done from the perspectives of structure and sequence features, respectively. For the structural feature, we considered using a graph to represent the 2D structure of proteins, which has been proven effective for predicting protein solubility in our previous study [26]. In the protein graph model, residues were regarded as nodes and the contact map predicted from the sequence was used as the adjacency matrix.…”
Section: Representation Of Protein and Molecule (mentioning)
confidence: 99%
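
The protein graph described in the statement above (residues as nodes, a sequence-predicted contact map as the adjacency matrix) can be illustrated with a minimal NumPy sketch. This is not the cited authors' implementation: the 0.5 contact threshold, the random stand-in data, the symmetric degree normalization, the single ReLU graph-convolution layer, and the mean-pooling readout are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def contact_map_to_adjacency(contact_probs, threshold=0.5):
    """Binarize a contact-probability map into a symmetric adjacency matrix.

    The threshold value is an assumption made for this sketch.
    """
    probs = np.asarray(contact_probs, dtype=float)
    adj = (probs >= threshold).astype(float)
    adj = np.maximum(adj, adj.T)   # enforce symmetry
    np.fill_diagonal(adj, 1.0)     # self-loops keep each residue's own features
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; degrees are >= 1 due to self-loops."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, node_features, weights):
    """One graph-convolution step: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(adj_norm @ node_features @ weights, 0.0)


# Stand-in data: 6 residues, 4 features per residue (both sizes are arbitrary).
rng = np.random.default_rng(0)
n_residues, n_features, n_hidden = 6, 4, 3
raw = rng.random((n_residues, n_residues))
contact_probs = (raw + raw.T) / 2                  # hypothetical predicted contact map
node_features = rng.random((n_residues, n_features))
weights = rng.normal(size=(n_features, n_hidden))  # untrained, random weights

adj = contact_map_to_adjacency(contact_probs)
hidden = gcn_layer(normalize_adjacency(adj), node_features, weights)
graph_embedding = hidden.mean(axis=0)              # mean-pool residues into one vector
```

In a real pipeline the contact map would come from a contact predictor and the node features from per-residue descriptors; the pooled vector would then feed a regression or classification head.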