2021
DOI: 10.1093/nar/gkab1054

SCOPe: improvements to the structural classification of proteins – extended database to facilitate variant interpretation and machine learning

Abstract: The Structural Classification of Proteins—extended (SCOPe, https://scop.berkeley.edu) knowledgebase aims to provide an accurate, detailed, and comprehensive description of the structural and evolutionary relationships amongst the majority of proteins of known structure, along with resources for analyzing the protein structures and their sequences. Structures from the PDB are divided into domains and classified using a combination of manual curation and highly precise automated methods. In the current release o…

Cited by 101 publications (98 citation statements)
References 28 publications (36 reference statements)
“…Considering the class of the domains, 29.8% map to mainly-alpha superfamilies, 23.4% to mainly-beta and 46.7% to alpha-beta, and these proportions are quite similar to those observed for experimental domain structures in CATH, albeit with a slightly lower percentage of alpha superfamilies in CATH experimental (alpha: 21.2%, beta: 23.4%, alpha/beta: 46.7%). This overabundance of mainly-alpha superfamilies has been noticed also in other AF2 domain classification efforts based on SCOPe (17). Supplementary Figure 5 shows the average expansion of each CATH architecture.…”
Section: Results
Citation type: supporting
confidence: 74%
“…Over the last 25 years, several domain-based protein structure classifications have emerged (SCOP (16), CATH (15), SCOPe (17), SCOP2 (18), ECOD (14)) which assign experimental structures of proteins from the Protein Data Bank (PDB) to evolutionary superfamilies. ECOD and CATH are the most comprehensive, classifying 90% or more of PDB.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Thermodynamics spontaneously drive the protein into a stable, free-energy basin that is consistent with the environment. In proteins, structure rather than sequence tends to be conserved among proteins that perform the same function, even in proteins that are analogous across species [ 9 , 10 ]. From this perspective, a protein’s 3D structure is arguably the most important physical attribute of a protein.…”
Section: Foundational Concepts Of Protein Function
Citation type: mentioning
confidence: 99%
“…However, there remain open questions about the best way to generate, represent and quantify conformational ensembles. Early work includes taking a statistical approach [ 199 ], knowledge-based analysis [ 10 ] and unsupervised clustering [ 200 , 201 ]. Eventually, neural networks were applied to recognize protein folds [ 202 ] or to self-improve a predicted folded structure [ 203 ].…”
Section: Selected Applications Of Machine Learning In Computational B...
Citation type: mentioning
confidence: 99%
“…These same proteins have remote homologs that were not detectable by traditional sequence matching tools because of very low sequence identities. Our method is not only faster but also more accurate in application to the benchmark (21) that uses the Pfam (22), SUPERFAMILY (SCOPe 2.08) (3, 24), and CATH Gene3D (5, 26) datasets. PLAST is an alignment-free method that simply compares the embedding vectors using this accurate homology detection tool.…”
Section: Introduction
Citation type: mentioning
confidence: 99%