2005
DOI: 10.1371/journal.pcbi.0010031.eor

Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets and Homology Models

Abstract: The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome ann…

Cited by 10 publications (15 citation statements)
References 37 publications
“…The SNPs were classified with a support vector machine trained on amino acid residue substitutions from more than 1,500 human proteins. Because X-ray crystal structures are not available for most human proteins [77], we built homology models with an automated modeling pipeline MODPIPE that relies on the MODELLER package for fold assignment, sequence-structure alignment, model building, and model assessment [78]. A small number of these predictions have been validated by biochemical and epidemiological studies found in the literature.…”
Section: Generalizability of the Methods (mentioning)
confidence: 99%
“…Consequently, the four major PSI Centres have targeted large sequence families that are most likely to adopt novel structures [1] (http://www.structuralgenomics.org), although these often have little or no functional annotation [3]. To maximise the biomedical benefit of PSI structures, recent reviews have proposed broadening selection criteria to explicitly focus on the relevance to human disease [4] or to provide structural characterisation of families with known functions [5]. Figure 1 shows that as the international genomics initiatives gather pace, both the number of sequences and protein families is still growing at an exponential rate, although the rate of expansion of protein families is substantially less.…”
Section: Introduction (mentioning)
confidence: 99%
“…Overall HCPIN has 78% (45%), 46% (20%) single domain (residue) coverage at medium and high accuracy modeling levels, respectively. This single domain coverage of HCPIN proteins is significantly higher than the estimated average single domain coverage of the human proteome (27).…”
Section: Structural Coverage of HCPIN Proteins (mentioning)
confidence: 63%
“…Although this cutoff is somewhat arbitrary, models generated from such templates will usually be of high reliability and accuracy. Such high quality structures or models of these human proteins are potentially useful for active site docking, studying catalytic mechanism, and designing ligands useful for drug discovery (67 (27), i.e. either an experimental structure or a structure template useful for medium accuracy modeling of at least part of the protein structure.…”
Section: Structural Coverage of HCPIN Proteins (mentioning)
confidence: 99%