The arrangement of the subunits in an oligomeric protein often cannot be inferred without ambiguity from crystallographic studies. The annotation of the functional assembly of protein structures in the Protein Data Bank (PDB) is incomplete and frequently inconsistent. Instructions for the reconstruction, by symmetry, of the functional assembly from the deposited coordinates are often absent. An automatic procedure is proposed for the inference of assembly structures that are likely to be physiologically relevant. The method scores crystal contacts by their contact size and chemical complementarity. The subunit assembly is then inferred from these scored contacts by a clustering procedure involving a single adjustable parameter. When predicting the oligomeric state for a non-redundant set of 55 monomeric and 163 oligomeric proteins from dimers up to hexamers, a classification error rate of 16% was observed.
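The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: each crystal contact already carries a numeric score, and subunits are grouped whenever they are linked by a contact scoring at or above a single cutoff (a connected-components grouping, swapped in here for illustration). The function name `infer_assemblies`, the `cutoff` parameter, and the toy scores are hypothetical and are not taken from the paper; a real crystal would also require handling symmetry-related copies across the lattice, which this sketch ignores.

```python
from collections import defaultdict


def infer_assemblies(subunits, scored_contacts, cutoff):
    """Group subunits joined by contacts whose score is >= cutoff.

    Illustrative assumption, not the published method: a plain
    connected-components grouping driven by one adjustable threshold.
    """
    # Union-find parent table: every subunit starts in its own group.
    parent = {s: s for s in subunits}

    def find(x):
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_contacts:
        if score >= cutoff:  # the single adjustable parameter
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for s in subunits:
        groups[find(s)].append(s)
    return sorted(sorted(members) for members in groups.values())


# Toy input with hypothetical scores: two strong contacts, one weak one.
subunits = ["A", "B", "C", "D"]
contacts = [("A", "B", 0.9), ("C", "D", 0.8), ("B", "C", 0.2)]

dimers = infer_assemblies(subunits, contacts, cutoff=0.5)
tetramer = infer_assemblies(subunits, contacts, cutoff=0.1)
```

With the cutoff at 0.5 the weak B–C contact is treated as a crystal-packing artifact and two dimers result; lowering it to 0.1 merges all four subunits into one assembly, which shows how strongly the predicted oligomeric state can depend on the one parameter.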
Features of multimeric proteins are reviewed to shed light on the formation of protein assemblies from a structural perspective. The features comprise biochemical and geometric properties. They are compiled on new low-redundancy sets of crystal structures of homomeric proteins with different symmetry and subunit multiplicity, as well as on a set of heteromeric proteins. Crystal structures of likely monomers provide a control group.