The small β-barrel (SBB) is an ancient protein structural domain characterized by extremes: it features a broad range of structural varieties, a deeply intricate evolutionary history, and it is associated with a bewildering array of cellular pathways. Here, we present a thorough, survey-based analysis of the structural properties of SBBs. We first consider the defining properties of the SBB, including various systems of nomenclature used to describe it, and we introduce the unifying concept of an "urfold." To begin elucidating how vast functional diversity can be achieved by a relatively simple domain, we explore the anatomy of the SBB and its representative structural variants. Many SBB proteins assemble into cyclic oligomers as the biologically functional units; these oligomers often bind RNA, and typically exhibit great quaternary structural plasticity (homomeric and heteromeric rings, variable subunit stoichiometries, etc.). We conclude with three themes that emerge from the rich structure↔function versatility of the SBB.
Strand β1 is red, β2 is orange, β3 is yellow, and β4 is green; for the SH3 and OB folds, the fifth strand is also shown (gray). Branches along a sample path in this digraph are highlighted in tan (subtree at right), yielding the 1 Zhang and Kim, 2000); these two sheets, near the center of the image, are drawn simply to illustrate tree traversal. The base of the overall tree (at center) is a decision between the two possible configurations (parallel, antiparallel) for the simplest possible sheet, i.e., a tandem pair of strands (⇨∿∿⇨). Traversing the tree from this split "root" to the leaves corresponds to building up the sheet, and the tree's branching structure elucidates the $n! \cdot 2^{n-2}$ unique topologies that are possible for a sheet of $n$ strands; the successive branches of this unrooted k-ary tree are of degrees 2, 6, 2, 2, 2. The positions of the SH3 and OB folds are indicated by cyan and purple paths (subtree at left).
Other features of β-sheets are also elucidated by this hierarchical representation, such as the fact that there are 24 unique arrangements of two sequentially adjacent β-hairpin motifs (red circles, left subtree). If the origin for strand numbering is taken as arbitrary (e.g., labeling a sequence 2/3/4 does not differ from 1/2/3), then the OB topology (pink path, left) can be seen to cluster closely with the SH3; the yellow region delimits a putative SBB "urfold" basin in fold-space, subsuming the SH3 and OB folds.
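The topology count in the caption can be checked with a short calculation. The sketch below assumes the count of unique topologies for a sheet of n strands is n!·2^(n−2), which gives 2 for a tandem pair of strands (parallel or antiparallel), and it multiplies the branch degrees quoted for the tree. The function names are hypothetical and chosen only for illustration.

```python
from math import factorial, prod


def sheet_topology_count(n: int) -> int:
    """Count of unique sheet topologies, assuming the form n! * 2**(n - 2)."""
    if n < 2:
        raise ValueError("a sheet needs at least two strands")
    return factorial(n) * 2 ** (n - 2)


# Branch degrees of the tree as quoted in the caption.
BRANCH_DEGREES = [2, 6, 2, 2, 2]


def leaf_count(degrees) -> int:
    """Number of leaves reached when each level branches by the given degree."""
    return prod(degrees)


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"n = {n}: {sheet_topology_count(n)} topologies")
    print(f"leaves from branch degrees: {leaf_count(BRANCH_DEGREES)}")
```

Under this assumption the product of the quoted branch degrees (2 × 6 × 2 × 2 × 2 = 96) equals the count for n = 4, which is consistent with a tree built out to four-stranded sheets.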
Motivation: Build a web-based 3D molecular structure viewer focusing on interactive structural analysis. Results: iCn3D (I-see-in-3D) can simultaneously show 3D structure, 2D molecular contacts and 1D protein and nucleotide sequences through an integrated sequence/annotation browser. Pre-defined and arbitrary molecular features can be selected in any of the 1D/2D/3D windows as sets of residues and these selections are synchronized dynamically in all displays. Biological annotations such as protein domains, single nucleotide variations, etc. can be shown as tracks in the 1D sequence/annotation browser. These customized displays can be shared with colleagues or publishers via a simple URL. iCn3D can display structure–structure alignments obtained from NCBI's VAST+ service. It can also display the alignment of a sequence with a structure as identified by BLAST, and thus relate 3D structure to a large fraction of all known proteins. iCn3D can also display electron density maps or electron microscopy (EM) density maps, and export files for 3D printing. The following URL illustrates some of the 1D/2D/3D representations: https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?mmdbid=1TUP&showanno=1&show2d=1&showsets=1. Availability and implementation: iCn3D is freely available to the public. Its source code is available at https://github.com/ncbi/icn3d. Supplementary information: Supplementary data are available at Bioinformatics online.
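Because a shared display is just a URL with query parameters, links of this kind can be assembled programmatically. The following is a minimal sketch that rebuilds the example URL above from its parts. The helper name and its keyword arguments are hypothetical; only the base address and the four parameters (`mmdbid`, `showanno`, `show2d`, `showsets`) come from the example, and their meanings are inferred from their names rather than from iCn3D documentation.

```python
from urllib.parse import urlencode

# Base address taken from the example URL in the abstract.
ICN3D_BASE = "https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html"


def build_icn3d_url(mmdbid: str,
                    show_annotations: bool = True,
                    show_2d: bool = True,
                    show_sets: bool = True) -> str:
    """Assemble an iCn3D link (hypothetical helper, not part of iCn3D)."""
    params = {"mmdbid": mmdbid}
    # Each flag is emitted as "=1", mirroring the example URL; the effect of
    # each flag is assumed from its name.
    if show_annotations:
        params["showanno"] = 1
    if show_2d:
        params["show2d"] = 1
    if show_sets:
        params["showsets"] = 1
    return f"{ICN3D_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_icn3d_url("1TUP"))
```

Calling `build_icn3d_url("1TUP")` reproduces the example link exactly; passing a different structure identifier would be the natural way to generate links for other entries, assuming the same parameters apply.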
Symmetry is an important feature of protein tertiary and quaternary structure that has been associated with protein folding, function, evolution and stability. Its emergence and ensuing prevalence have been attributed to gene duplications, fusion events, and subsequent evolutionary drift in sequence. This process maintains structural similarity and is further supported by this study. To further investigate how internal symmetry evolved, how symmetry and function are related, and the overall frequency of internal symmetry, we developed an algorithm, CE-Symm, to detect pseudosymmetry within the tertiary structure of protein chains. Using a large manually curated benchmark of 1007 protein domains, we show that CE-Symm performs significantly better than previous approaches. We use CE-Symm to build a census of symmetry among domain superfamilies in SCOP and note that 18% of all superfamilies are pseudo-symmetric. Our results indicate that more domains are pseudo-symmetric than previously estimated. We establish a number of recurring types of symmetry–function relationships and describe several characteristic cases in detail. Using the Enzyme Commission classification, symmetry was found to be enriched in some enzyme classes but depleted in others. CE-Symm thus provides a methodology for a more complete and detailed study of the role of symmetry in tertiary protein structure. Availability: CE-Symm can be run from the web at http://source.rcsb.org/jfatcatserver/symmetry.jsp. Source code and software binaries are also available under the GNU Lesser General Public License (v. 2.1) at https://github.com/rcsb/symmetry. An interactive census of domains identified as symmetric by CE-Symm is available from: http://source.rcsb.org/jfatcatserver/scopResults.jsp.
The spliceosome, a sophisticated molecular machine involved in the removal of intervening sequences from the coding sections of eukaryotic genes, appeared and subsequently evolved rapidly during the early stages of eukaryotic evolution. The last eukaryotic common ancestor (LECA) had both complex spliceosomal machinery and some spliceosomal introns, yet little is known about the early stages of evolution of the spliceosomal apparatus. The Sm/Lsm family of proteins has been suggested as one of the earliest components of the emerging spliceosome and hence provides a first in-depth glimpse into the evolving spliceosomal apparatus. An analysis of 335 Sm and Sm-like genes from 80 species across all three kingdoms of life reveals two significant observations. First, the eukaryotic Sm/Lsm family underwent two rapid waves of duplication with subsequent divergence resulting in 14 distinct genes. Each wave resulted in a more sophisticated spliceosome, reflecting a possible jump in the complexity of the evolving eukaryotic cell. Second, an unusually high degree of conservation in intron positions is observed within individual orthologous Sm/Lsm genes and between some of the Sm/Lsm paralogs. This suggests that functional spliceosomal introns existed before the emergence of the complete Sm/Lsm family of proteins; hence, spliceosomal machinery with considerably fewer components than today's spliceosome was already functional.
iCn3D was initially developed as a web-based 3D molecular viewer. It then evolved from a visualization tool into full-featured interactive structural analysis software. It became a collaborative research instrument through the sharing of permanent, shortened URLs that encapsulate not only annotated visual molecular scenes, but also all underlying data and analysis scripts in a FAIR manner. More recently, with the growth of structural databases, the need to analyze large structural datasets systematically led us to use Python scripts and to convert the code for use in Node.js scripts. We showed a few examples of Python scripts at https://github.com/ncbi/icn3d/tree/master/icn3dpython to export secondary structures or PNG images from iCn3D. Users just need to replace the URL in the Python scripts to export other annotations from iCn3D. Furthermore, any interactive iCn3D feature can be converted into a Node.js script to be run in batch mode, enabling an interactive analysis performed on one or a handful of protein complexes to be scaled up to the analysis of large ensembles of structures. Example Node.js analysis scripts are available at https://github.com/ncbi/icn3d/tree/master/icn3dnode. This development will enable ensemble analyses on growing structural databases such as AlphaFold or RoseTTAFold on one hand and Electron Microscopy on the other. In this paper, we also review new features such as DelPhi electrostatic potential, 3D view of mutations, alignment of multiple chains, assembly of multiple structures by realignment, dynamic symmetry calculation, 2D cartoons at different levels, interactive contact maps, and use of iCn3D in Jupyter Notebook as described at https://pypi.org/project/icn3dpy.