Background: The RCSB Protein Data Bank (PDB) provides public access to experimentally determined 3D-structures of biological macromolecules (proteins, peptides and nucleic acids). While various tools are available to explore the PDB, options to access the global structural diversity of the entire PDB and to perceive relationships between PDB structures remain very limited.
Methods: A 136-dimensional atom pair 3D-fingerprint for proteins (3DP), which counts categorized atom pairs at increasing through-space distances, was designed to represent the molecular shape of PDB-entries. Examples of nearest neighbor searches illustrate the ability of 3DP-similarity to identify closely related biomolecules, from small peptides to enzymes and large multiprotein complexes such as virus particles. Principal component analysis was used to visualize the PDB in 3DP-space.
Results: The 3DP property space groups proteins and protein assemblies according to their 3D-shape similarity, yet shows an exquisite ability to distinguish between closely related structures. An interactive website called PDB-Explorer is presented, featuring a color-coded interactive map of the PDB in 3DP-space. Each pixel of the map contains one or more PDB-entries, which are directly visualized as ribbon diagrams when the pixel is selected. The PDB-Explorer website allows 3DP-nearest neighbor searches of any PDB-entry or of any structure uploaded as a protein-type PDB file. All functionalities on the website are implemented in JavaScript in a platform-independent manner and draw data from a server that is updated daily with the latest PDB additions, ensuring complete and up-to-date coverage. The essentially instantaneous 3DP-similarity search with the PDB-Explorer provides results comparable to those of much slower 3D-alignment algorithms, and automatically clusters proteins from the same superfamilies in tight groups.
Conclusion: A chemical space classification of the PDB based on molecular shape was obtained using a new atom-pair 3D-fingerprint for proteins and implemented in a web-based database exploration tool comprising an interactive color-coded map of the PDB chemical space and a nearest neighbor search tool. The PDB-Explorer website is freely available at www.cheminfo.org/pdbexplorer and represents an unprecedented opportunity to interactively visualize and explore the structural diversity of the PDB.
Graphical abstract: Maps of PDB in 3DP-space color-coded by heavy atom count and shape.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0776-9) contains supplementary material, which is available to authorized users.
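The fingerprint and nearest neighbor search described in the Methods can be illustrated with a minimal sketch. It is not the published 3DP definition: the abstract gives neither the pair categories nor the distance sampling behind the 136 dimensions, so the two categories, the four hard distance bins, the `POLAR` element set, and the city-block distance used for ranking are all assumptions chosen for illustration. The function names are hypothetical as well.

```python
import math

# Assumed settings for illustration only; the real 3DP has 136 dimensions
# whose categories and distance sampling are not given in the abstract.
CATEGORIES = ("all", "polar")        # assumed atom-pair categories
BIN_EDGES = (2.0, 4.0, 8.0, 16.0)    # assumed upper bin edges, in angstroms
POLAR = {"N", "O"}                   # assumed definition of a polar atom


def fingerprint(atoms):
    """Count atom pairs per (category, distance bin).

    atoms: list of (element, x, y, z) tuples.
    Returns a flat list of len(CATEGORIES) * len(BIN_EDGES) counts.
    """
    n_bins = len(BIN_EDGES)
    counts = [0] * (len(CATEGORIES) * n_bins)
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            d = math.dist(atoms[i][1:], atoms[j][1:])
            # Index of the first bin whose upper edge covers the distance.
            b = next((k for k, edge in enumerate(BIN_EDGES) if d <= edge), None)
            if b is None:
                continue  # pair lies beyond the largest bin
            counts[b] += 1  # every pair counts in the "all" category
            if atoms[i][0] in POLAR and atoms[j][0] in POLAR:
                counts[n_bins + b] += 1  # polar-polar pairs
    return counts


def city_block(a, b):
    """City-block (Manhattan) distance between two fingerprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbors(query, database, k=3):
    """Return the k entry names whose fingerprints are closest to query.

    database: dict mapping an entry name to its fingerprint.
    """
    return sorted(database, key=lambda name: city_block(query, database[name]))[:k]
```

Because each structure is reduced to a fixed-length vector of counts, comparing two entries costs a single pass over the vector regardless of structure size, which is consistent with the near-instantaneous searches reported above.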
NMR is a mature technique that is well established and adopted in a wide range of research facilities, from laboratories to hospitals. This accounts for large amounts of valuable experimental data that may be readily exported into a standard and open format. Yet the publication of these data faces an important issue: raw data are not made available; instead, the information is slimmed down into a string of characters (the list of peaks). Although historical limitations of technology explain this practice, it is not acceptable in the era of the Internet. The idea of modernizing the strategy for sharing NMR data is not new, and some repositories exist, but sharing raw data is still not an established practice. Here, we present a powerful toolbox built on recent technologies that runs inside the browser and provides a means to store, share, analyse, and interact with original NMR data. Stored spectra can be streamlined into the publication pipeline, for instance to improve the revision process. The set of tools is still basic but is intended to be extended. The project is open source under the Massachusetts Institute of Technology (MIT) licence.