BioSimGrid is a database for biomolecular simulations, in effect a "Protein Data Bank extended in time" for molecular dynamics trajectories. We describe the implementation details: architecture, data schema, deposition, and analysis modules. We encourage the simulation community to explore BioSimGrid and work towards a common trajectory exchange format.
Biomolecular computer simulations are now widely used not only in an academic setting to understand the fundamental role of molecular dynamics on biological function, but also in the industrial context to assist in drug design. In this paper, two applications of Grid computing to this area will be outlined. The first, involving the coupling of distributed computing resources to dedicated Beowulf clusters, is targeted at simulating protein conformational change using the Replica Exchange methodology. In the second, the rationale and design of a database of biomolecular simulation trajectories is described. Both applications illustrate the increasingly important role modern computational methods are playing in the life sciences.
In computational biomolecular research, large amounts of simulation data are generated to capture the motion of proteins. These massive simulation data can be analysed in a number of ways to reveal the biochemical properties of the proteins. However, the legacy way of storing these data (usually in the laboratory where the simulations have been run) often hinders a wider sharing and easier cross-comparison of simulation results. The data is commonly encoded in a way specific to the simulation package that produced the data and can only be analysed with tools developed specifically for that simulation package. The BioSimGrid platform seeks to provide a solution to these challenges by exploiting the potential of the Grid in facilitating data sharing. By using BioSimGrid either in a scripting or web environment, users can deposit their data and reuse it for analysis. BioSimGrid tools manage the multiple storage locations transparently to the users and provide a set of retrieval and analysis tools for processing the data in a convenient and efficient manner. This paper details the usage and implementation of BioSimGrid using a combination of commercial databases, the Storage Resource Broker and Python scripts, gluing the building blocks together. It introduces a case study of how BioSimGrid can be used for better storage, retrieval and analysis of biomolecular simulation data.
Abstract: Contemporary structural biology has an increased emphasis on high-throughput methods. Biomolecular simulations can add value to structural biology via the provision of dynamic information. However, at present there are no agreed measures for the quality of biomolecular simulation data. In this Letter, we suggest suitable measures for the quality assurance of molecular dynamics simulations of biomolecules. These measures are designed to be simple, fast, and general. Reporting of these measures in simulation papers should become an expected practice, analogous to the reporting of comparable quality measures in protein crystallography. We wish to solicit views and suggestions from the simulation community on methods to obtain reliability measures from molecular-dynamics trajectories. In a database which provides access to previously obtained simulations, for example BioSimGrid (http://www.biosimgrid.org/), the user needs to be confident that the simulation trajectory is suitable for further investigation. This confidence can be provided by the simulation quality measures, which a user would examine prior to more extensive analyses.

OVERVIEW

For the past quarter century, biomolecular simulations have been adding value to structural biology via the provision of dynamic information [1]. As genomics moves from sequencing to structural and dynamical considerations, and high-throughput technologies advance from crystallography to molecular-dynamics (MD) simulation, this process is occurring with vigor. As the bibliometric data in Figure 1 show, MD simulation of biopolymers is now becoming a routine technique.
To help this maturation process, standardized practice should be established in the simulation community, similar to that in crystallography [2,3]. It is already regular practice to print quality measures in a formulaic table in published articles reporting crystallographic results; indeed, it is surprising if such a table is missing, and the referees would readily reject the manuscript.

We are hereby initiating a discussion on the appropriate measures of quality and convergence [4] for MD simulation trajectories of biopolymers. The process of calculating these measures is designed to be automated for large numbers of trajectories; hence the set of analyses used for this description should be general, with minimal interaction of a human curator. The scientist can then use these measures, along with sensible comparisons with known experimental data (which we recognize as essential), to decide whether a specific trajectory is suitable for further investigation. Our purpose is to solicit feedback from the simulation community with regard to the analyses we have chosen and to obtain further suggestions. We invite the community to express their views on our choices of measures.

We are motivated to do this by our work in building BioSimGrid [5], a distributed environment for archiving and analyzing biopolymer simulations. Other similar databases are emerging [6] (personal communications with Valerie Daggett and Modesto Orozco, http://mmb.pcb.u...
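As an illustration of what an automated, curator-free quality measure might look like, the following is a minimal Python sketch. It is not taken from the Letter: the excerpt above does not list the specific measures the authors propose, so the root-mean-square deviation (RMSD) from the first frame is used here purely as an assumed example of a simple, fast, and general trajectory statistic. The function names `rmsd_to_first_frame` and `summarize_trajectory`, and the in-memory array layout, are hypothetical and do not correspond to any BioSimGrid API. For brevity the sketch skips the structural superposition step that a real analysis would normally perform before computing RMSD.

```python
import numpy as np


def rmsd_to_first_frame(traj):
    """Per-frame RMSD relative to frame 0.

    traj: array-like of shape (n_frames, n_atoms, 3) holding coordinates.
    Assumption: frames are already aligned; no superposition is done here.
    """
    traj = np.asarray(traj, dtype=float)
    if traj.ndim != 3 or traj.shape[2] != 3:
        raise ValueError("expected shape (n_frames, n_atoms, 3)")
    diff = traj - traj[0]                # displacement of every atom from frame 0
    sq_dist = (diff ** 2).sum(axis=2)    # squared distance per atom, per frame
    return np.sqrt(sq_dist.mean(axis=1)) # average over atoms, then square root


def summarize_trajectory(traj):
    """Reduce the RMSD series to a few numbers suitable for a summary table."""
    rmsd = rmsd_to_first_frame(traj)
    half = len(rmsd) // 2
    return {
        "final_rmsd": float(rmsd[-1]),
        "max_rmsd": float(rmsd.max()),
        # Mean over the second half gives a crude indication of the plateau level.
        "mean_rmsd_second_half": float(rmsd[half:].mean()),
    }


if __name__ == "__main__":
    # Synthetic stand-in for a trajectory: 100 frames, 50 atoms, random-walk drift.
    rng = np.random.default_rng(0)
    steps = rng.normal(scale=0.05, size=(100, 50, 3))
    steps[0] = 0.0
    trajectory = rng.normal(size=(1, 50, 3)) + np.cumsum(steps, axis=0)
    print(summarize_trajectory(trajectory))
```

Because the summary depends only on a coordinate array, the same function could in principle be applied unchanged to many trajectories in a batch, which is the property the text asks of such measures. Reading real trajectory files would require a format-specific loader that is outside the scope of this sketch.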