2023
DOI: 10.1101/2023.05.02.538537
Preprint

MDverse: Shedding Light on the Dark Matter of Molecular Dynamics Simulations

Abstract: The rise of open science and the absence of a global dedicated data repository for molecular dynamics (MD) simulations have led to the accumulation of MD files in generalist data repositories, constituting the dark matter of MD: data that is technically accessible, but neither indexed, curated, nor easily searchable. Leveraging an original search strategy, we found and indexed about 250,000 files and 2,000 datasets from Zenodo, Figshare and Open Science Framework. With a focus on files produced by the Gromacs MD …

citations
Cited by 6 publications
(12 citation statements)
references
References 113 publications
“…The VIAMD application is a promising solution for improving the efficiency and accuracy of MD analysis and has the potential to advance research in many scientific disciplines. We hope that VIAMD can be a tool for dissemination since more and more MD trajectories are available online in repositories, and there is an initiative for a search engine prototype to explore collected MD data. To reach a larger audience of computational chemists, we plan to extend the type of data that could be analyzed with VIAMD and develop new functionalities in the future.…”
Section: Discussion (mentioning)
confidence: 99%
“…These efforts go hand in hand with sharing of molecular dynamics data (Abraham et al, 2019; Hildebrand et al, 2019; Hospital et al, 2020), for which protocols have been designed (Tiemann et al, 2017; Pacheco et al, 2019; Amaro and Mulholland, 2020; Kampfrath et al, 2022; Tiemann et al, 2023). Indeed, in 2017, Tiemann et al (2017) developed MDsrv, a tool designed to interactively visualize MD trajectories within web browsers, without the need for specialized knowledge of MD software.…”
Section: Scaling Up To Simulation Ensembles and Data Sharing (mentioning)
confidence: 99%
“…In 2022, this tool was enhanced by Kampfrath et al (2022) to simplify the process of uploading and sharing MD trajectories, while also improving their online streaming and analysis capacity. In 2023, Tiemann et al indexed approximately 250,000 files and 2,000 datasets from Zenodo, Figshare and Open Science Framework (Tiemann et al, 2023). Pacheco et al (2019) developed PCAViz, a freely available toolkit designed for sharing and displaying MD trajectories directly through web browsers. PCAViz consists of two main components: the PCAViz Compressor, responsible for compressing and storing simulation data efficiently, and the PCAViz Interpreter, responsible for decompressing the data within users’ web browsers and seamlessly integrating it with various browser-based molecular visualization libraries such as 3Dmol.js, NGL Viewer, and others.…”
Section: Scaling Up To Simulation Ensembles and Data Sharing (mentioning)
confidence: 99%
“…The application of the FAIR principles to the publication of biomolecular simulation results is thus an important first step toward building relevant data sets and benchmarks and will encourage further development of the field by greatly increasing data availability. The publication of MD trajectories is also called for by Tiemann et al, who performed an extensive MD data mining exercise and demonstrated the utility of publicly accessible data. The prioritization of building data sets for specific proteins or protein families appears to be a reasonable next step, followed by exploring the possibility of transferring knowledge between different protein families.…”
Section: Major Gaps In the State Of The Art (mentioning)
confidence: 99%
“…The publication of MD trajectories is also called for by Tiemann et al, who performed an extensive MD data mining exercise and demonstrated the utility of publicly accessible data [290]. The prioritization of building data sets for specific proteins or protein families appears to be a reasonable next step, followed by exploring the possibility of transferring knowledge between different protein families [291].…”
Section: Major Gaps In the State Of The Art (mentioning)
confidence: 99%