Reproducibility of scientific results is a cornerstone of science and of its credibility. A lack of reproducibility has emerged as an important concern across many scientific fields. In this piece, we assess the reproducibility of mathematical models and propose a scorecard for improving reproducibility in this field.
The European Bioinformatics Institute (EMBL-EBI) maintains a comprehensive range of freely available and up-to-date molecular data resources, comprising over 40 resources that cover every major data type in the life sciences. This year's service update for EMBL-EBI includes new resources, the PGS Catalog and AlphaFold DB, and updates to existing resources, including the COVID-19 Data Platform, the trRosetta and RoseTTAFold models introduced in Pfam and InterPro, and the launch of Genome Integrations with Function and Sequence by UniProt and Ensembl. Furthermore, we highlight projects through which EMBL-EBI has contributed to the development of community-driven data standards and guidelines, including the Recommended Metadata for Biological Images (REMBI) and the BioModels Reproducibility Scorecard. Training is one of EMBL-EBI's core missions and a key component of the provision of bioinformatics services to users: this year's update covers many of the improvements made to EMBL-EBI's online training offering.
The reproducibility crisis has emerged as an important concern across many fields of science, including the life sciences, since many published results have failed to reproduce. Systems biology modelling, which involves mathematical representation of biological processes to study complex system behaviour, was expected to be among the least affected by this crisis. While the lack of reproducibility of experimental results and computational analyses can be a repercussion of several compounded factors, it was not fully understood why systems biology models with well-defined mathematical expressions fail to reproduce, or how prevalent the problem is. Hence, we systematically attempted to reproduce 455 kinetic models of biological processes published in peer-reviewed research articles from 152 journals, collectively the work of about 1400 scientists from 49 countries. Our investigation revealed that about half (49%) of the models could not be reproduced using the information provided in the published manuscripts. With further effort, an additional 12% of the models could be reproduced, either through empirical correction or with support from the authors. The remaining 37% were non-reproducible due to missing parameter values, missing initial concentrations, inconsistent model structure, or a combination of these factors. Among the corresponding authors of the non-reproducible models we contacted, fewer than 30% responded. Our analysis revealed that models published in journals across several fields of life science failed to reproduce, exposing a common problem in the peer-review process. Hence, we propose an 8-point reproducibility scorecard that can be used by authors, reviewers and journal editors to assess each model and address the reproducibility crisis.
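An assessment like the one above reduces, in practice, to tallying how many checklist items a published model satisfies. The sketch below is purely illustrative: the item names are assumptions, not the published 8-point scorecard, which is not reproduced in this abstract.

```python
# Hypothetical sketch of tallying an 8-point reproducibility scorecard.
# The item names below are illustrative assumptions, NOT the published checklist.

SCORECARD_ITEMS = [
    "model_equations_complete",
    "parameter_values_listed",
    "initial_conditions_listed",
    "units_specified",
    "simulation_settings_described",
    "software_version_stated",
    "machine_readable_model_provided",
    "simulated_figures_match_manuscript",
]

def score_model(answers: dict) -> int:
    """Count how many of the 8 checklist items a model satisfies."""
    return sum(1 for item in SCORECARD_ITEMS if answers.get(item, False))

# A model missing its parameter values loses one point.
answers = {item: True for item in SCORECARD_ITEMS}
answers["parameter_values_listed"] = False
print(score_model(answers))  # prints 7
```

A simple tally like this lets authors, reviewers and editors apply the same yardstick to every submitted model.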
During the COVID-19 pandemic, mathematical modeling of disease transmission became a cornerstone of key state decisions. To advance the state of the art in host-viral modeling and prepare for future pandemics, many scientists working on related problems assembled to discuss these topics. The discussions exposed a reproducibility crisis that leads to an inability to reuse and integrate models. This document summarizes those discussions, presents the difficulties, and points to existing efforts towards solutions that will enable future model reuse and integration. We argue that without addressing these challenges, scientists will have a diminished ability to build, disseminate, and implement the high-impact multi-scale models needed to understand the health crises we face.
As an alternative to one drug-one target approaches, systems biology methods can provide deeper insight into the holistic effects of drugs. Network-based approaches, as tools of systems biology, offer valuable methods for visualizing and analysing drug-protein and protein-protein interactions. In this study, a KNIME workflow is presented that connects drugs to causal target proteins and target proteins to their causal protein interactors. From the collected data, networks can be constructed for visualizing and interpreting the connections. The last part of the workflow provides a topological enrichment test for identifying relevant pathways and processes connected to the submitted data. The workflow is based on openly available databases and their web services. As a case study, compounds of DILIRank were analysed. DILIRank is the FDA's benchmark dataset for Drug-Induced Liver Injury (DILI), in which compounds are categorized by their likelihood of causing DILI. The study includes the drugs that are most likely to cause DILI ("mostDILI") and the ones that are not likely to cause DILI ("noDILI"). After selecting the compounds of interest, down- and upregulated proteins connected to the mostDILI group were identified; furthermore, a liver-specific subset of those was created. The downregulated sub-list had considerably more entries; therefore, the network and causal interactome were constructed and topological pathway enrichment analysis was performed with this list. The workflow identified proteins such as Prostaglandin G/H synthase 1 and UDP-glucuronosyltransferase 1A9 as key participants in the potential toxic events, disclosing the possible mode of action. The topological network analysis yielded pathways such as recycling of bile acids and salts, and glucuronidation, indicating their involvement in DILI.
The KNIME pipeline was built to support target- and network-based approaches for analysing any set of drug data and identifying the drugs' target proteins, modes of action and the processes they are involved in. Fragments of the pipeline can be used separately or combined as required.
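The workflow described above proceeds in four stages: drugs to causal targets, targets to causal interactors, network construction, and pathway enrichment. The sketch below mirrors those stages as plain Python functions with stubbed toy lookups; the actual workflow is implemented as KNIME nodes querying open databases via their web services, and the toy mappings here are assumptions, not real database content.

```python
# Illustrative sketch of the four workflow stages as plain Python functions.
# All lookups are stubbed with toy data; the real workflow queries open
# databases through their web services from within KNIME.

def drugs_to_targets(drugs):
    """Stage 1: map each drug to its causal target proteins (stubbed)."""
    toy = {"drugA": ["PTGS1"], "drugB": ["UGT1A9"]}
    return {d: toy.get(d, []) for d in drugs}

def targets_to_interactors(targets):
    """Stage 2: map each target to its causal protein interactors (stubbed)."""
    toy = {"PTGS1": ["ALOX5"], "UGT1A9": ["ABCC2"]}
    return {t: toy.get(t, []) for t in targets}

def build_network(drug_targets, interactions):
    """Stage 3: collect drug-protein and protein-protein edges."""
    edges = [(d, t) for d, ts in drug_targets.items() for t in ts]
    edges += [(t, i) for t, ints in interactions.items() for i in ints]
    return edges

def enriched_pathways(proteins):
    """Stage 4: toy stand-in for topological pathway enrichment (stubbed)."""
    toy = {"PTGS1": "prostaglandin synthesis", "UGT1A9": "glucuronidation"}
    return sorted({toy[p] for p in proteins if p in toy})

drug_targets = drugs_to_targets(["drugA", "drugB"])
all_targets = [t for ts in drug_targets.values() for t in ts]
interactions = targets_to_interactors(all_targets)
network = build_network(drug_targets, interactions)
pathways = enriched_pathways(all_targets)
print(network)
print(pathways)  # prints ['glucuronidation', 'prostaglandin synthesis']
```

Structuring the stages as independent functions reflects the abstract's point that fragments of the pipeline can be used separately or combined as required.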