Abstract: Background. Compared to engineering or physics problems, dynamical models in quantitative biology typically depend on a relatively large number of parameters. Progress in developing mathematics to manipulate such multi-parameter models and so enable their efficient interplay with experiments has been slow. Existing solutions are significantly limited by model size. Results. In order to simplify analysis of multi-parameter models a method for clustering of model parameters is proposed. It is based on a derived stati…
“…Comparison of the response trajectories corresponding to 1 and 100 ng/ml of TNF-α, Fig 2A, indicates the emergence of a fraction of cells that exhibit a second peak in response to the highest considered concentration. The second peak is reminiscent of the oscillatory behavior that is typical for the NF-κB pathway when exposed to continuous, as opposed to 5 minutes, stimulation [44, 48–50]. The second peak in response trajectories carries some information about TNF-α and, therefore, contributes to the second peak of information transfer.…”
Mathematical methods of information theory appear to provide a useful language to describe how stimuli are encoded in activities of signaling effectors. Exploring the information-theoretic perspective, however, remains conceptually, experimentally and computationally challenging. Specifically, existing computational tools enable efficient analysis of relatively simple systems, usually with one input and output only. Moreover, their robust and readily applicable implementations are missing. Here, we propose a novel algorithm, SLEMI (statistical learning based estimation of mutual information), to analyze signaling systems with high-dimensional outputs and a large number of input values. Our approach is efficient in terms of computational time as well as sample size needed for accurate estimation. Analysis of the NF-κB single-cell signaling responses to TNF-α reveals that NF-κB signaling dynamics improves discrimination of high concentrations of TNF-α with a relatively modest impact on discrimination of low concentrations. The provided R package allows the approach to be used by computational biologists with only elementary knowledge of information theory.
“…Additionally, both methods differ in the optimization technique: we use Variable Neighbourhood Search, which has better scalability than the genetic algorithm chosen in [54]. Recently, Nienałtowski et al [58] have proposed a method for finding clusters of correlated parameters using so-called canonical correlation analysis (CCA). CCA is an extension of Pearson correlation for measuring multidimensional correlations between groups of parameters.…”
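The statement above describes CCA as a multidimensional extension of Pearson correlation. A minimal sketch of that idea follows, using the standard QR-plus-SVD route to canonical correlations. The function name `canonical_correlations` and the data layout (rows are samples, columns are the parameters of one group) are assumptions made for illustration; this is not the implementation from either cited paper, and it assumes each group has full column rank.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two groups of variables.

    X and Y are 2-D arrays with one row per sample and one column per
    variable (hypothetical layout). Both are assumed to have full column
    rank after centering.
    """
    # Center each column so that inner products correspond to covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two groups.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the cosines of the principal angles
    # between the two subspaces, i.e. the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Guard against rounding slightly outside [0, 1].
    return np.clip(s, 0.0, 1.0)
```

With a single column in each group, the one canonical correlation equals the absolute value of the Pearson correlation, which is the sense in which CCA generalizes it.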
Background. Kinetic models of biochemical systems usually consist of ordinary differential equations that have many unknown parameters. Some of these parameters are often practically unidentifiable, that is, their values cannot be uniquely determined from the available data. Possible causes are lack of influence on the measured outputs, interdependence among parameters, and poor data quality. Uncorrelated parameters can be seen as the key tuning knobs of a predictive model. Therefore, before attempting to perform parameter estimation (model calibration) it is important to characterize the subset(s) of identifiable parameters and their interplay. Once this is achieved, it is still necessary to perform parameter estimation, which poses additional challenges. Methods. We present a methodology that (i) detects high-order relationships among parameters, and (ii) visualizes the results to facilitate further analysis. We use a collinearity index to quantify the correlation between parameters in a group in a computationally efficient way. Then we apply integer optimization to find the largest groups of uncorrelated parameters. We also use the collinearity index to identify small groups of highly correlated parameters. The results files can be visualized using Cytoscape, showing the identifiable and non-identifiable groups of parameters together with the model structure in the same graph. Results. Our contributions alleviate the difficulties that appear at different stages of the identifiability analysis and parameter estimation process. We show how to combine global optimization and regularization techniques for calibrating medium and large scale biological models with moderate computation times. Then we evaluate the practical identifiability of the estimated parameters using the proposed methodology. The identifiability analysis techniques are implemented as a MATLAB toolbox called VisId, which is freely available as open source from GitHub (https://github.com/gabora/visid). Conclusions. Our approach is geared towards scalability. It enables the practical identifiability analysis of dynamic models of large size, and accelerates their calibration. The visualization tool allows modellers to detect parts that are problematic and need refinement or reformulation, and provides experimentalists with information that can be helpful in the design of new experiments. Electronic supplementary material. The online version of this article (doi:10.1186/s12918-017-0428-y) contains supplementary material, which is available to authorized users.
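The abstract above relies on a collinearity index to score groups of parameters. As a rough illustration, the sketch below assumes the commonly used definition: normalize each sensitivity column to unit length, then take the reciprocal of the smallest singular value of the selected columns. The function name `collinearity_index` and the sensitivity-matrix layout are hypothetical, and the VisId toolbox itself is written in MATLAB and may compute the index differently.

```python
import numpy as np


def collinearity_index(S, cols):
    """Collinearity index of the parameter subset `cols`.

    S is a sensitivity matrix with one row per observation and one column
    per parameter (hypothetical input). Returns a value >= 1; large values
    indicate nearly linearly dependent sensitivities.
    """
    sub = np.asarray(S, dtype=float)[:, cols]
    # More parameters than observations: the columns must be dependent.
    if sub.shape[0] < sub.shape[1]:
        return np.inf
    norms = np.linalg.norm(sub, axis=0)
    # A zero column means the parameter does not influence the outputs.
    if np.any(norms == 0):
        return np.inf
    normalized = sub / norms
    # The smallest singular value equals the square root of the smallest
    # eigenvalue of normalized.T @ normalized.
    sigma_min = np.linalg.svd(normalized, compute_uv=False).min()
    return np.inf if sigma_min == 0 else 1.0 / sigma_min
```

Under this definition, mutually orthogonal sensitivities give an index of 1, and the index grows without bound as the columns approach linear dependence. The integer optimization step that searches for the largest low-index groups is not shown here.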
“…Only parameters which are consistent with measured data can be selected and jointly estimated (Hasenauer et al, 2010). Parameter clustering can also improve model tractability and identifiability, since changes in some parameters could be compensated by changes in other parameters (Nienaltowski et al, 2015). Grouping of parameters to elucidate dynamics of genetic circuit is assumed in (Atitey et al, 2019).…”
The key processes in biological and chemical systems are described by networks of chemical reactions. From molecular biology to biotechnology applications, computational models of reaction networks are used extensively to elucidate their non-linear dynamics. The model dynamics are crucially dependent on the parameter values which are often estimated from observations. Over the past decade, the interest in parameter and state estimation in models of (bio-)chemical reaction networks (BRNs) grew considerably. The related inference problems are also encountered in many other tasks including model calibration, discrimination, identifiability, and checking, as well as optimum experiment design, sensitivity analysis, and bifurcation analysis. The aim of this review paper is to examine the developments in the literature to understand what BRN models are commonly used, and for what inference tasks and inference methods. The initial collection of about 700 documents concerning estimation problems in BRNs, excluding books and textbooks in computational biology and chemistry, was screened to select over 270 research papers and 20 graduate research theses. The paper selection was facilitated by text mining scripts to automate the search for relevant keywords and terms. The outcomes are presented in tables that reveal the levels of interest in different inference tasks and methods for given models in the literature and uncover the research trends. Our findings indicate that many combinations of models, tasks and methods are still relatively unexplored, and there are many new research opportunities to explore combinations that have not been considered, perhaps for good reasons. The most common models of BRNs in the literature involve differential equations, Markov processes, mass action kinetics, and state space representations whereas the most common tasks are the parameter inference and model identification. The most common methods in the literature are Bayesian analysis, Monte Carlo sampling strategies, and model fitting to data using evolutionary algorithms. The new research problems which cannot be directly deduced from the text mining data are also discussed.