Dynamical processes in biology are studied using an ever-increasing number of techniques, each of which brings out unique features of the system. One of the current challenges is to develop systematic approaches for fusing heterogeneous datasets into an integrated view of multivariable dynamics. We demonstrate that heterogeneous data fusion can be successfully implemented within a semi-supervised learning framework that exploits the intrinsic geometry of high-dimensional datasets. We illustrate our approach using a dataset from studies of pattern formation in Drosophila. The result is a continuous trajectory that reveals the joint dynamics of gene expression, subcellular protein localization, protein phosphorylation, and tissue morphogenesis. Our approach can be readily adapted to other imaging modalities and forms a starting point for further steps of data analytics and modeling of biological dynamics.
Author summary
A wide range of problems in biology require analysis of multivariable dynamics in space and time. As a rule, the multiscale nature and complexity of real systems preclude simultaneous monitoring of all the relevant variables, and multivariable dynamics must be synthesized from partial views provided by different experimental techniques. We present a formal framework for accomplishing this task in the context of imaging studies of pattern formation in developing tissues.