We present the Pareto task inference method (ParTI; http://www.weizmann.ac.il/mcb/UriAlon/download/ParTI) for inferring biological tasks from high-dimensional biological data. Data are described as a polytope, and features maximally enriched closest to the vertices (or archetypes) allow identification of the tasks the vertices represent. We demonstrate that human breast tumors and mouse tissues are well described by tetrahedrons in gene expression space, with specific tumor types and biological functions enriched at each of the vertices, suggesting four key tasks.
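The enrichment idea above can be sketched on synthetic data. This is only an illustration, not the ParTI implementation: the function name `enrichment_near_vertex`, the equal-sized binning scheme, and the toy triangle are all assumptions made for the example.

```python
import numpy as np

def enrichment_near_vertex(points, vertex, feature, n_bins=5):
    """Mean of `feature` in equal-sized bins of points, ordered from the
    bin closest to `vertex` to the bin farthest from it (hypothetical helper)."""
    dist = np.linalg.norm(points - vertex, axis=1)   # Euclidean distance to the archetype
    order = np.argsort(dist)                         # closest points first
    bins = np.array_split(order, n_bins)
    return np.array([feature[idx].mean() for idx in bins])

# Synthetic data: points are random convex mixtures of three triangle vertices.
rng = np.random.default_rng(0)
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
weights = rng.dirichlet(np.ones(3), size=300)
points = weights @ vertices

# A toy feature that tracks the weight on vertex 0, plus a little noise.
feature = weights[:, 0] + 0.05 * rng.normal(size=300)

bin_means = enrichment_near_vertex(points, vertices[0], feature)
print(bin_means)  # the first bin (closest to vertex 0) should have the highest mean
```

A feature counts as enriched at a vertex in this sketch when its mean is highest in the bin closest to that vertex and falls off with distance.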
Recent advances have enabled powerful methods to sort tumors into prognosis and treatment groups. We are still missing, however, a general theoretical framework to understand the vast diversity of tumor gene expression and mutations. Here we present a framework based on multi-task evolution theory, using the fact that tumors need to perform multiple tasks that contribute to their fitness. We find that trade-offs between tasks constrain tumor gene expression to a continuum bounded by a polyhedron whose vertices are gene-expression profiles, each specializing in one task. We find five universal cancer tasks across tissue types: cell division; biomass and energy; lipogenesis; immune interaction; and invasion and tissue remodeling. Tumors that specialize in a task are sensitive to drugs that interfere with this task. Driver, but not passenger, mutations tune gene expression towards specialization in specific tasks. This approach can integrate additional types of molecular data into a framework of tumor diversity grounded in evolutionary theory.
There is a revolution in the ability to analyze gene expression of single cells in a tissue. To understand this data we must comprehend how cells are distributed in a high-dimensional gene expression space. One open question is whether cell types form discrete clusters or whether gene expression forms a continuum of states. If such a continuum exists, what is its geometry? Recent theory on evolutionary trade-offs suggests that cells that need to perform multiple tasks are arranged in a polygon or polyhedron (line, triangle, tetrahedron and so on, generally called polytopes) in gene expression space, whose vertices are the expression profiles optimal for each task. Here, we analyze single-cell data from human and mouse tissues profiled using a variety of single-cell technologies. We fit the data to shapes with different numbers of vertices, compute their statistical significance, and infer their tasks. We find cases in which single cells fill out a continuum of expression states within a polyhedron. This occurs in intestinal progenitor cells, which fill out a tetrahedron in gene expression space. The four vertices of this tetrahedron are each enriched with genes for a specific task related to stemness and early differentiation. A polyhedral continuum of states is also found in spleen dendritic cells, known to perform multiple immune tasks: cells fill out a tetrahedron whose vertices correspond to key tasks related to maturation, pathogen sensing and communication with lymphocytes. A mixture of continuum-like distributions and discrete clusters is found in other cell types, including bone marrow and differentiated intestinal crypt cells. This approach can be used to understand the geometry and biological tasks of a wide range of single-cell datasets. The present results suggest that the concept of cell type may be expanded. 
In addition to discrete clusters in gene-expression space, we suggest a new possibility: a continuum of states within a polyhedron, in which the vertices represent specialists at key tasks.
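One way to picture a continuum of states inside a polyhedron is to write each expression profile as a weighted mixture of the vertex profiles. The sketch below computes such weights (barycentric coordinates) for a point in a tetrahedron. The function name `task_weights` and the unit tetrahedron are illustrative assumptions, and the abstract does not say that the authors compute weights this way.

```python
import numpy as np

def task_weights(point, vertices):
    """Barycentric coordinates of `point` with respect to a simplex
    (k+1 vertices in k dimensions). Hypothetical helper for illustration."""
    vertices = np.asarray(vertices, dtype=float)
    # Solve sum_i w_i * v_i = point together with sum_i w_i = 1.
    A = np.vstack([vertices.T, np.ones(len(vertices))])
    b = np.append(np.asarray(point, dtype=float), 1.0)
    return np.linalg.solve(A, b)

# Toy tetrahedron in a 3-dimensional (for example, PCA-reduced) expression space.
tetra = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

w_center = task_weights([0.25, 0.25, 0.25], tetra)   # a generalist: equal weights
w_vertex = task_weights(tetra[1], tetra)             # a specialist: all weight on one vertex
w_outside = task_weights([1.0, 1.0, 1.0], tetra)     # outside the simplex: a negative weight
print(w_center, w_vertex, w_outside)
```

Points inside the tetrahedron have all weights between 0 and 1, so the weights can be read as how strongly a cell leans towards each vertex.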
Evolution repeatedly converges on only a few regulatory circuit designs that achieve a given function. This simplicity helps us understand biological networks. However, why so few circuits are rediscovered by evolution is unclear. We address this question for the case of fold-change detection (FCD): a response to relative changes of input rather than absolute changes. Two types of FCD circuits recur in biological systems: the incoherent feedforward loop and the non-linear integral-feedback loop. We performed an analytical screen of all three-node circuits in a class comprising ∼500,000 topologies. We find that FCD is rare, yet hundreds of FCD topologies exist. The two experimentally observed circuits are among the very few minimal circuits that optimally trade off speed, noise resistance, and response amplitude. This suggests a way to understand why evolution converges on only a few topologies for a given function and provides FCD designs for synthetic construction and future discovery.
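The FCD property can be checked numerically on a small model. The sketch below assumes a commonly used dimensionless form of the incoherent feedforward loop, dY/dt = X - Y and dZ/dt = X/Y - Z. These equations, the function name `simulate_i1ffl`, and the Euler step size are assumptions for illustration and are not taken from the abstract.

```python
import numpy as np

def simulate_i1ffl(x_before, x_after, n_steps=10000, dt=0.001):
    """Euler simulation of an assumed incoherent feedforward loop:
        dY/dt = X - Y,   dZ/dt = X / Y - Z
    starting at the steady state for input x_before, then stepping to x_after.
    Returns the output trace Z(t)."""
    y, z = float(x_before), 1.0          # steady state: Y = X, Z = X / Y = 1
    z_trace = np.empty(n_steps)
    for i in range(n_steps):
        dy = x_after - y
        dz = x_after / y - z
        y += dt * dy
        z += dt * dz
        z_trace[i] = z
    return z_trace

z_1_to_2 = simulate_i1ffl(1.0, 2.0)   # two-fold step from a low background
z_2_to_4 = simulate_i1ffl(2.0, 4.0)   # same fold change, higher background
z_2_to_3 = simulate_i1ffl(2.0, 3.0)   # same absolute change as 1 -> 2, smaller fold

print(np.allclose(z_1_to_2, z_2_to_4))   # same fold change, same response
print(np.allclose(z_1_to_2, z_2_to_3))   # same absolute change, different response
```

In this model the output depends on the input only through the ratio X/Y, and Y scales with the background input, so steps with equal fold change produce the same pulse and the output then adapts back to its baseline.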
Biological regulatory systems face a fundamental tradeoff: they must be effective but at the same time also economical. For example, regulatory systems that are designed to repair damage must be effective in reducing damage, but economical in not making too many repair proteins because making excessive proteins carries a fitness cost to the cell, called protein burden. In order to see how biological systems compromise between the two tasks of effectiveness and economy, we applied an approach from economics and engineering called Pareto optimality. This approach allows calculating the best-compromise systems that optimally combine the two tasks. We used a simple and general model for regulation, known as integral feedback, and showed that best-compromise systems have particular combinations of biochemical parameters that control the response rate and basal level. We find that the optimal systems fall on a curve in parameter space. Due to this feature, even if one is able to measure only a small fraction of the system's parameters, one can infer the rest. We applied this approach to estimate parameters in three biological systems: response to heat shock and response to DNA damage in bacteria, and calcium homeostasis in mammals.
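The notion of best-compromise systems can be illustrated with a generic Pareto filter over two objectives that are both minimized, here labeled residual damage and protein cost. The candidate values are made up for the example, and `pareto_front` is a hypothetical helper rather than the authors' analysis of the integral-feedback model.

```python
import numpy as np

def pareto_front(costs):
    """Boolean mask of non-dominated rows, with every column minimized.
    A row is dominated if another row is no worse in every objective
    and strictly better in at least one."""
    costs = np.asarray(costs, dtype=float)
    mask = np.ones(len(costs), dtype=bool)
    for i, c in enumerate(costs):
        dominates_c = np.all(costs <= c, axis=1) & np.any(costs < c, axis=1)
        mask[i] = not dominates_c.any()
    return mask

# Made-up candidate systems: columns are (residual damage, protein cost).
systems = np.array([[1.0, 9.0],
                    [2.0, 5.0],
                    [3.0, 6.0],   # worse than [2, 5] in both objectives
                    [4.0, 2.0],
                    [5.0, 3.0],   # worse than [4, 2] in both objectives
                    [8.0, 1.0]])

mask = pareto_front(systems)
print(systems[mask])  # the best-compromise (non-dominated) systems
```

The surviving rows trace out the trade-off curve: moving along it lowers one objective only by raising the other.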