Highly resolved spatial data of complex systems encode rich and nonlinear information. Quantifying heterogeneous and noisy data, such as those from tissues, which often contain outliers, artifacts, and mislabeled points, remains a challenge. Topological data analysis (TDA), the mathematical field that extracts information from the shape of data, has expanded its capability for analyzing real-world datasets in recent years through advances in theory, statistics, and computation. One extension of the standard theory that handles heterogeneous data is multiparameter persistent homology (MPH). Here we provide an application of MPH landscapes, a statistical tool with theoretical underpinnings. MPH landscapes, computed for (noisy) data from agent-based model simulations of immune cells infiltrating a spheroid, are shown to surpass existing spatial statistics and one-parameter persistent homology. We then apply MPH landscapes to study immune cell location in digital histology images from head and neck cancer. We quantify intratumoral immune cells and find that infiltrating regulatory T cells have more prominent voids in their spatial patterns than macrophages. Finally, we consider how TDA can integrate and interrogate data of different types and scales, e.g., immune cell locations and regions with differing levels of oxygenation. This work highlights the power of MPH landscapes for quantifying, characterizing, and comparing features within the tumor microenvironment in synthetic and real datasets.
Understanding how knotted proteins fold is a challenging problem in biology. Researchers have proposed several models for their folding pathways, based on theory, simulations and experiments. The geometry of proteins with the same knot type can vary substantially, and recent simulations reveal different folding behaviour for deeply and shallowly knotted proteins. We analyse proteins forming open-ended trefoil knots by introducing a topologically inspired statistical metric that measures their entanglement. By looking directly at the geometry and topology of their native states, we are able to probe different folding pathways for such proteins. In particular, the folding pathway of shallowly knotted carbonic anhydrases involves the creation of a double-looped structure, contrary to what has been observed for other trefoil-knotted proteins. We validate this with molecular dynamics simulations. By leveraging the geometry and local symmetries of knotted proteins’ native states, we provide the first numerical evidence of a double-loop folding mechanism in trefoil proteins.
Let M be a compact, unit-volume Riemannian manifold with boundary. We study the homology of a random Čech complex generated by a homogeneous Poisson process in M. Our main results are two asymptotic threshold formulas: an upper threshold above which the Čech complex recovers the kth homology of M with high probability, and a lower threshold below which it almost certainly does not. These thresholds share the same leading term. This extends work of Bobrowski-Weinberger and Bobrowski-Oliveira, who established similar formulas when M has no boundary. The cases with and without boundary differ: the common leading term of the upper and lower thresholds is log(n) when M is closed and (2 − 2/d) log(n) when M has boundary. Here n is the expected number of sample points and d is the dimension of M. Our analysis identifies a special type of homological cycle occurring close to the boundary.
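To make the comparison of the two leading terms concrete, the following minimal sketch evaluates log(n) and (2 − 2/d) log(n) for a few values of d. It assumes that d denotes the dimension of M; the function name `leading_term` and the sample values of n and d are illustrative choices, not taken from the paper, and lower-order terms of the thresholds are not modelled.

```python
import math

def leading_term(n, d, has_boundary):
    """Leading term of the homology-recovery threshold, as quoted in the abstract.

    n: expected number of sample points.
    d: assumed here to be the dimension of the manifold M.
    has_boundary: True for a manifold with boundary, False for a closed manifold.
    """
    factor = 2.0 - 2.0 / d if has_boundary else 1.0
    return factor * math.log(n)

if __name__ == "__main__":
    n = 10_000  # illustrative sample size
    for d in (2, 3, 4):
        closed = leading_term(n, d, has_boundary=False)
        bounded = leading_term(n, d, has_boundary=True)
        print(f"d={d}: closed={closed:.3f}, with boundary={bounded:.3f}")
```

Under this reading of d, the two expressions coincide when d = 2, and the boundary case is strictly larger for every d > 2.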
An important problem in the field of Topological Data Analysis is defining topological summaries which can be combined with traditional data analytic tools. In recent work, Bubenik introduced the persistence landscape, a stable representation of persistence diagrams amenable to statistical analysis and machine learning tools. In this paper, we generalise the persistence landscape to multiparameter persistence modules, providing a stable representation of the rank invariant. We show that multiparameter landscapes are stable with respect to the interleaving distance and the persistence-weighted Wasserstein distance, and that the collection of multiparameter landscapes faithfully represents the rank invariant. Finally, we provide example calculations and statistical tests to demonstrate a range of potential applications and how one can interpret the landscapes associated to a multiparameter module.
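For orientation, here is a minimal sketch of the one-parameter persistence landscape that the paper generalises. It is not the multiparameter construction itself. It uses the standard definition in which the k-th landscape function at t is the k-th largest value of max(0, min(t − b, d − t)) over the (birth, death) pairs of a persistence diagram. The function name `landscape` and the example diagram are made up for illustration.

```python
def landscape(diagram, k, t):
    """Evaluate the k-th one-parameter persistence landscape (k = 1, 2, ...) at t.

    diagram: iterable of (birth, death) pairs with birth <= death.
    """
    # Each pair contributes a "tent" function peaking at the midpoint of its interval.
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in diagram),
        reverse=True,
    )
    # The k-th landscape is the k-th largest tent value, or 0 if there are fewer than k pairs.
    return tents[k - 1] if k <= len(tents) else 0.0

if __name__ == "__main__":
    diagram = [(0.0, 4.0), (1.0, 3.0)]  # hypothetical persistence diagram
    for k in (1, 2, 3):
        print(k, [landscape(diagram, k, t) for t in (0.0, 1.0, 2.0, 3.0, 4.0)])
```

Because each landscape is an ordinary real-valued function, landscapes from several samples can be averaged pointwise, which is what makes them convenient for statistical tests.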