Chul Moon scite author profile

2019

The fingerprint classification problem is to sort fingerprints into pre-determined groups, such as arch, loop, and whorl. It was asserted in the literature that minutiae points, which are commonly used for fingerprint matching, are not useful for classification. We show that, to the contrary, near state-of-the-art classification accuracy rates can be achieved when applying topological data analysis (TDA) to 3-dimensional point clouds of oriented minutiae points. We also apply TDA to fingerprint ink-roll images, which yields a lower accuracy rate but still shows promise, particularly since the only preprocessing is cropping; moreover, combining the two approaches outperforms each one individually. These methods use supervised learning applied to persistent homology and allow us to explore feature selection on barcodes, an important topic at the interface between TDA and machine learning. We test our classification algorithms on the NIST fingerprint database SD-27.

Statistical Inference Over Persistent Homology Predicts Fluid Flow in Porous Media

Water Resources Research

Mitchell

Heath

et al. 2019

We statistically infer fluid flow and transport properties of porous materials based on their geometry and connectivity, without the need for detailed [...]. We summarize structure by persistent homology and then determine the similarity of structures using image analysis and statistics. Longer term, this may enable quick and automated categorization of rocks into known archetypes. We first compute persistent homology of binarized 3D images of material subvolume samples. The persistence parameter is the signed Euclidean distance from inferred material interfaces, which captures the distribution of sizes of pores and grains. Each persistence diagram is converted into an image vector. We infer structural similarity by calculating image similarity. For each image vector, we compute principal components to extract features. We fit statistical models to these features to estimate material permeability, tortuosity, and anisotropy. We develop a Structural SIMilarity index to determine statistical representative elementary volumes.
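The diagram-to-image-vector and principal-component steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `persistence_image` and `pca_features`, the grid size, the Gaussian width `sigma`, and the randomly generated toy diagrams are all assumptions made for the example.

```python
import numpy as np

def persistence_image(diagram, grid=8, sigma=0.1, lo=0.0, hi=1.0):
    """Rasterize (birth, death) pairs into a flattened grid x grid image.

    Each pair contributes a Gaussian bump at (birth, persistence),
    weighted by its persistence. Grid size and sigma are assumed values.
    """
    centers = np.linspace(lo, hi, grid)
    bx, py = np.meshgrid(centers, centers)
    img = np.zeros_like(bx)
    for birth, death in diagram:
        pers = death - birth
        bump = np.exp(-((bx - birth) ** 2 + (py - pers) ** 2) / (2 * sigma ** 2))
        img += pers * bump
    return img.ravel()

def pca_features(vectors, n_components=2):
    """Project mean-centered image vectors onto their leading principal axes."""
    X = np.asarray(vectors, dtype=float)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Toy diagrams (hypothetical data, only to exercise the code).
rng = np.random.default_rng(0)
diagrams = []
for _ in range(6):
    births = rng.uniform(0.0, 0.5, size=5)
    deaths = births + rng.uniform(0.05, 0.5, size=5)
    diagrams.append(np.column_stack([births, deaths]))

vectors = np.stack([persistence_image(d) for d in diagrams])
features = pca_features(vectors, n_components=2)
```

The resulting `features` array has one row per sample and could serve as the input to a regression model for a property such as permeability; that modelling step is omitted here because the abstract does not specify it.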

Persistence Terrace for Topological Inference of Point Cloud Data

Journal of Computational and Graphical Statistics

Giansiracusa

Lazar

2018

Topological data analysis (TDA) is a rapidly developing collection of methods for studying the shape of point cloud and other data types. One popular approach, designed to be robust to noise and outliers, is to first use a smoothing function to convert the point cloud into a manifold and then apply persistent homology to a Morse filtration. A significant challenge is that this smoothing process involves the choice of a parameter and persistent homology is highly sensitive to that choice; moreover, important scale information is lost. We propose a novel topological summary plot, called a persistence terrace, that incorporates a wide range of smoothing parameters and is robust, multi-scale, and parameter-free. This plot allows one to isolate distinct topological signals that may have merged for any fixed value of the smoothing parameter, and it also allows one to infer the size and point density of the topological features. We illustrate our method in some simple settings where noise is a serious issue for existing frameworks and then we apply it to a real data set by counting muscle fibers in a cross-sectional image.
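The idea of scanning many smoothing parameters instead of fixing one can be illustrated with a loose one-dimensional sketch. It counts connected components of the superlevel sets of a Gaussian kernel density estimate over a grid of bandwidths and levels. This is not the authors' persistence terrace construction; the helper names, the level fractions, and the toy sample are assumptions chosen for the example.

```python
import numpy as np

def kde_1d(sample, grid, bandwidth):
    """Gaussian kernel density estimate of a 1D sample, evaluated on grid."""
    diffs = (grid[:, None] - sample[None, :]) / bandwidth
    norm = len(sample) * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * diffs ** 2).sum(axis=1) / norm

def count_components(mask):
    """Number of maximal runs of True in a 1D boolean array."""
    m = np.asarray(mask, dtype=int)
    return int(m[0] + np.sum((m[1:] == 1) & (m[:-1] == 0)))

def betti0_surface(sample, grid, bandwidths, n_levels=9):
    """Component counts of {density > level} for each (bandwidth, level).

    Levels are fractions (0.1 to 0.9) of the maximum density at each
    bandwidth; this normalization is an assumption of the sketch.
    """
    fractions = np.linspace(0.1, 0.9, n_levels)
    surface = np.zeros((len(bandwidths), n_levels), dtype=int)
    for i, h in enumerate(bandwidths):
        density = kde_1d(sample, grid, h)
        for j, frac in enumerate(fractions):
            surface[i, j] = count_components(density > frac * density.max())
    return surface

# Two tight clusters: separate at a small bandwidth, merged at a large one.
sample = np.concatenate([np.linspace(-2.1, -1.9, 10), np.linspace(1.9, 2.1, 10)])
grid = np.linspace(-5.0, 5.0, 501)
surface = betti0_surface(sample, grid, bandwidths=[0.2, 3.0])
```

Each row of `surface` corresponds to one bandwidth, so comparing rows shows how topological signals that are distinct at a small smoothing scale merge at a larger one.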

Using Persistent Homology Topological Features to Characterize Medical Images: Case Studies on Lung and Brain Cancers

Li²,

Xiao³

2020

Preprint

SAFARI: shape analysis for AI-segmented images

et al. 2022

Background: Recent developments to segment and characterize the regions of interest (ROI) within medical images have led to promising shape analysis studies. However, the procedures to analyze the ROI are arbitrary and vary by study. A tool to translate the ROI to analyzable shape representations and features is greatly needed. Results: We developed SAFARI (shape analysis for AI-segmented images), an open-source package with a user-friendly online tool kit for ROI labelling and shape feature extraction of segmented maps, provided by AI algorithms or manual segmentation. We demonstrated that half of the shape features extracted by SAFARI were significantly associated with survival outcomes in a case study on 143 consecutive patients with stage I–IV lung cancer and another case study on 61 glioblastoma patients. Conclusions: SAFARI is an efficient and easy-to-use toolkit for segmenting and analyzing ROI in medical images. It can be downloaded from the comprehensive R archive network (CRAN) and accessed at https://lce.biohpc.swmed.edu/safari/.

Empirical likelihood inference for area under the receiver operating characteristic curve using ranked set samples

Pharmaceutical Statistics

Wang

Lim

2022

The area under a receiver operating characteristic curve (AUC) is a useful tool to assess the performance of continuous‐scale diagnostic tests on binary classification. In this article, we propose an empirical likelihood (EL) method to construct confidence intervals for the AUC from data collected by ranked set sampling (RSS). The proposed EL‐based method enables inferences without assumptions required in existing nonparametric methods and takes advantage of the sampling efficiency of RSS. We show that for both balanced and unbalanced RSS, the EL‐based point estimate is the Mann–Whitney statistic, and confidence intervals can be obtained from a scaled chi‐square distribution. Simulation studies and two case studies on diabetes and chronic kidney disease data suggest that using the proposed method and RSS enables more efficient inference on the AUC.
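For reference, the Mann–Whitney statistic named in this abstract can be computed as below for two plain samples of test scores. This is a minimal sketch: it ignores the ranked set sampling structure and does not construct the empirical likelihood confidence intervals the article proposes. The function name and the toy scores are assumptions made for the example.

```python
import numpy as np

def auc_mann_whitney(diseased, healthy):
    """AUC point estimate: P(X > Y) + 0.5 * P(X = Y) over all pairs.

    X ranges over scores of diseased subjects, Y over healthy subjects.
    """
    x = np.asarray(diseased, dtype=float)[:, None]
    y = np.asarray(healthy, dtype=float)[None, :]
    return float(np.mean((x > y) + 0.5 * (x == y)))

# Hypothetical diagnostic scores, for illustration only.
scores_diseased = [0.9, 0.8, 0.6, 0.55]
scores_healthy = [0.7, 0.5, 0.4, 0.55]
auc = auc_mann_whitney(scores_diseased, scores_healthy)
```

Ties count as half a concordant pair, which keeps the estimate at 0.5 when the two groups have identical scores.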

Bayesian Landmark-based Shape Analysis of Tumor Pathology Images

Zhang¹,

Xiao²,

Moon³

et al. 2020

Preprint

Medical imaging is a form of technology that has revolutionized the medical field in the past century. In addition to radiology imaging of tumor tissues, digital pathology imaging, which captures histological details in high spatial resolution, is fast becoming a routine clinical procedure for cancer diagnosis support and treatment planning. Recent developments in deep-learning methods facilitate the segmentation of tumor regions at almost the cellular level from digital pathology images. The traditional shape features that were developed for characterizing tumor boundary roughness in radiology are not applicable. Reliable statistical approaches to modeling tumor shape in pathology images are urgently needed. In this paper, we consider the problem of modeling a tumor boundary with a closed polygonal chain. A Bayesian landmark-based shape analysis (BayesLASA) model is proposed to partition the polygonal chain into mutually exclusive segments to quantify the boundary roughness piecewise. Our fully Bayesian inference framework provides uncertainty estimates of both the number and locations of landmarks. BayesLASA outperforms a recently developed landmark detection model for planar elastic curves in terms of accuracy and efficiency. We demonstrate how this model-based analysis can lead to sharper inferences than ordinary approaches through a case study on 246 pathology images from 143 non-small cell lung cancer patients. The case study shows that the heterogeneity of tumor boundary roughness predicts patient prognosis (p-value < 0.001). This statistical methodology not only presents a new model for characterizing a digitized object's shape features by using its landmarks, but also provides a new perspective for understanding the role of tumor surface in cancer progression.

Applications of a Flowfield-Dependent Mixed Explicit-Implicit (FDMEI) Method to Heat and Fluid Dynamics Problems

Numerical Heat Transfer, Part B: Fundamentals

Moon¹

2001