Veronika Strnadová-Neeley scite author profile

Veronika Strnadová-Neeley

3 Publications

3 Citation Statements Received

85 Citation Statements Given

Affiliations

Montana State University

Publications

Efficient data reduction for large-scale genetic mapping

Strnadová-Neeley, Buluç, Chapman, et al. 2015

We present a fast and accurate algorithm for reducing large-scale genetic marker data to a smaller, less noisy, and more complete set of bins, representing uniquely identifiable locations on a chromosome. Our experimental results on real and synthetic data show that our algorithm runs in near-linear time, allowing for the analysis of millions of markers. Our algorithm reduces the problem scale while preserving accuracy, making it feasible to use existing genetic mapping tools without resorting to complex, time-intensive preprocessing methods to filter or sample the original data set. Our approach also decreases the uncertainty in genotype calls, improving the quality of the data. Preliminary results demonstrate that existing marker-ordering methods designed for small-scale settings perform with equivalent accuracy when given our reduced bin set as input.

Geodesic Forests

Madhyastha, Strnadová-Neeley, et al. 2020

Together with the curse of dimensionality, nonlinear dependencies in large data sets persist as major challenges in data mining tasks. A reliable way to accurately preserve nonlinear structure is to compute geodesic distances between data points. Manifold learning methods, such as Isomap, aim to preserve geodesic distances in a Riemannian manifold. However, as manifold learning algorithms operate on the ambient dimensionality of the data, the essential step of geodesic distance computation is sensitive to high-dimensional noise. Therefore, a direct application of these algorithms to high-dimensional, noisy data often yields unsatisfactory results and does not accurately capture nonlinear structure. We propose an unsupervised random forest approach called geodesic forests (GF) to geodesic distance estimation in linear and nonlinear manifolds with noise. GF operates on low-dimensional sparse linear combinations of features, rather than the full observed dimensionality. To choose the optimal split in a computationally efficient fashion, we developed Fast-BIC, a fast Bayesian Information Criterion statistic for Gaussian mixture models. We additionally propose geodesic precision and geodesic recall as novel evaluation metrics that quantify how well the geodesic distances of a latent manifold are preserved. Empirical results on simulated and real data demonstrate that GF is robust to high-dimensional noise, whereas other methods, such as Isomap, UMAP, and FLANN, quickly deteriorate in such settings. Notably, GF is able to estimate geodesic distances better than other approaches on a real connectome dataset.
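To make the notion of geodesic distance in the abstract concrete, the sketch below estimates it the way Isomap-style methods commonly do: connect each point to its nearest neighbors and take shortest-path lengths through that graph. This is not the geodesic forests method and not Fast-BIC; the function names `knn_graph` and `geodesic_distances`, the half-circle data, and the choice `k=2` are all assumptions made for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights.

    Hypothetical helper for illustration; brute-force O(n^2 log n).
    """
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            adj[i][j] = d
            adj[j][i] = d  # keep the graph undirected
    return adj


def geodesic_distances(adj, source):
    """Dijkstra shortest-path lengths from `source` through the graph."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Assumed toy data: 50 noise-free points on a half circle of radius 1.
n = 50
points = [
    (math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
    for t in range(n)
]

adj = knn_graph(points, k=2)
geo = geodesic_distances(adj, source=0)
euclid = math.dist(points[0], points[-1])

print(f"Euclidean distance between endpoints: {euclid:.3f}")
print(f"Graph-based geodesic estimate:        {geo[-1]:.3f}")
```

On this toy curve the straight-line distance between the two endpoints is 2, while the graph-based estimate follows the arc and comes out close to π. The sketch uses clean, low-dimensional data; the abstract's point is that this kind of neighbor-graph computation degrades when high-dimensional noise is added.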

Improved Subspace K-Means Performance via a Randomized Matrix Decomposition

Vannoy, Senecal, Strnadová-Neeley 2019
