Halid Ziya Yerebakan scite author profile

Halid Ziya Yerebakan

4 Publications

57 Citation Statements Received

73 Citation Statements Given

Affiliations

Siemens Healthcare (United States), Indiana University – Purdue University Indianapolis

Publications

A non-parametric Bayesian model for joint cell clustering and cluster matching: identification of anomalous sample phenotypes with random effects

et al. 2014

Background: Flow cytometry (FC)-based computer-aided diagnostics is an emerging technique utilizing modern multiparametric cytometry systems. The major difficulty in using machine-learning approaches for classification of FC data arises from limited access to a wide variety of anomalous samples for training. In consequence, any learning with an abundance of normal cases and a limited set of specific anomalous cases is biased towards the types of anomalies represented in the training set. Such models do not accurately identify anomalies, whether previously known or unknown, that may exist in future samples tested. Although one-class classifiers trained using only normal cases would avoid such a bias, robust sample characterization is critical for a generalizable model. Owing to sample heterogeneity and instrumental variability, arbitrary characterization of samples usually introduces feature noise that may lead to poor predictive performance. Herein, we present a non-parametric Bayesian algorithm called ASPIRE (anomalous sample phenotype identification with random effects) that identifies phenotypic differences across a batch of samples in the presence of random effects. Our approach involves simultaneous clustering of cellular measurements in individual samples and matching of discovered clusters across all samples in order to recover global clusters using probabilistic sampling techniques in a systematic way.

Results: We demonstrate the performance of the proposed method in identifying anomalous samples in two different FC data sets, one of which represents a set of samples including acute myeloid leukemia (AML) cases, and the other a generic 5-parameter peripheral-blood immunophenotyping. Results are evaluated in terms of the area under the receiver operating characteristics curve (AUC). ASPIRE achieved AUCs of 0.99 and 1.0 on the AML and generic blood immunophenotyping data sets, respectively.

Conclusions: These results demonstrate that anomalous samples can be identified by ASPIRE with almost perfect accuracy without a priori access to samples of anomalous subtypes in the training set. The ASPIRE approach is unique in its ability to form generalizations regarding normal and anomalous states given only very weak assumptions regarding sample characteristics and origin. Thus, ASPIRE could become highly instrumental in providing unique insights about observed biological phenomena in the absence of full information about the investigated samples.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-15-314) contains supplementary material, which is available to authorized users.
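For readers unfamiliar with the evaluation metric, the AUC reported in the abstract can be read as the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one. The following is a minimal sketch of that pairwise definition in Python; the function name `auc` and the scores and labels are hypothetical illustrations, not data or code from the paper.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (anomalous, normal) pairs ranked correctly.

    labels use 1 for anomalous and 0 for normal; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores (assumed for illustration only).
scores = [0.9, 0.8, 0.35, 0.4, 0.1]
labels = [1, 1, 1, 0, 0]
print(auc(scores, labels))
```

An AUC of 1.0, as reported for the generic immunophenotyping data set, corresponds to every anomalous sample scoring above every normal sample.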

Batch discovery of recurring rare classes toward identifying anomalous samples

Dündar, Yerebakan, Rajwa 2014

We present a clustering algorithm for discovering rare yet significant recurring classes across a batch of samples in the presence of random effects. We model the data of each sample by an infinite mixture of Dirichlet-process Gaussian-mixture models (DPMs), with each DPM representing the noisy realization of its corresponding class distribution in a given sample. We introduce dependencies across multiple samples by placing a global Dirichlet process prior over individual DPMs. This hierarchical prior introduces a sharing mechanism across samples and allows for identifying local realizations of classes across samples. We use a collapsed Gibbs sampler for inference to recover local DPMs and identify their class associations. We demonstrate the utility of the proposed algorithm by processing a flow cytometry data set containing two extremely rare cell populations, and report results that significantly outperform competing techniques. The source code of the proposed algorithm is available on the web via the link:
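The notion of a class having a "noisy realization" in each sample can be illustrated with a toy simulation. The sketch below is a deliberate simplification and not the authors' model: it assumes a fixed, known number of classes, one-dimensional Gaussian data, and hypothetical names (`simulate_batch`, `effect_sd`, `noise_sd`), whereas the abstract describes Dirichlet process priors over Gaussian mixtures.

```python
import random


def simulate_batch(class_means, num_samples, cells_per_class,
                   effect_sd, noise_sd, rng):
    """Toy random-effects generator (assumed setup, not the paper's model).

    Each sample sees every global class, but with its mean shifted by a
    sample-specific random effect; cells scatter around the shifted mean.
    """
    batch = []
    for _sample in range(num_samples):
        cells = []
        for class_id, global_mean in enumerate(class_means):
            # Local realization of the class in this particular sample.
            local_mean = rng.gauss(global_mean, effect_sd)
            for _cell in range(cells_per_class):
                cells.append((class_id, rng.gauss(local_mean, noise_sd)))
        batch.append(cells)
    return batch


rng = random.Random(1)
batch = simulate_batch([0.0, 10.0], num_samples=3, cells_per_class=5,
                       effect_sd=0.5, noise_sd=0.1, rng=rng)
for cells in batch:
    print([round(x, 2) for _, x in cells])
```

Clustering each sample in isolation would find clusters at slightly different locations per sample; matching them across samples is the problem the hierarchical prior described above is meant to address.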

Partially collapsed parallel Gibbs sampler for Dirichlet process mixture models

Yerebakan, Dündar 2017, Pattern Recognition Letters

The Dirichlet Process (DP) is commonly used as a non-parametric prior on mixture models. It has an adaptive model-selection capability, which is useful in clustering applications. Although exact inference is not tractable for this prior, Markov Chain Monte Carlo (MCMC) samplers have been used to approximate the target posterior distribution. These samplers often do not scale well. Thus, recent studies have focused on improving run-time efficiency through parallelization. In this paper, we introduce a new sampling method for the DP that combines the Chinese Restaurant Process (CRP) with the stick-breaking construction, allowing for parallelization through conditional independence at the data point level. The stick-breaking part uses an uncollapsed sampler, providing a high level of parallelization, while the CRP part uses a collapsed sampler, allowing more accurate clustering. We show that this partially collapsed Gibbs sampler has significant advantages over the collapsed-only version in terms of scalability. We also provide results on real-world data sets that compare the proposed inference algorithm favorably against recently introduced parallel Dirichlet Process samplers in terms of F1 scores while maintaining comparable run-time performance.
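The two DP representations named in the abstract can be sketched in a few lines of Python. This is only an illustration of the standard priors, not the paper's partially collapsed sampler; the function names, the truncation level `num_sticks`, and the concentration value are assumptions made for the example.

```python
import random


def stick_breaking(alpha, num_sticks, rng):
    """Truncated stick-breaking weights with beta_k ~ Beta(1, alpha)."""
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        beta = rng.betavariate(1.0, alpha)
        weights.append(remaining * beta)   # piece broken off the stick
        remaining *= 1.0 - beta            # what is left to break
    return weights, remaining              # remaining mass is unassigned


def crp(alpha, num_points, rng):
    """Seat points one by one: table k with prob. n_k / (i + alpha),
    a new table with prob. alpha / (i + alpha)."""
    counts = []
    assignments = []
    for i in range(num_points):
        r = rng.random() * (i + alpha)
        acc = 0.0
        for k, n in enumerate(counts):
            acc += n
            if r < acc:
                break
        else:
            k = len(counts)                # open a new table
            counts.append(0)
        counts[k] += 1
        assignments.append(k)
    return assignments, counts


weights, leftover = stick_breaking(alpha=1.0, num_sticks=20, rng=random.Random(0))
assignments, counts = crp(alpha=1.0, num_points=100, rng=random.Random(0))
print(len(counts), "tables for 100 points")
```

With explicit weights, as in the stick-breaking form, each point's cluster can be drawn independently of the others, which is the conditional independence the abstract relies on for parallelization. The CRP form integrates the weights out, so each seating decision depends on all previous ones.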

False positive reduction of vasculature for pulmonary nodule detection

Hansen, Zhao, Yerebakan, et al. 2020
