A Molecular Features Set (MFS) is the output of any of a vast diversity of bioinformatics pipelines. The lack of a “gold standard” for most experimental data modalities makes it difficult to provide a valid estimate of a particular MFS's quality. Yet this goal can be partially achieved by analyzing inter-sample Distance Matrices (DM) and their power to distinguish between phenotypes. The quality of a DM can be assessed by summarizing how well it separates within-phenotype from between-phenotype distances. This estimate of DM quality can be construed as a measure of the MFS's quality. Here we propose Hobotnica, an approach that estimates the quality of MFSs by their ability to stratify data and assigns them significance scores, which allow various signatures to be collated and their quality compared for contrasting groups.
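The idea of scoring a distance matrix by how well it separates within-phenotype (inner) from between-phenotype (outer) distances can be illustrated with a minimal sketch. The code below is not the published Hobotnica formula: it assumes a simple stand-in statistic (the fraction of within/between distance pairs in which the between-phenotype distance is larger, equivalent to a normalized Mann-Whitney U) and a label-permutation test for the significance score. The names `separation_score` and `permutation_p_value`, and the toy data, are hypothetical.

```python
import random
from itertools import combinations


def separation_score(dist, labels):
    """Fraction of (within, between) distance pairs where between > within.

    dist   -- symmetric matrix (list of lists) of pairwise sample distances
    labels -- phenotype label of each sample, in the same order as dist
    Ties count as 0.5. A value of 1.0 means every between-phenotype distance
    exceeds every within-phenotype distance.
    NOTE: an illustrative stand-in, not the published Hobotnica statistic.
    """
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(dist[i][j])
    if not within or not between:
        raise ValueError("need both within- and between-phenotype sample pairs")
    wins = sum(
        1.0 if b > w else 0.5 if b == w else 0.0
        for w in within
        for b in between
    )
    return wins / (len(within) * len(between))


def permutation_p_value(dist, labels, n_perm=1000, seed=0):
    """Share of label shufflings that score at least as high as the real labels."""
    rng = random.Random(seed)
    observed = separation_score(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps group sizes, breaks the label/sample link
        if separation_score(dist, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): four samples on a line, two per phenotype.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]

good = separation_score(dist, ["A", "A", "B", "B"])   # labels match the clusters
mixed = separation_score(dist, ["A", "B", "A", "B"])  # labels cut across clusters
p = permutation_p_value(dist, ["A", "A", "B", "B"])
print(good, mixed, p)
```

In this toy case the labeling that follows the two clusters scores higher than the labeling that cuts across them, which is the behavior a signature-quality measure of this kind is meant to capture. With only four samples the permutation p-value cannot be small, so real use would need many more samples per phenotype.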
Differential methylation (DM) analysis is widely used in fundamental and translational studies. Currently, microarray- and NGS-based approaches to methylation analysis are the most widely used, with multiple statistical models designed to extract differential methylation signatures. Benchmarking DM models is challenging because gold standard data are absent. In this study, we analyze a large number of publicly available NGS and microarray datasets with diverse, widely used statistical models and apply the recently proposed and validated rank-statistic-based approach Hobotnica to evaluate the quality of their results. Overall, microarray-based methods demonstrate more robust and convergent results, while the results of NGS-based models are highly dissimilar. Tests on simulated NGS data tend to overestimate the quality of DM methods and should therefore be used with caution. Evaluation of the top 10 and top 100 DMCs, in addition to the full (unsubsetted) signature, also shows more stable results for microarray data. In summary, given the observed heterogeneity in NGS methylation data, evaluating newly generated methylation signatures is a crucial step in DM analysis. The Hobotnica metric is consistent with previously developed quality metrics and provides a robust, sensitive, and informative estimate of methods’ performance and DM signatures’ quality in the absence of gold standard data, solving a long-standing problem in DM analysis.