Diem-Trang Tran scite author profile

Truong

2013

J Comput Aided Mol Des

Drug binding and unbinding are transient processes that are rarely observed experimentally and difficult to analyze with computational techniques. In this paper, we employed a cost-effective method called "pathway docking", in which molecular docking was used to screen the ligand-receptor binding free energy surface and reveal possible paths by which a ligand approaches the protein binding pocket. As a case study, the method was applied to oseltamivir, the key drug against influenza A virus. The equilibrium pathways identified by this method are similar to those identified in prior studies using far more expensive computational approaches.

Cardioinformatics: the nexus of bioinformatics and precision cardiology

Khomtchouk

Vand

et al. 2019

Cardiovascular disease (CVD) is the leading cause of death worldwide, causing over 17 million deaths per year, which outpaces global cancer mortality rates. Despite these sobering statistics, most bioinformatics and computational biology research and funding to date has been concentrated predominantly on cancer research, with a relatively modest footprint in CVD. In this paper, we review the existing literary landscape and critically assess the unmet need to further develop an emerging field at the multidisciplinary interface of bioinformatics and precision cardiovascular medicine, which we refer to as ‘cardioinformatics’.

A graph-based algorithm for RNA-seq data normalization

et al. 2020

The use of RNA-sequencing has garnered much attention in recent years for characterizing and understanding various biological systems. However, it remains a major challenge to gain insights from a large number of RNA-seq experiments collectively, due to the normalization problem. Normalization has been challenging due to an inherent circularity, requiring that RNA-seq data be normalized before any pattern of differential (or non-differential) expression can be ascertained; meanwhile, the prior knowledge of non-differential transcripts is crucial to the normalization process. Some methods have successfully overcome this problem by the assumption that most transcripts are not differentially expressed. However, when RNA-seq profiles become more abundant and heterogeneous, this assumption fails to hold, leading to erroneous normalization. We present a normalization procedure that does not rely on this assumption, nor prior knowledge about the reference transcripts. This algorithm is based on a graph constructed from intrinsic correlations among RNA-seq transcripts and seeks to identify a set of densely connected vertices as references. Application of this algorithm on our synthesized validation data showed that it could recover the reference transcripts with high precision, thus resulting in high-quality normalization. On a realistic data set from the ENCODE project, this algorithm gave good results and could finish in a reasonable time. These preliminary results imply that we may be able to break the long persisting circularity problem in RNA-seq normalization.
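The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the graph is built or how the dense set is found, so the sketch assumes a Pearson correlation threshold for the edges and a standard greedy peeling heuristic for the densest subgraph. All function names, the threshold, and the toy counts are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_graph(counts, threshold=0.95):
    """Link two transcripts when their raw profiles correlate at or above threshold (assumed rule)."""
    graph = {name: set() for name in counts}
    for a, b in combinations(counts, 2):
        if pearson(counts[a], counts[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def densest_subgraph(graph):
    """Greedy peeling: repeatedly drop a lowest-degree vertex, keep the densest snapshot seen."""
    adj = {v: set(nbrs) for v, nbrs in graph.items()}
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best = set(adj)
    best_density = edges / len(adj) if adj else 0.0
    while len(adj) > 1:
        v = min(adj, key=lambda u: len(adj[u]))
        edges -= len(adj[v])
        for u in adj.pop(v):
            adj[u].discard(v)
        density = edges / len(adj)  # edges per remaining vertex
        if density > best_density:
            best, best_density = set(adj), density
    return best


def normalize(counts, references):
    """Scale each sample so the summed reference expression is equal across samples."""
    n_samples = len(next(iter(counts.values())))
    ref_totals = [sum(counts[r][j] for r in references) for j in range(n_samples)]
    target = sum(ref_totals) / n_samples
    factors = [target / t for t in ref_totals]
    return {name: [v * f for v, f in zip(row, factors)] for name, row in counts.items()}


# Toy data: four samples with unknown depth factors 1, 2, 0.5, 3.
# r1..r3 are truly constant transcripts; d1 and d2 are differentially expressed.
counts = {
    "r1": [10.0, 20.0, 5.0, 30.0],
    "r2": [20.0, 40.0, 10.0, 60.0],
    "r3": [30.0, 60.0, 15.0, 90.0],
    "d1": [8.0, 2.0, 4.5, 3.0],
    "d2": [1.0, 12.0, 15.0, 6.0],
}
references = densest_subgraph(correlation_graph(counts))
normalized = normalize(counts, references)
```

In this toy case the constant transcripts all track the hidden depth factor, so they form a tightly connected cluster in the graph, and scaling by their total removes the depth effect without assuming that most transcripts are unchanged.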

HeartBioPortal: an internet-of-omics for human cardiovascular disease data

Khomtchouk

Vand

Koehler

et al. 2018

Preprint

Cardiovascular disease (CVD) is the leading cause of death worldwide, causing over 17M deaths per year, which outpaces global cancer mortality rates. Despite these sobering statistics, the state-of-the-art in computational infrastructure to study datasets associated with CVD has lagged far behind public resources widely available in the oncology field, where improved data science and visualization methods have led to the development of large-scale cancer genomics resources like MSKCC's cBioPortal or NCI's Genomic Data Commons (GDC) Portal. Developing a similar user-friendly computational platform could significantly lower the barriers between complex CVD data and researchers who want rapid, intuitive, and high-quality visual access to molecular profiles and clinical attributes from existing CVD projects. Here we present HeartBioPortal: a publicly available web application that provides intuitive visualization, analysis, and downloads of large-scale CVD data currently focused on gene expression, genetic association, and ancestry information. By democratizing access to anonymized CVD data, HeartBioPortal's aim is to integrate relevant omics and clinical information across the biological dataverse to support CVD clinicians and researchers.

cdev: a ground-truth based measure to evaluate RNA-seq normalization performance

Might

2021

Normalization of RNA-seq data has been an active area of research since the problem was first recognized a decade ago. Despite the active development of new normalizers, their performance measures have been given little attention. To evaluate normalizers, researchers have been relying on ad hoc measures, most of which are either qualitative, potentially biased, or easily confounded by parametric choices of downstream analysis. We propose a metric called condition-number based deviation, or cdev, to quantify normalization success. cdev measures how much an expression matrix differs from another. If a ground truth normalization is given, cdev can then be used to evaluate the performance of normalizers. To establish experimental ground truth, we compiled an extensive set of public RNA-seq assays with external spike-ins. This data collection, together with cdev, provides a valuable toolset for benchmarking new and existing normalization methods.
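The abstract does not give the formula for cdev, so the following is only a rough illustration of what a condition-number based comparison could look like, not the published definition. It assumes the simplest case, in which a normalized matrix differs from the ground truth only by one scale factor per sample. The transformation between the two is then a diagonal matrix, whose condition number is the ratio of its largest to its smallest absolute diagonal entry. The function names and toy matrices are hypothetical.

```python
def scale_factors(observed, truth):
    """Least-squares factor s_j per sample so that truth[:, j] is close to s_j * observed[:, j].

    Both matrices are lists of rows (transcripts) with one column per sample.
    Raises ZeroDivisionError if a sample column of `observed` is all zeros.
    """
    n_samples = len(observed[0])
    factors = []
    for j in range(n_samples):
        num = sum(o[j] * t[j] for o, t in zip(observed, truth))
        den = sum(o[j] ** 2 for o in observed)
        factors.append(num / den)
    return factors


def scaling_deviation(observed, truth):
    """Condition number of diag(s): max|s_j| / min|s_j|.

    A value of 1.0 means the two matrices differ only by a global rescale;
    larger values mean the samples are scaled inconsistently.
    """
    s = [abs(f) for f in scale_factors(observed, truth)]
    return max(s) / min(s)


# Ground truth: three transcripts (rows) measured in three samples (columns).
truth = [
    [10.0, 10.0, 10.0],
    [20.0, 20.0, 20.0],
    [5.0, 5.0, 5.0],
]
# A good normalization may differ from the truth by a single global factor.
good = [[2.0 * v for v in row] for row in truth]
# A poor normalization leaves per-sample factors of 1, 2 and 4 in place.
poor = [[v * f for v, f in zip(row, (1.0, 2.0, 4.0))] for row in truth]

good_score = scaling_deviation(good, truth)
poor_score = scaling_deviation(poor, truth)
```

Under this reading a global rescale is not penalized, which matches the intuition that normalized expression values are only meaningful up to a common constant.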

A graph-based algorithm for RNA-seq data normalization

Bhaskara

Might

et al. 2018

Preprint

The use of RNA-sequencing has garnered much attention in recent years for characterizing and understanding various biological systems. However, it remains a major challenge to gain insights from a large number of RNA-seq experiments collectively, due to the normalization problem. Current normalization methods are based on assumptions that fail to hold when RNA-seq profiles become more abundant and heterogeneous. We present a normalization procedure that does not rely on these assumptions, or on prior knowledge about the reference transcripts in those conditions. This algorithm is based on a graph constructed from intrinsic correlations among RNA-seq transcripts and seeks to identify a set of densely connected vertices as references. Application of this algorithm on our benchmark data showed that it can recover the reference transcripts with high precision, thus resulting in high-quality normalization. As demonstrated on a real data set, this algorithm gives good results and is efficient enough to be applicable to real-life data.

anexVis: visual analytics framework for analysis of RNA expression

Zhang

Stutsman

et al. 2018

Circ: Genomic and Precision Medicine

HeartBioPortal

Khomtchouk

Vand

Koehler

et al. 2019

Cardiovascular disease (CVD) is the leading cause of death worldwide, responsible for over 17 million deaths annually, a rate which outpaces even that related to cancer. Despite these sobering statistics, the state-of-the-art in computational infrastructure for the study of contemporary datasets related to CVD lags substantially behind that widely available in oncology, where improved data science and visualization methods have delivered publicly available comprehensive cancer genomics resources like Memorial Sloan Kettering Cancer Center's cBioPortal 1,2 and the National Cancer Institute's Genomic Data Commons (GDC) Portal 3,4. In our view, such portals do an outstanding job of transforming data from The Cancer Genome Atlas (TCGA) into logical data visualizations that provide additional biological insight. Developing a similar user-friendly computational platform for CVD could significantly lower the barriers of discovery by providing researchers with rapid, intuitive,