Li Huang scite author profile

MDHGI: Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction

Recently, a growing body of biological research and experimental evidence has demonstrated that microRNA (miRNA) affects the development of complex human diseases. Discovering miRNA-disease associations plays an increasingly vital role in devising diagnostic and therapeutic tools for diseases. However, since uncovering associations via experimental methods is expensive and time-consuming, novel and effective computational methods for association prediction are in demand. In this study, we developed a computational model of Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction (MDHGI) to discover new miRNA-disease associations. The model integrates the predicted association probability obtained from matrix decomposition through a sparse learning method, the miRNA functional similarity, the disease semantic similarity, and the Gaussian interaction profile kernel similarity for diseases and miRNAs into a heterogeneous network. Compared with previous computational models based on heterogeneous networks, our model took full advantage of matrix decomposition before constructing the heterogeneous network, thereby improving the prediction accuracy. MDHGI obtained AUCs of 0.8945 and 0.8240 in global and local leave-one-out cross validation, respectively. Moreover, the AUC of 0.8794 ± 0.0021 in 5-fold cross validation confirmed the stability of its predictive performance. In addition, to further evaluate the model's accuracy, we applied MDHGI to four important human cancers in three different kinds of case studies. In the first type, 98% (Esophageal Neoplasms) and 98% (Lymphoma) of the top 50 predicted miRNAs were confirmed by at least one of two databases (dbDEMC and miR2Disease) or at least one experimental report in PubMed. In the second type, we removed all known associations between miRNAs and Lung Neoplasms before applying MDHGI to Lung Neoplasms. As a result, 100% of the top 50 miRNAs predicted for Lung Neoplasms were indexed by at least one of three databases (dbDEMC, miR2Disease, and HMDD V2.0) or at least one experimental report in PubMed. Furthermore, we tested our prediction method on the HMDD V1.0 database to demonstrate the applicability of MDHGI to different datasets. The results showed that 50 of the top 50 miRNAs predicted for Breast Neoplasms were validated by at least one of three databases (HMDD V2.0, dbDEMC, and miR2Disease) or at least one experimental report.
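The abstract names the Gaussian interaction profile kernel similarity without giving its formula. The following is a minimal sketch of one commonly used definition, in which the similarity of two miRNAs is exp(-γ‖a_i - a_j‖²) over their rows of the association matrix and γ is normalized by the mean squared row norm. The function name `gip_kernel`, the `gamma_prime` parameter, and the toy matrix are assumptions for illustration and are not taken from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity between rows of a binary matrix."""
    # Normalize the kernel bandwidth by the mean squared norm of the profiles.
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / len(profiles)
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    n = len(profiles)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * dist_sq)
    return sim


# Hypothetical toy data: 3 miRNAs (rows) x 4 diseases (columns), 1 = known association.
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
mirna_similarity = gip_kernel(associations)
```

Applying the same function to the transposed matrix would give the corresponding disease-disease similarity under the same assumptions.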

Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records

Huang, Shea, Qian, et al. 2019. Journal of Biomedical Informatics

Electronic medical records (EMRs) support the development of machine learning algorithms for predicting disease incidence, patient response to treatment, and other healthcare events. So far, however, most algorithms have been centralized, taking little account of the decentralized, non-independent and identically distributed (non-IID), and privacy-sensitive characteristics of EMRs that can complicate data collection, sharing, and learning. To address this challenge, we introduced a community-based federated machine learning (CBFL) algorithm and evaluated it on non-IID ICU EMRs. Our algorithm clustered the distributed data into clinically meaningful communities that captured similar diagnoses and geographical locations, and learned one model for each community. Throughout the learning process, the data were kept local at the hospitals, while locally computed results were aggregated on a server. Evaluation results show that CBFL outperformed the baseline FL algorithm in terms of Area Under the Receiver Operating Characteristic Curve (ROC AUC), Area Under the Precision-Recall Curve (PR AUC), and communication cost between hospitals and the server. Furthermore, the performance differences among communities could be explained by how dissimilar one community was to the others.

EGBMMDA: Extreme Gradient Boosting Machine for MiRNA-Disease Association prediction

et al. 2018

Associations between microRNAs (miRNAs) and human diseases have been identified by a growing number of studies, and discovering new ones is an ongoing process in medical laboratories. To improve experiment productivity, researchers computationally infer potential associations from biological data, selecting the most promising candidates for experimental verification. Predicting potential miRNA–disease associations has become a research area of growing importance. This paper presents a model of Extreme Gradient Boosting Machine for MiRNA-Disease Association (EGBMMDA) prediction that integrates the miRNA functional similarity, the disease semantic similarity, and known miRNA–disease associations. The statistical measures, graph theoretical measures, and matrix factorization results for each miRNA-disease pair were calculated and used to form an informative feature vector. The vectors for known associated pairs obtained from the HMDD v2.0 database were used to train a regression tree under the gradient boosting framework. EGBMMDA was the first decision tree learning-based model used for predicting miRNA–disease associations. AUCs of 0.9123 and 0.8221 in global and local leave-one-out cross-validation, respectively, demonstrated the model’s reliable performance. Moreover, the AUC of 0.9048 ± 0.0012 in fivefold cross-validation confirmed its stability. We carried out three different types of case studies predicting potential miRNAs related to Colon Neoplasms, Lymphoma, Prostate Neoplasms, Breast Neoplasms, and Esophageal Neoplasms. The results indicated that 98%, 90%, 98%, 100%, and 98% of the top 50 predictions for the five diseases, respectively, were confirmed by experiments. Therefore, EGBMMDA appears to be a useful computational resource for miRNA–disease association prediction.
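The AUC figures reported in these abstracts summarize how well a model ranks held-out known associations above unconfirmed candidates. As a minimal sketch of the metric only (not the authors' evaluation code), AUC can be computed as the fraction of positive-negative pairs in which the positive receives the higher score. The function name `roc_auc` and the labels and scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = held-out known association, 0 = unconfirmed candidate.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of examples; it is used here for clarity rather than speed.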

LRSSLMDA: Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction

2017

Predicting novel microRNA (miRNA)-disease associations is clinically significant because of miRNAs’ potential roles as diagnostic biomarkers and therapeutic targets for various human diseases. Previous studies have demonstrated the viability of utilizing different types of biological data to computationally infer new disease-related miRNAs. Yet researchers face the challenge of how to effectively integrate diverse datasets and make reliable predictions. In this study, we presented a computational model named Laplacian Regularized Sparse Subspace Learning for MiRNA-Disease Association prediction (LRSSLMDA), which projected the statistical feature profile and the graph theoretical feature profile of miRNAs and diseases to a common subspace. It used Laplacian regularization to preserve the local structures of the training data and an L1-norm constraint to select important miRNA/disease features for prediction. The strength of dimensionality reduction enabled the model to be easily extended to much higher dimensional datasets than those exploited in this study. Experimental results showed that LRSSLMDA outperformed ten previous models: the AUC of 0.9178 in global leave-one-out cross validation (LOOCV) and the AUC of 0.8418 in local LOOCV indicated the model’s superior prediction accuracy, and the average AUC of 0.9181 ± 0.0004 in 5-fold cross validation confirmed its accuracy and stability. In addition, three types of case studies further demonstrated its predictive power. Potential miRNAs related to Colon Neoplasms, Lymphoma, Kidney Neoplasms, Esophageal Neoplasms, and Breast Neoplasms were predicted by LRSSLMDA. Respectively, 98%, 88%, 96%, 98%, and 98% of the top 50 predictions were validated by experimental evidence. Therefore, we conclude that LRSSLMDA would be a valuable computational tool for miRNA-disease association prediction.

LoAdaBoost: Loss-based AdaBoost federated machine learning with reduced computational complexity on IID and non-IID intensive care data

et al. 2020

Intensive care data are valuable for the improvement of health care, policy making, and many other purposes. Vast amounts of such data are stored in different locations, on many different devices, and in different data silos. Sharing data among different sources is a big challenge for regulatory, operational, and security reasons. One potential solution is federated machine learning, a method that sends machine learning algorithms simultaneously to all data sources, trains models in each source, and aggregates the learned models. This strategy allows the use of valuable data without moving them. One challenge in applying federated machine learning is the possibly different distributions of data from diverse sources. To tackle this problem, we proposed an adaptive boosting method named LoAdaBoost that increases the efficiency of federated machine learning. Using intensive care unit data from hospitals, we investigated the performance of learning in IID and non-IID data distribution scenarios, and showed that the proposed LoAdaBoost method achieved higher predictive accuracy with lower computational complexity than the baseline method.
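The aggregation step described above, in which models trained at each source are combined on a server, is often implemented as a sample-weighted average of model parameters (federated averaging). The sketch below illustrates that generic baseline only; it is not the LoAdaBoost algorithm. The function name `federated_average` and the hospital weights and sizes are hypothetical.

```python
def federated_average(local_weights, sample_counts):
    """Average per-site parameter vectors, weighting each site by its sample count."""
    if len(local_weights) != len(sample_counts):
        raise ValueError("one sample count is required per site")
    total = sum(sample_counts)
    n_params = len(local_weights[0])
    return [
        sum(w[k] * n for w, n in zip(local_weights, sample_counts)) / total
        for k in range(n_params)
    ]


# Hypothetical example: three hospitals, each holding a two-parameter local model.
hospital_weights = [[0.2, -1.0], [0.4, 0.0], [1.0, 2.0]]
hospital_sizes = [100, 300, 600]
global_weights = federated_average(hospital_weights, hospital_sizes)
```

Only the parameter vectors and sample counts leave each site in this scheme; the patient records themselves stay local.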

New biotite and muscovite isotopic reference materials, USGS57 and USGS58, for δ2H measurements–A replacement for NBS 30

Coplen, Gehre, et al. 2017. Chemical Geology

The advent of continuous-flow isotope-ratio mass spectrometry (CF-IRMS) coupled with a high temperature conversion (HTC) system enabled faster, more cost-effective, and more precise δ2H analysis of hydrogen-bearing solids. Accurate hydrogen isotopic analysis by on-line or off-line techniques requires appropriate isotopic reference materials (RMs). A strategy of 2-point calibrations spanning the δ2H range of the unknowns using two RMs is recommended. Unfortunately, the supply of the previously widely used isotopic reference material, NBS 30 biotite, is exhausted. In addition, recent measurements have shown that the determination of δ2H values of NBS 30 biotite on the VSMOW-SLAP isotope-delta scale by on-line HTC systems with CF-IRMS may be unreliable because hydrogen in this biotite may not be converted quantitatively to molecular hydrogen. The δ2H VSMOW-SLAP values of NBS 30 biotite analyzed by
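The 2-point calibration strategy mentioned in this abstract can be illustrated as a linear map fitted through two reference materials and then applied to unknowns. This is a minimal sketch under that assumption; the function name `two_point_calibration` and all δ2H values below are hypothetical placeholders in per mil, not the certified values of any reference material.

```python
def two_point_calibration(measured_rm1, true_rm1, measured_rm2, true_rm2):
    """Return a function that maps raw delta values onto the reference scale."""
    if measured_rm1 == measured_rm2:
        raise ValueError("the two reference materials must have distinct measured values")
    slope = (true_rm2 - true_rm1) / (measured_rm2 - measured_rm1)
    intercept = true_rm1 - slope * measured_rm1
    return lambda measured: slope * measured + intercept


# Hypothetical raw and accepted values (per mil) for two reference materials.
normalize = two_point_calibration(
    measured_rm1=-100.0, true_rm1=-90.0,
    measured_rm2=-20.0, true_rm2=-30.0,
)
sample_delta = normalize(-40.0)  # an unknown that falls between the two references
```

Choosing references that bracket the unknowns keeps the correction an interpolation rather than an extrapolation.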

A 2.4 GHz ULP OOK Single-Chip Transceiver for Healthcare Applications

Vidojkovic, Huang, Harpe, et al. 2011. IEEE Trans. Biomed. Circuits Syst.

This paper describes an ultra-low power (ULP) single-chip transceiver for wireless body area network (WBAN) applications. It supports on-off keying (OOK) modulation, and it operates in the 2.36-2.4 GHz medical BAN and 2.4-2.485 GHz ISM bands. It is implemented in 90 nm CMOS technology. The direct modulated transmitter transmits an OOK signal with 0 dBm peak power, and it consumes 2.59 mW with 50% OOK. The transmitter front-end supports up to 10 Mbps. The transmitter digital baseband enables digital pulse-shaping to improve spectrum efficiency. The super-regenerative receiver front-end supports up to 5 Mbps with -75 dBm sensitivity. Including the digital part, the receiver consumes 715 μW at a 1 Mbps data rate, oversampled at 3 MHz. At the system level, the transceiver achieves PER = 10^-2 at 25 meters line of sight with a 62.5 kbps data rate and a 288-bit packet size. The transceiver is integrated in an electrocardiogram (ECG) necklace to monitor the heart's electrical activity.
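Two derived figures help put the reported numbers in context: the receiver energy per bit (power divided by data rate) and the bit error rate implied by the packet error rate. The sketch below uses the values stated in the abstract; the conversion from PER to BER assumes independent bit errors, which is an assumption of this example and not a claim from the paper. The function names are hypothetical.

```python
def energy_per_bit(power_watts, bit_rate_bps):
    """Energy spent per bit, in joules."""
    return power_watts / bit_rate_bps


def ber_from_per(per, packet_bits):
    """Bit error rate implied by a packet error rate, assuming independent bit errors."""
    # PER = 1 - (1 - BER) ** packet_bits, solved for BER.
    return 1.0 - (1.0 - per) ** (1.0 / packet_bits)


# Receiver: 715 microwatts at 1 Mbps, as stated in the abstract.
rx_energy = energy_per_bit(715e-6, 1e6)

# System level: PER of 10^-2 with 288-bit packets, as stated in the abstract.
implied_ber = ber_from_per(1e-2, 288)
```

Under these assumptions the receiver spends roughly 0.7 nJ per bit, and the implied BER is on the order of 10^-5.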

Large-scale stable isotope characterization of a Late Cretaceous dinosaur-dominated ecosystem

Cullen, Longstaffe, Wortmann, et al. 2020

In the Cretaceous of North America, environmental sensitivity and habitat specialization have been hypothesized to explain the surprisingly restricted geographic ranges of many large-bodied dinosaurs. Understanding the drivers behind this is key to determining broader trends of dinosaur species and community response to climate change under greenhouse conditions. However, previous studies of this question have commonly examined only small components of the paleo-ecosystem or operated without comparison to similar modern systems from which to constrain interpretations. Here we perform a high-resolution multi-taxic δ13C and δ18O study of a Cretaceous coastal floodplain ecosystem, focusing on species interactions and paleotemperature estimation, and compare with similar data from extant systems. Bioapatite δ13C preserves predator-prey offsets between tyrannosaurs and ornithischians (large herbivorous dinosaurs), and between aquatic reptiles and fish. Large ornithischians had broadly overlapping stable isotope ranges, contrary to hypothesized niche partitioning driven by specialization on coastal or inland subhabitat use. Comparisons to a modern analogue coastal floodplain show similar patterns of ecological guild structure and aquatic-terrestrial resource interchange. Multi-taxic oxygen isotope temperature estimations yield results for the Campanian of Alberta (Canada) consistent with the few other paleotemperature proxies available, and are validated when applied to extant species from a modern coastal floodplain, suggesting that this approach is a simple and effective avenue for paleoenvironmental reconstruction. Together, these new data suggest that dinosaur niche partitioning was more complex than previously hypothesized, and provide a framework for future research on dinosaur-dominated Mesozoic floodplain communities.
