Hangyu Li scite author profile

Background: Various methods for differential expression analysis have been widely used to identify features that best distinguish between different categories of samples. Multiple hypothesis testing may leave out explanatory features, each of which may be composed of individually insignificant variables. Multivariate hypothesis testing holds a non-mainstream position because of the large computational overhead of large-scale matrix operations. Random forest provides a classification strategy for calculating variable importance. However, it may be unsuitable for different distributions of samples. Results: Building on the idea of an ensemble classifier, we develop a feature selection tool for differential expression analysis on expression profiles (ECFS-DEA for short). To account for differences in sample distribution, a graphical user interface is designed to allow the selection of different base classifiers. Inspired by random forest, a common measure that is applicable to any base classifier is proposed for calculating variable importance. After an interactive selection of a feature on sorted individual variables, a projection heatmap is presented using k-means clustering. A ROC curve is also provided; both intuitively demonstrate the effectiveness of the selected feature. Conclusions: Feature selection through ensemble classifiers helps to select important variables and is thus applicable to different sample distributions. Experiments on simulated and realistic data demonstrate the effectiveness of ECFS-DEA for differential expression analysis on expression profiles. The software is available at http://bio-nefu.com/resource/ecfs-dea.

Carbon capture and storage in the coastal region of China between Shanghai and Hainan

Zhang

Lau

Liu

et al. 2022

Energy

Pore-scale flow simulation on the permeability in hydrate-bearing sediments

et al. 2022

Fuel

A simple model for the prediction of mutual solubility in CO2-brine system at geological conditions

Sun

Wang

et al. 2021

Desalination

Near-well upscaling for three-phase flows

Nakashima

Durlofsky

2011

Comput Geosci

Large-scale flow models constructed using standard coarsening procedures may not accurately resolve detailed near-well effects. Such effects are often important to capture, however, as the interaction of the well with the formation can have a dominant impact on process performance. In this work, a near-well upscaling procedure, which provides three-phase wellblock properties, is developed and tested. The overall approach represents an extension of a recently developed oil-gas upscaling procedure and entails the use of local well computations (over a region referred to as the local well model (LWM)) along with a gradient-based optimization procedure to minimize the mismatch between fine and coarse-scale well rates, for oil, gas, and water, over the LWM. The gradients required for the minimization are computed efficiently through solution of adjoint equations. The LWM boundary conditions are determined using an iterative local-global procedure. With this approach, pressures and saturations computed during a global coarse-scale simulation are interpolated onto LWM boundaries and then used as boundary conditions for the fine-scale LWM computations. In addition to extending the overall approach to the three-phase case, this work also introduces new treatments that provide improved accuracy in cases with significant flux from the gas cap into the well block. The near-well multiphase upscaling method is applied to heterogeneous reservoir models, with production from vertical and horizontal wells. Simulation results illustrate that the method is able to accurately capture key near-well effects and to provide predictions for component production rates that are in close agreement with reference fine-scale results. The level of accuracy of the procedure is shown to be significantly higher than that of a standard approach which uses only upscaled single-phase flow parameters.

Learning Low-Resource End-To-End Goal-Oriented Dialog for Fast and Reliable System Deployment

Dai

Li

Tang

et al. 2020

Existing end-to-end dialog systems perform less effectively when data is scarce. To achieve acceptable success in real-life online services with only a handful of training examples, both fast adaptability and reliable performance are highly desirable for dialog systems. In this paper, we propose the Meta-Dialog System (MDS), which combines the advantages of both meta-learning approaches and human-machine collaboration. We evaluate our methods on a new extended-bAbI dataset and a transformed MultiWOZ dataset for low-resource goal-oriented dialog learning. Experimental results show that MDS significantly outperforms non-meta-learning baselines and can achieve per-turn accuracies of more than 90% with only 10 dialogs on the extended-bAbI dataset.

CO2 storage potential in major oil and gas reservoirs in the northern South China Sea

Lau

Wei

et al. 2021

International Journal of Greenhouse Gas Control

Preview, Attend and Review: Schema-Aware Curriculum Learning for Multi-Domain Dialogue State Tracking

Dai

Li

Li

et al. 2021

Existing dialog state tracking (DST) models are trained with dialog data in a random order, neglecting rich structural information in a dataset. In this paper, we propose to use curriculum learning (CL) to better leverage both the curriculum structure and schema structure for task-oriented dialogs. Specifically, we propose a model-agnostic framework called Schema-aware Curriculum Learning for Dialog State Tracking (SaCLog), which consists of a preview module that pre-trains a DST model with schema information, a curriculum module that optimizes the model with CL, and a review module that augments mispredicted data to reinforce the CL training. We show that our proposed approach improves DST performance over both a transformer-based and an RNN-based DST model (TripPy and TRADE) and achieves new state-of-the-art results on WOZ2.0 and MultiWOZ2.1.

ECFS-DEA: an ensemble classifier-based feature selection for differential expression analysis on expression profiles
