Han-Shen Huang scite author profile

Han-Shen Huang

5 Publications

113 Citation Statements Received

85 Citation Statements Given

Affiliations

Institute of Information Science, Academia Sinica, National Taiwan University

Publications

Integrating high dimensional bi-directional parsing models for gene mention tagging

Hsu, Chang, Kuo, et al. 2008

Motivation: Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail the gene mention tagger that we entered in the BioCreative 2 challenge and analyze what contributes to its good performance. Our tagger is based on the conditional random fields (CRF) model, the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it achieved the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly by applying open source packages, making it easy to duplicate our results.

Results: We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward parsing, and integrated the two models based on the output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of an additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models will produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting for our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but constant advantage over forward parsing models. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on the output scores. Experimental results show that this integrated model can achieve an even higher F-score based solely on the training corpus for gene mention tagging.

Availability: Data sets, programs and an on-line service of our gene mention tagger can be accessed at http://aiia.iis.sinica.edu.tw/biocreative2.htm

Contact: chunnan@iis.sinica.edu.tw
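The idea of pairing a forward parsing model with a backward one can be illustrated with a minimal sketch. It assumes that a backward model is simply a tagger applied to the reversed token sequence, and that integration means keeping the labeling with the higher output score. The function names (`run_backward`, `integrate`) and the two toy taggers are hypothetical stand-ins, not the authors' CRF models, and the dictionary filtering step mentioned in the abstract is not shown.

```python
from typing import Callable, List, Tuple

# A tagger maps a token list to (labels, confidence score).
Tagger = Callable[[List[str]], Tuple[List[str], float]]


def run_backward(tagger: Tagger, tokens: List[str]) -> Tuple[List[str], float]:
    """Tag the reversed sentence, then flip the labels back to the original order."""
    labels, score = tagger(tokens[::-1])
    return labels[::-1], score


def integrate(tokens: List[str], forward: Tagger, backward: Tagger) -> List[str]:
    """Keep the labeling whose model reports the higher output score."""
    candidates = [forward(tokens), run_backward(backward, tokens)]
    best_labels, _ = max(candidates, key=lambda c: c[1])
    return best_labels


# Toy stand-ins for trained models (hypothetical rules and scores).
def toy_forward(tokens: List[str]) -> Tuple[List[str], float]:
    # Tags only all-uppercase tokens as gene mentions.
    return ["B" if t.isupper() else "O" for t in tokens], 0.6


def toy_backward(tokens: List[str]) -> Tuple[List[str], float]:
    # Also tags tokens containing a digit, and reports a higher score.
    labels = ["B" if t.isupper() or any(c.isdigit() for c in t) else "O"
              for t in tokens]
    return labels, 0.8


tokens = ["mutations", "in", "p53", "and", "BRCA1"]
result = integrate(tokens, toy_forward, toy_backward)
print(result)
```

With real CRF models, the two directions differ only when the feature setting is asymmetric, which is the point the abstract makes; here the asymmetry is faked by giving the two toy taggers different rules.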

Mining Skewed and Sparse Transaction Data for Personalized Shopping Recommendation

2004

Abstract. A good shopping recommender system can boost sales in a retail store. To provide accurate recommendations, the recommender needs to accurately predict a customer's preferences, an ability that is difficult to acquire. Conventional data mining techniques, such as association rule mining and collaborative filtering, can generally be applied to this problem, but rarely produce satisfying results due to the skewness and sparsity of transaction data. In this paper, we report the lessons that we learned in two real-world data mining applications for personalized shopping recommendation. We learned that extending a collaborative filtering method based on ratings (e.g., GroupLens) to perform personalized shopping recommendation is not trivial and that it is not appropriate to apply association-rule based methods (e.g., the IBM SmartPad system) for large scale prediction of customers' shopping preferences. Instead, a probabilistic graphical model can be more effective in handling skewed and sparse data. By casting collaborative filtering algorithms in a probabilistic framework, we derived HyPAM (Hybrid Poisson Aspect Modelling), a novel probabilistic graphical model for personalized shopping recommendation. Experimental results show that HyPAM outperforms GroupLens and the IBM method by generating much more accurate predictions of what items a customer will actually purchase in the unseen test data. The data sets and the results are made available for download at

Periodic step-size adaptation in second-order gradient descent for single-pass on-line structured learning

et al. 2009

It has been established that the second-order stochastic gradient descent (SGD) method can potentially achieve generalization performance as well as empirical optimum in a single pass through the training examples. However, second-order SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive for structured prediction problems that usually involve a very high dimensional feature space. This paper presents a new second-order SGD method, called Periodic Step-size Adaptation (PSA). PSA approximates the Jacobian matrix of the mapping function and explores a linear relation between the Jacobian and Hessian to approximate the Hessian, which is proved to be simpler and more effective than directly approximating Hessian in an on-line setting. We tested PSA on a wide variety of models and tasks, including large scale sequence labeling tasks using conditional random fields and large scale classification tasks using linear support vector machines and convolutional neural networks. Experimental results show that single-pass performance of PSA is always very close to empirical optimum.
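For orientation, the baseline that the abstract refers to, a second-order SGD step that scales each stochastic gradient by the inverse Hessian, can be sketched on a one-dimensional least-squares problem. This is an assumption-laden toy, not PSA: the Hessian is a single number computed exactly from the data, the step size 1/t is a common textbook choice, and the function name `second_order_sgd_pass` is hypothetical. PSA's Jacobian-based approximation of the Hessian is not implemented here.

```python
def second_order_sgd_pass(xs, ys, hessian, w=0.0):
    """Single pass of w <- w - (1/t) * H^{-1} * g_t for the loss 0.5 * (w*x - y)**2."""
    for t, (x, y) in enumerate(zip(xs, ys), start=1):
        grad = (w * x - y) * x        # stochastic gradient for one example
        w -= grad / (t * hessian)     # scale by the inverse Hessian and 1/t
    return w


# Noise-free toy data generated with a true weight of 3.0 (assumption).
xs = [1.0, 2.0] * 100
ys = [3.0 * x for x in xs]

# For this loss the Hessian is the mean of x**2; in one dimension its
# inverse is a simple division, which is what becomes prohibitively
# expensive in a very high dimensional feature space.
hessian = sum(x * x for x in xs) / len(xs)

w = second_order_sgd_pass(xs, ys, hessian)
print(w)
```

After one pass over the 200 examples the estimate lands close to the true weight, which is the single-pass behavior the abstract attributes to second-order SGD in general.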

Parameter learning of personalized trust models in broker-based distributed trust management

et al. 2006

Distributed trust management addresses the challenges of eliciting, evaluating and propagating trust for service providers on a distributed network. By delegating trust management to brokers, individual users can share their feedback on services without the overhead of maintaining their own ratings. This research proposes a two-tier trust hierarchy, in which a user relies on her broker to provide a reputation rating for any service provider, while brokers leverage their connected partners in aggregating the reputation of unfamiliar service providers. Each broker collects feedback from its users on past transactions. To accommodate individual differences, personalized trust is modeled with a Bayesian network. Training strategies such as the expectation maximization (EM) algorithm can be deployed to estimate both server reputation and user bias. This paper presents the design and implementation of a distributed trust simulator, which supports experiments under different configurations. In addition, we have conducted experiments to show the following. 1) Personal rating error converges to below 5% consistently within 10,000 transactions regardless of the training strategy or bias distribution. 2) The choice of trust model has a significant impact on the performance of reputation prediction. 3) The two-tier trust framework scales well to distributed environments. In summary, parameter learning of trust models in the broker-based framework enables both aggregation of feedback and personalized reputation prediction.

Distinguishing two-component anomalous Hall effect from topological Hall effect

Tai, Dai, Li, et al. 2021

Preprint
