Dingkun Long scite author profile

Hierarchical text classification is an essential yet challenging subtask of multi-label text classification with a taxonomic hierarchy. Existing methods have difficulty modeling the hierarchical label structure from a global view. Furthermore, they cannot make full use of the mutual interactions between the text feature space and the label space. In this paper, we formulate the hierarchy as a directed graph and introduce hierarchy-aware structure encoders for modeling label dependencies. Based on the hierarchy encoder, we propose a novel end-to-end hierarchy-aware global model (HiAGM) with two variants. A multi-label attention variant (HiAGM-LA) learns hierarchy-aware label embeddings through the hierarchy encoder and conducts inductive fusion of label-aware text features. A text feature propagation model (HiAGM-TP) is proposed as the deductive variant that directly feeds text features into hierarchy encoders. Compared with previous works, both HiAGM-LA and HiAGM-TP achieve significant and consistent improvements on three benchmark datasets.

Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

Ding, Long, Xu et al. 2020

Fully supervised neural approaches have achieved significant progress in the task of Chinese word segmentation (CWS). Nevertheless, the performance of supervised models tends to drop dramatically when they are applied to out-of-domain data. This degradation is caused by the distribution gap across domains and the out-of-vocabulary (OOV) problem. To alleviate both issues simultaneously, this paper proposes coupling distant annotation and adversarial training for cross-domain CWS. For distant annotation, we rethink the essence of "Chinese words" and design an automatic distant annotation mechanism that does not need any supervision or pre-defined dictionaries from the target domain. The approach can effectively discover domain-specific words and distantly annotate the raw texts of the target domain. For adversarial training, we develop a sentence-level training procedure to reduce noise and make maximum use of the source domain information. Experiments on multiple real-world datasets across various domains show the superiority and robustness of our model, which significantly outperforms previous state-of-the-art cross-domain CWS methods.

Learning with Noise: Improving Distantly-Supervised Fine-grained Entity Typing via Automatic Relabeling

Zhang, Long et al. 2020

Fine-grained entity typing (FET) is a fundamental task for various entity-leveraging applications. Although great success has been achieved, existing systems still struggle to handle the noisy samples that distant supervision methods introduce into training data. To address this noise, previous studies either process the clean samples (i.e., those with only one label) and noisy samples (i.e., those with multiple labels) with different strategies, or filter the noisy labels based on the assumption that the distantly-supervised label set certainly contains the correct type label. In this paper, we propose a probabilistic automatic relabeling method that treats all training samples uniformly. Our method aims to estimate the pseudo-truth label distribution of each sample; this distribution is treated as part of the trainable parameters, which are jointly updated during training. The proposed approach does not rely on any prerequisite or extra supervision, making it effective in real applications. Experiments on several benchmarks show that our method outperforms previous approaches and alleviates the noisy labeling problem.

HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking

Zhang, Long, Xu et al. 2022

Preprint

Deep pre-trained language models (e.g., BERT) are effective at large-scale text retrieval tasks. Existing text retrieval systems with state-of-the-art performance usually adopt a retrieve-then-rerank architecture due to the high computational cost of pre-trained language models and the large corpus size. Under such a multi-stage architecture, previous studies mainly focused on optimizing a single stage of the framework to improve the overall retrieval performance. However, how to directly couple multi-stage features for optimization has not been well studied. In this paper, we design Hybrid List Aware Transformer Reranking (HLATR) as a subsequent reranking module that incorporates both retrieval and reranking stage features. HLATR is lightweight and can be easily parallelized with existing text retrieval systems, so that the reranking process can be performed in a single, efficient pass. Empirical experiments on two large-scale text retrieval datasets show that HLATR can efficiently improve the ranking performance of existing multi-stage text retrieval methods.

Recurrent Neural Networks With Finite Memory Length

2019

The workings of recurrent neural networks have not been well understood to date. The construction of such network models, hence, largely relies on heuristics and intuition. This paper formalizes the notion of "memory length" for recurrent networks and consequently discovers a generic family of recurrent networks having maximal memory lengths. Stacking such networks into multiple layers is shown to result in powerful models, including the gated convolutional networks. We show that the structure of such networks potentially enables a more principled design approach in practice and entails no gradient vanishing or exploding during back-propagation. We also present a new example in this family, termed attentive activation recurrent unit (AARU). Experimentally, we demonstrate that the performance of this network family, particularly AARU, is superior to the LSTM and GRU networks.

Index terms: recurrent neural networks, memory length.

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA


Part of the Research Solutions Family.

Dingkun Long

Hierarchy-Aware Global Model for Hierarchical Text Classification

Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

Learning with Noise: Improving Distantly-Supervised Fine-grained Entity Typing via Automatic Relabeling

HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking

Recurrent Neural Networks With Finite Memory Length
