Efficient Strategies for Hierarchical Text Classification: External Knowledge and Auxiliary Tasks

Rojas, Kervy Rivas; Bustamante, Gina; Oncevay, Arturo; Cabezudo, Marco Antonio Sobrevilla

doi:10.18653/v1/2020.acl-main.205

Cited by 17 publications

(15 citation statements)

References 23 publications

“…For example, in [1], the outputs of word-level tasks are fed to the char-level primary task. [99] feeds the output of more general classification models to more specific classification models during training, and the more general classification results are used to optimize beam search of more specific models at test time.…”

Section: Hierarchical (mentioning)

confidence: 99%

“…Similar to the data sampling in Section 3.2, we can assign a task sampling weight r_t for task t, which is also called mixing ratio, to control the frequency of data batches from task t. The most common task scheduling technique is to shuffle between different tasks [5,20,30,33,38,44,51,71,73,79,80,89,93,99,102,108,109,114,118], either randomly or according to a pre-defined schedule. While random shuffling is widely adopted, introducing more heuristics into scheduling could help further improving the performance of MTL models.…”

Section: Task Scheduling (mentioning)

confidence: 99%
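
The mixing-ratio idea in the statement above can be illustrated with a minimal sketch. It assumes two hypothetical tasks named "primary" and "auxiliary" and a hypothetical helper `sample_task_schedule`; neither comes from the cited survey, and this is only one simple way to turn sampling weights into a random batch schedule.

```python
import random
from collections import Counter


def sample_task_schedule(mixing_ratios, num_batches, seed=0):
    """Pick which task supplies each training batch.

    mixing_ratios maps a task name to its sampling weight r_t
    (names and weights here are illustrative assumptions).
    Tasks are drawn independently with probability proportional to r_t.
    """
    rng = random.Random(seed)  # local RNG so the schedule is reproducible
    tasks = list(mixing_ratios)
    weights = [mixing_ratios[t] for t in tasks]
    return rng.choices(tasks, weights=weights, k=num_batches)


# Hypothetical setup: the primary task is sampled about 3x as often.
ratios = {"primary": 3.0, "auxiliary": 1.0}
schedule = sample_task_schedule(ratios, num_batches=1000)
counts = Counter(schedule)
print(counts)
```

A pre-defined schedule, the alternative mentioned in the statement, would replace the random draw with a fixed list of task names.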

“…[9] enhances a text-to-SQL semantic parser by adding explicit condition value detection and value-column mapping as auxiliary tasks. [99] views hierarchical text classification, where each text may have several labels on different levels, as a generation tasks by generating from more general labels to more specific ones, and an auxiliary task of generating in the opposite order is introduced to guide the model to treat high-level and low-level labels more equally and therefore learn more robust representations.…”

Section: Primary Task (mentioning)

confidence: 99%

Multi-Task Learning in Natural Language Processing: An Overview

Chen, Qiang 2021

Preprint

Deep learning approaches have achieved great success in the field of Natural Language Processing (NLP). However, deep neural models often suffer from overfitting and data scarcity problems that are pervasive in NLP tasks. In recent years, Multi-Task Learning (MTL), which can leverage useful information of related tasks to achieve simultaneous performance improvement on multiple related tasks, has been used to handle these problems. In this paper, we give an overview of the use of MTL in NLP tasks. We first review MTL architectures used in NLP tasks and categorize them into four classes, including the parallel architecture, hierarchical architecture, modular architecture, and generative adversarial architecture. Then we present optimization techniques on loss construction, data sampling, and task scheduling to properly train a multi-task model. After presenting applications of MTL in a variety of NLP tasks, we introduce some benchmark datasets. Finally, we make a conclusion and discuss several possible research directions in this field.

“…Compared with one-hot representations, label embeddings have advantages in capturing domain-specific information and importing external knowledge. In the field of text classification (includes the HTC task), researchers propose several forms of label embeddings to encode different kinds of information, such as 1) anchor points (Du et al, 2019), 2) compatibility between labels and words (Huang et al, 2019; Tang et al, 2015), 3) taxonomic hierarchy (Cao et al, 2020; Zhou et al, 2020) and 4) external knowledge (Rivas Rojas et al, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

Concept-Based Label Embedding via Dynamic Routing for Hierarchical Text Classification

Wang, Li, Liu et al. 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer

Hierarchical Text Classification (HTC) is a challenging task that categorizes a textual description within a taxonomic hierarchy. Most of the existing methods focus on modeling the text. Recently, researchers attempt to model the class representations with some resources (e.g., external dictionaries). However, the concept shared among classes which is a kind of domain-specific and fine-grained information has been ignored in previous work. In this paper, we propose a novel concept-based label embedding method that can explicitly represent the concept and model the sharing mechanism among classes for the hierarchical text classification. Experimental results on two widely used datasets prove that the proposed model outperforms several state-of-the-art methods. We release our complementary resources (concepts and definitions of classes) for these two datasets to benefit the research on HTC.

“…In this work, a HMTC model with a label-based attention module is proposed for text classification. Different from Huang et al (2019); Rojas et al (2020) where hierarchical feature extraction is realized by applying general attention over the whole text, LA-HCN is designed to extract key information based on different labels at different hierarchical levels. Comparing with normal attention, label-based attention is more helpful for human understanding on the classification results which makes the model more explainable and interpretable.…”

Section: Introduction (mentioning)

confidence: 99%

LA-HCN: Label-based Attention for Hierarchical Multi-label TextClassification Neural Network

Zhang, Xu, Soh et al. 2020

Preprint

Hierarchical multi-label text classification (HMTC) problems become popular recently because of its practicality. Most existing algorithms for HMTC focus on the design of classifiers, and are largely referred to as local, global, or a combination of local/global approaches. However, a few studies have started exploring hierarchical feature extraction based on the label hierarchy associating with text in HMTC. In this paper, a Neural network-based method called LA-HCN is proposed where a novel Label-based Attention module is designed to hierarchically extract important information from the text based on different labels. Besides, local and global document embeddings are separately generated to support the respective local and global classifications. In our experiments, LA-HCN achieves the top performance on the four public HMTC datasets when compared with other neural network-based state-of-the-art algorithms. The comparison between LA-HCN with its variants also demonstrates the effectiveness of the proposed label-based attention module as well as the use of the combination of local and global classifications. By visualizing the learned attention (words), we find LA-HCN is able to extract meaningful but different information from text based on different labels which is helpful for human understanding and explanation of classification results.
