Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.673

Improving Disentangled Text Representation Learning with Information-Theoretic Guidance

Abstract: Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc. Similar problems have been studied extensively for other forms of data, such as images and videos. However, the discrete nature of natural language makes the disentangling of textual representations more challenging (e.g., the manipulation over the data space cannot be easily achieved). Inspired by information theory, we propose a nove…

Cited by 45 publications (53 citation statements)
References 21 publications (21 reference statements)
“…Because the exact value of Equation (8) is difficult to calculate in practice, we minimize its upper bound following Cheng et al. (2020):…”
Section: Design of Loss Function
confidence: 99%
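The citing work minimizes an upper bound on mutual information rather than the intractable exact value, following Cheng et al. (2020), whose CLUB-style bound contrasts the log-likelihood of paired samples against that of mismatched pairs under a variational conditional q(y|x). A minimal NumPy sketch of that sampled estimate, assuming an illustrative fixed Gaussian q(y|x) = N(Wx, σ²I) (in practice q would be learned):

```python
import numpy as np

def gaussian_logpdf(y, mu, sigma=1.0):
    # log N(y; mu, sigma^2 I), summed over the feature dimension
    d = y.shape[-1]
    return (-0.5 * np.sum((y - mu) ** 2, axis=-1) / sigma**2
            - 0.5 * d * np.log(2 * np.pi * sigma**2))

def club_upper_bound(x, y, W, sigma=1.0):
    """Sampled CLUB-style estimate of an upper bound on I(x; y):
    mean log q(y_i|x_i) over positive pairs minus mean log q(y_j|x_i)
    over all pairs. W and sigma parameterize the variational q."""
    mu = x @ W.T                                         # conditional means, (N, d_y)
    positive = gaussian_logpdf(y, mu, sigma)             # log q(y_i | x_i)
    # negative term: log q(y_j | x_i) for every (i, j) pair via broadcasting
    negative = gaussian_logpdf(y[None, :, :], mu[:, None, :], sigma)  # (N, N)
    return positive.mean() - negative.mean()

rng = np.random.default_rng(0)
W = np.eye(4)
x = rng.normal(size=(256, 4))
y_dep = x + 0.1 * rng.normal(size=(256, 4))   # strongly dependent on x
y_ind = rng.normal(size=(256, 4))             # independent of x
# the dependent pair should receive a much larger MI estimate
print(club_upper_bound(x, y_dep, W) > club_upper_bound(x, y_ind, W))
```

Minimizing this quantity with respect to the encoder pushes the two representations toward independence, which is the disentangling pressure the quoted loss design relies on.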
“…Third, it is possible that one feature detects several patterns (Jacovi et al., 2018), and it will be difficult to disable the feature if some of the detected patterns are useful while the others are harmful. Hence, FIND would be more effective when used together with disentangled text representations (Cheng et al., 2020).…”
Section: Limitations
confidence: 99%
“…The sentiment information is contained in w_t, while the content of the original sentence is represented by O_t. To achieve style transfer, one feeds the original sentence X with the target style label l to get the transferred sentence Y with style l. Following previous work (Hu et al., 2017; Yang et al., 2018; Cheng et al., 2020), we adopt a classifier as the discriminator and the soft-argmax approach (Kusner and Miguel, 2016) for the update of the generator instead of policy gradient (Sutton and Barto, 1998).…”
Section: Extension to Non-parallel Text Style Transfer
confidence: 99%
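The soft-argmax trick mentioned above sidesteps the non-differentiability of sampling a discrete token: instead of indexing the embedding table with argmax, the generator mixes embedding rows with a temperature-sharpened softmax, which approaches the hard argmax embedding as the temperature shrinks while remaining differentiable. A minimal NumPy sketch (names and shapes are illustrative, not the cited systems' code):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_argmax_embedding(logits, embedding, tau=0.1):
    """Differentiable stand-in for embedding[argmax(logits)]:
    a softmax with temperature tau mixes the embedding rows."""
    weights = softmax(logits / tau)
    return weights @ embedding

rng = np.random.default_rng(0)
vocab, dim = 5, 3
E = rng.normal(size=(vocab, dim))            # toy embedding table
logits = np.array([0.1, 2.0, 0.3, -1.0, 0.5])

soft = soft_argmax_embedding(logits, E, tau=0.01)
hard = E[np.argmax(logits)]                  # non-differentiable hard choice
print(np.allclose(soft, hard, atol=1e-3))    # small tau ~ hard argmax
```

Because gradients flow through the softmax weights, the discriminator's classification loss can update the generator directly, avoiding the high-variance policy-gradient estimator the quoted passage contrasts it with.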