2022
DOI: 10.48550/arxiv.2204.00958
Preprint

Long-tailed Extreme Multi-label Text Classification with Generated Pseudo Label Descriptions

Abstract: Extreme Multi-label Text Classification (XMTC) has been a tough challenge in machine learning research and applications due to the sheer sizes of the label spaces and the severe data scarce problem associated with the long tail of rare labels in highly skewed distributions. This paper addresses the challenge of tail label prediction by proposing a novel approach, which combines the effectiveness of a trained bag-of-words (BoW) classifier in generating informative label descriptions under severe data scarce con…

Citations: cited by 1 publication (1 citation statement)
References: 18 publications (36 reference statements)
“…Even though previous techniques have achieved encouraging performance in MLTC, it is still a challenging task due to the long-tailed label distribution (Chang et al 2020; Xiao et al 2021; Zhang et al 2022b). In this case, training classification models for the tail-labels is much more difficult than that for head-labels, which suffer severely from the lack of sufficient training instances.…”
Section: Multi-label Text Classification
Citation type: mentioning
confidence: 99%