Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Rese 2021
DOI: 10.18653/v1/2021.naacl-srw.14
|View full text |Cite
|
Sign up to set email alerts
|

Seed Word Selection for Weakly-Supervised Text Classification with Unsupervised Error Estimation

Abstract: Weakly-supervised text classification aims to induce text classifiers from only a few userprovided seed words. The vast majority of previous work assumes high-quality seed words are given. However, the expert-annotated seed words are sometimes non-trivial to come up with. Furthermore, in the weakly-supervised learning setting, we do not have any labeled document to measure the seed words' efficacy, making the seed word selection process "a walk in the dark". In this work, we remove the need for expert-curated … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Year Published

2022
2022
2023
2023

Publication Types

Select...
3
3

Relationship

0
6

Authors

Journals

citations
Cited by 6 publications
(1 citation statement)
references
References 15 publications
(6 reference statements)
0
0
0
Order By: Relevance
“…The method requires a handful of labeled keywords instead of a large corpus of labeled documents and can be easily transferred to new domains. We further evaluate the weakly-supervised models using unsupervised error estimation and perform automatic keyword selection (Jin et al, 2021a). Unsupervised error estimation is essential because no labeled development dataset is available in real-world problems where weakly-supervised text classification methods are applied.Finally, we tap on a state-of-the-art sequence-to-sequence Transformer model to generate cohesive and diverse advertising slogans from a short company description (Jin et al, In press).…”
mentioning
confidence: 99%
“…The method requires a handful of labeled keywords instead of a large corpus of labeled documents and can be easily transferred to new domains. We further evaluate the weakly-supervised models using unsupervised error estimation and perform automatic keyword selection (Jin et al, 2021a). Unsupervised error estimation is essential because no labeled development dataset is available in real-world problems where weakly-supervised text classification methods are applied.Finally, we tap on a state-of-the-art sequence-to-sequence Transformer model to generate cohesive and diverse advertising slogans from a short company description (Jin et al, In press).…”
mentioning
confidence: 99%