Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval 2020
DOI: 10.1145/3397271.3401107
Enhancing Text Classification via Discovering Additional Semantic Clues from Logograms

Cited by 3 publications (8 citation statements)
References 18 publications
“…TEXTCNN (Kim, 2014) is a classical classifier that uses convolutional neural networks (CNN) with scale-variant convolution filters to capture local textual features, which may potentially capture spurious correlations between certain keywords and categories. LECO (Qian et al., 2020b) utilizes the combination of the implicit encoding of deep linguistic information and the explicit encoding of morphological features, which would also capture the keyword bias inadvertently. Besides, it uses a sentence-level over-sampling mechanism (He and Garcia, 2009) to mitigate the label bias, and we further enhance it via a powerful word-level augmentation technique (EDA) (Wei and Zou, 2019) to mitigate the keyword bias, denoted as LECOEDA.…”
Section: Discussion
confidence: 99%
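The TEXTCNN approach quoted above can be illustrated with a minimal NumPy sketch: filters of several widths slide over the word embeddings, and each feature map is max-pooled over time to yield a fixed-size sentence vector. The function name, random filter weights, and parameter values here are illustrative placeholders, not trained parameters from Kim (2014).

```python
import numpy as np

def textcnn_features(embeddings, filter_widths=(2, 3, 4), n_filters=8, seed=0):
    """Toy sketch of TextCNN-style feature extraction: 1-D convolution
    filters of several widths over word embeddings, then max-over-time
    pooling. `embeddings` has shape (sentence_length, embed_dim)."""
    rng = np.random.default_rng(seed)
    length, dim = embeddings.shape
    pooled = []
    for w in filter_widths:
        # n_filters random filters, each spanning w words x full embed_dim.
        filters = rng.standard_normal((n_filters, w, dim))
        # Valid convolution: one window per position in the sentence.
        windows = np.stack([embeddings[i:i + w] for i in range(length - w + 1)])
        feature_map = np.einsum('pwd,fwd->pf', windows, filters)
        # Max-over-time pooling keeps the strongest response per filter.
        pooled.append(feature_map.max(axis=0))
    return np.concatenate(pooled)  # one fixed-size vector per sentence

# A 10-word "sentence" with 16-dimensional embeddings.
sentence = np.random.default_rng(1).standard_normal((10, 16))
vec = textcnn_features(sentence)
print(vec.shape)  # (24,) = 3 widths x 8 filters each
```

Because the pooled vector's size depends only on the number of filters, sentences of any length map to the same feature dimensionality, which is what lets a downstream classifier consume them.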
“…The core idea of CORSAIR is to train a "poisonous" text classifier regardless of the dataset biases and to post-adjust the biased predictions at inference time according to the causes of those biases. It's worth mentioning that our proposed CORSAIR can be applied to almost any parameterized base model, including traditional one-stage classifiers (e.g., TEXTCNN (Kim, 2014), RCNN (Lai et al., 2015) and LECO (Qian et al., 2020b)) and currently prevalent two-stage classifiers (e.g., ULM-…). Specifically, CORSAIR first trains a base model on the training data directly so as to preserve the dataset biases in the trained model.…”
[Figure 1: The architecture of our proposed model-agnostic framework (CORSAIR).]
Section: Methods
confidence: 99%
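The quoted passage does not spell out CORSAIR's actual adjustment, but the train-then-post-adjust pattern it describes can be illustrated with a common simple instance: leave the biased classifier untouched and, at inference time, subtract the log label prior estimated from the skewed training data. This is only an analogy for the pattern, not CORSAIR's counterfactual method; all names below are hypothetical.

```python
import numpy as np

def post_adjust(log_probs, label_prior, strength=1.0):
    """Illustrative post-hoc correction in the train-then-debias spirit:
    the biased model's log-probabilities are adjusted at inference by
    removing the (scaled) log label prior, then renormalized."""
    adjusted = log_probs - strength * np.log(label_prior)
    # Renormalize so each row is again a proper distribution.
    adjusted -= np.log(np.exp(adjusted).sum(axis=-1, keepdims=True))
    return adjusted

# A classifier trained on skewed data leans toward the majority class.
label_prior = np.array([0.9, 0.1])          # 90% of training labels are class 0
biased = np.log(np.array([[0.55, 0.45]]))   # near-tie, nudged toward class 0
debiased = post_adjust(biased, label_prior)
print(debiased.argmax(axis=-1))  # class 1 wins once the prior is removed
```

The appeal of this family of methods, as the citation notes, is model-agnosticism: the correction operates only on output distributions, so any parameterized base classifier can be plugged in unchanged.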