2019
DOI: 10.1609/aaai.v33i01.33016407
Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification

Abstract: Existing methods for relation classification (RC) rely primarily on distant supervision (DS) because large-scale supervised training datasets are not readily available. Although DS automatically annotates adequate amounts of data for model training, the coverage of this data is still quite limited, and many long-tail relations suffer from data sparsity. Intuitively, people can grasp new knowledge from only a few instances. We thus provide a different view of RC by formalizing it as a few-shot learning problem. […]
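Since the abstract is truncated, the following is a minimal NumPy sketch of the two attention mechanisms the title refers to, as we read them: instance-level attention down-weights noisy support instances when forming a class prototype, and feature-level attention reweights embedding dimensions in the distance computation. The shapes, the softmax weighting, and the variance-based feature weights are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of hybrid attention in a prototypical network:
# instance-level attention (noisy support instances contribute less to the
# prototype) plus feature-level attention (unreliable embedding dimensions
# count less in the distance). All weighting choices here are assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hybrid_attention_scores(support, query):
    """support: (n_way, k_shot, dim) embeddings; query: (dim,) embedding.
    Returns one score per class (higher = closer)."""
    n_way, k_shot, dim = support.shape
    scores = np.empty(n_way)
    for c in range(n_way):
        inst = support[c]                      # (k_shot, dim)
        # Instance-level attention: support instances similar to the
        # query get larger weights, down-weighting noisy instances.
        alpha = softmax(inst @ query)          # (k_shot,)
        prototype = alpha @ inst               # (dim,)
        # Feature-level attention (illustrative heuristic): dimensions
        # with low variance across the support set are more reliable.
        z = 1.0 / (inst.var(axis=0) + 1e-6)    # (dim,)
        z = z / z.sum() * dim
        # Score = negative feature-weighted squared distance.
        scores[c] = -np.sum(z * (query - prototype) ** 2)
    return scores

# Toy usage: 3-way 5-shot with 16-dim sentence embeddings.
rng = np.random.default_rng(0)
support = rng.normal(size=(3, 5, 16))
query = support[1].mean(axis=0) + 0.1 * rng.normal(size=16)
print(np.argmax(hybrid_attention_scores(support, query)))  # expect class 1
```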

Cited by 295 publications (269 citation statements)
References 13 publications
“…There have been emerging research studies that apply the above meta-learning algorithms to NLP tasks, including language modelling (Vinyals et al, 2016), text classification, machine translation (Gu et al, 2018), and relation learning (Xiong et al, 2018; Gao et al, 2019). In this paper, we propose to formulate OOV word representation learning as a few-shot regression problem.…”
Section: Related Work
confidence: 99%
“…Oreshkin et al [26] also learn a task-dependent metric, but condition on the mean of class prototypes, which can reduce the inter-class variation available to their task-conditioning network, and they require an auxiliary co-training loss that our method does not need to realize performance gains. Gao et al [9] applied masks to features in a prototypical network for a few-shot NLP sentence classification task, but base their masks only on examples within each class, not between classes as our method does.…”
Section: Related Work
confidence: 99%
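To make the within-class versus between-class distinction in the statement above concrete, here is a hypothetical sketch of the two masking strategies; the variance-ratio heuristic and the function names are assumptions for illustration, not the formulations of either cited paper.

```python
# Two ways to build a feature mask for a prototypical network.
import numpy as np

def within_class_mask(support_c):
    """Mask from one class's own support set: (k_shot, dim) -> (dim,).
    Low within-class variance is taken to mean a reliable feature."""
    return 1.0 / (support_c.var(axis=0) + 1e-6)

def between_class_mask(support):
    """Mask using all classes: (n_way, k_shot, dim) -> (dim,).
    Favors features where class means are spread apart (between-class
    variance) relative to the average within-class variance."""
    means = support.mean(axis=1)                  # (n_way, dim)
    between = means.var(axis=0)                   # spread of class means
    within = support.var(axis=1).mean(axis=0)     # avg within-class variance
    return between / (within + 1e-6)

rng = np.random.default_rng(1)
support = rng.normal(size=(3, 5, 16))             # 3-way, 5-shot, 16-dim
print(within_class_mask(support[0]).shape)        # (16,)
print(between_class_mask(support).shape)          # (16,)
```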
“…Afterward, the encoder compares the new sample with the prototypes and classifies it into the class with the closest prototype [28]. Previous studies [8, 28] demonstrate that the choice of distance function significantly affects the capacity of prototypical networks, so model performance is vulnerable to the quality of instance representations. However, due to the paucity of instances in FSL, key information may be lost in the noise introduced by the diversity of event mentions.…”
Section: Label Not Seen In Training
confidence: 99%
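As a small illustration of why the distance function matters in the statement above, the sketch below classifies the same query against the same prototypes under Euclidean and cosine distance and gets different answers; the example geometry and function name are ours, not from the cited papers.

```python
# Same prototypes, same query, different predictions under two metrics.
import numpy as np

def classify(prototypes, query, metric="euclidean"):
    """prototypes: (n_way, dim); query: (dim,). Returns predicted class."""
    if metric == "euclidean":
        d = np.linalg.norm(prototypes - query, axis=1)
    elif metric == "cosine":
        # Cosine distance compares direction only, ignoring magnitude.
        p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
        q = query / np.linalg.norm(query)
        d = 1.0 - p @ q
    else:
        raise ValueError(metric)
    return int(np.argmin(d))

protos = np.array([[1.0, 0.0], [4.0, 4.0]])
q = np.array([2.0, 2.0])
print(classify(protos, q, "euclidean"))  # 0: closer in Euclidean distance
print(classify(protos, q, "cosine"))     # 1: perfectly aligned in direction
```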