Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
DOI: 10.18653/v1/2020.emnlp-main.129

MAVEN: A Massive General Domain Event Detection Dataset

Abstract: Event detection (ED), which means identifying event trigger words and classifying event types, is the first and most fundamental step for extracting event knowledge from plain text. Most existing datasets exhibit the following issues that limit further development of ED: (1) Data scarcity. Existing small-scale datasets are not sufficient for training and stably benchmarking increasingly sophisticated modern neural methods. (2) Low coverage. Limited event types of existing datasets cannot well cover general-doma…

Cited by 85 publications
(71 citation statements)
References 49 publications
“…zero-shot learning (Huang et al, 2018), few-shot learning (Lai et al, 2020a,b), or new domains (Naik and Rosé, 2020). The closest works to ours involve recent efforts to create new datasets for EE (Satyapanich et al, 2020; Ebner et al, 2020; Wang et al, 2020; Trong et al, 2020; Le and Nguyen, 2021). However, these works do not consider historical texts as we do.…”
Section: Methods (mentioning)
confidence: 93%
“…OneIE (Lin et al, 2020): This model first identifies spans of entity mentions and event triggers. The detected spans are then paired to jointly predict entity types, event types, relations, and argument roles for IE.…”
Section: Methods (mentioning)
confidence: 99%
“…(1) Overall Evaluation, (2) Few-shot Evaluation, and (3) Zero-shot Evaluation. OntoEvent is established based on two newly proposed datasets for ED: MAVEN (Wang et al, 2020b) and FewEvent (Deng et al, 2020). They are constructed from Wikipedia documents or based on existing event datasets, such as ACE-2005 and TAC-KBP-2017.…”
Section: Methods (mentioning)
confidence: 99%
“…As a non-trivial task, ED suffers from the low-resource issues. On the one hand, the maldistribution of samples is quite serious in ED benchmark datasets, e.g., FewEvent (Deng et al, 2020) and MAVEN (Wang et al, 2020b), where a large portion of event types contain relatively few training instances. As shown in Figure 1, the sample size of two event types Attack and Riot differs greatly (4816 & 30).…”
Section: Introduction (mentioning)
confidence: 99%