2016
DOI: 10.48550/arxiv.1606.06031
Preprint

The LAMBADA dataset: Word prediction requiring a broad discourse context

Cited by 34 publications (49 citation statements)
“…There has been a flurry of work in constructing datasets with an adversarial component, such as Swag (Zellers et al, 2018) and HellaSwag (Zellers et al, 2019), CODAH, Adversarial SQuAD (Jia and Liang, 2017), Lambada (Paperno et al, 2016) and others. Our dataset is not to be confused with abductive NLI (Bhagavatula et al, 2019), which calls itself αNLI, or ART.…”
Section: Related Work
confidence: 99%
“…Real-life event logs present temporally sequential data that is complex and variable and has extensive dependencies due to multiple control flows. Recurrent neural networks, such as LSTM, struggle to reason over long-range sequences due to the limited size of the context vector, as noted in [23]. This paper addresses the problem of PBPM to predict the next activity, event time, and remaining time of a process under execution, i.e., using deep learning to learn the functions Θ_a, Θ_t, and Θ_rt as they are defined in Definitions 3-5.…”
Section: Process Transformer
confidence: 99%
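To make the excerpt's criticism concrete, here is a minimal sketch, assuming PyTorch, of the kind of LSTM baseline it describes: the whole running prefix of a case is compressed into a single context vector, from which three heads approximate Θ_a (next activity), Θ_t (next event time), and Θ_rt (remaining time). The class and parameter names are hypothetical; this is not the cited paper's architecture.

```python
# Hypothetical sketch of an LSTM baseline for PBPM, not the cited paper's model.
import torch
import torch.nn as nn

class PBPMLstm(nn.Module):
    def __init__(self, num_activities: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_activities, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.theta_a = nn.Linear(hidden_dim, num_activities)  # Θ_a: next activity
        self.theta_t = nn.Linear(hidden_dim, 1)               # Θ_t: next event time
        self.theta_rt = nn.Linear(hidden_dim, 1)              # Θ_rt: remaining time

    def forward(self, prefix: torch.Tensor):
        # prefix: (batch, seq_len) integer-coded activity prefix of a running case
        hidden, _ = self.lstm(self.embed(prefix))
        # The entire prefix is squeezed into this one fixed-size context vector,
        # which is the long-range bottleneck the excerpt points to.
        context = hidden[:, -1]
        return self.theta_a(context), self.theta_t(context), self.theta_rt(context)

# Usage: score two prefixes of length 15 over a vocabulary of 10 activities.
model = PBPMLstm(num_activities=10)
logits, next_time, remaining_time = model(torch.randint(0, 10, (2, 15)))
```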
“…These integer representations disregard the intrinsic relationships among events and introduce unrealistic computational requirements due to an increase in data dimensionality [12]. Secondly, LSTMs lack explicit modeling of long- and short-range dependencies, in the sense that their performance degrades in proportion to the length of the event sequences [23]. This is particularly undesirable for event logs because of the interconnections that control flows introduce among activities.…”
Section: Introduction
confidence: 99%
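The dimensionality point is easy to see in code. The sketch below, again assuming PyTorch, contrasts a one-hot view of integer-coded activities, whose width grows with the vocabulary, with a learned embedding of fixed width whose rows can come to reflect relationships among activities during training; the activity names are invented for illustration.

```python
# Hypothetical illustration of one-hot vs. embedded activity encodings.
import torch
import torch.nn as nn
import torch.nn.functional as F

activities = ["register", "review", "approve", "reject", "archive"]  # invented
ids = torch.tensor([0, 1, 2])  # integer codes for three observed events

one_hot = F.one_hot(ids, num_classes=len(activities)).float()
print(one_hot.shape)  # torch.Size([3, 5]): width grows with the vocabulary

embed = nn.Embedding(num_embeddings=len(activities), embedding_dim=4)
print(embed(ids).shape)  # torch.Size([3, 4]): fixed width; learned, dense vectors
```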
“…English Datasets: Machine Reading Comprehension tasks require a machine to answer a question based on the content of a given document. Early MRC datasets are primarily cloze- or span-based, where the answer is simply a span in the document or a few words to fill in a blank; these include CNN/Daily Mail [15], LAMBADA [25], CBT [16], BookTest [2], Who-did-What [24] and CLOTH [41]. The well-known SQuAD dataset [27,26] was the first to introduce human-generated free-form questions, which require the machine to understand natural language in order to select the correct span in Wikipedia pages.…”
Section: Related Work
confidence: 99%
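Since several of these datasets, LAMBADA among them, are cloze-style, a short sketch may clarify the task format: the model must predict a held-out final word of a passage from the preceding context. The snippet assumes the Hugging Face transformers library and GPT-2; the passage is an invented stand-in rather than an actual LAMBADA item, and greedy next-token decoding is only a rough proxy for whole-word prediction.

```python
# Hypothetical cloze-style evaluation sketch; not any cited paper's protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Invented passage: the final word (the cloze target) is held out.
context = ("He shook his head and held his hands up, trying to calm the man "
           "down. \"I never meant to make you")
target = "angry"

with torch.no_grad():
    logits = model(**tokenizer(context, return_tensors="pt")).logits
# Greedy prediction for the next token, decoded back to text.
predicted = tokenizer.decode(logits[0, -1].argmax().item()).strip()
print(predicted, predicted == target)
```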