2016
DOI: 10.48550/arxiv.1606.06031
Preprint

The LAMBADA dataset: Word prediction requiring a broad discourse context

Cited by 34 publications (49 citation statements)
“…There has been a flurry of work in constructing datasets with an adversarial component, such as Swag (Zellers et al, 2018) and HellaSwag (Zellers et al, 2019), CODAH, Adversarial SQuAD (Jia and Liang, 2017), Lambada (Paperno et al, 2016) and others. Our dataset is not to be confused with abductive NLI (Bhagavatula et al, 2019), which calls itself αNLI, or ART.…”
Section: Related Work
confidence: 99%
“…Real-life event logs present temporally sequential data that is complex and variable and has extensive dependencies due to multiple control flows. Recurrent neural networks, such as LSTM, struggle to reason over long-range sequences due to the limited size of the context vector, as noted in [23]. This paper addresses the problem of PBPM to predict the next activity, event time, and remaining time of a process under execution, i.e., using deep learning to learn the functions Θ_a, Θ_t, and Θ_rt as they are defined in Definitions 3-5.…”
Section: Process Transformer
confidence: 99%
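To make the excerpt's criticism concrete, here is a minimal sketch, assuming PyTorch, of the kind of LSTM baseline it describes: the whole running prefix of a case is compressed into a single context vector, from which three heads approximate Θ_a (next activity), Θ_t (next event time), and Θ_rt (remaining time). The class and parameter names are hypothetical; this is not the cited paper's architecture.

```python
# Hypothetical sketch of an LSTM baseline for PBPM, not the cited paper's model.
import torch
import torch.nn as nn

class PBPMLstm(nn.Module):
    def __init__(self, num_activities: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_activities, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.theta_a = nn.Linear(hidden_dim, num_activities)  # Θ_a: next activity
        self.theta_t = nn.Linear(hidden_dim, 1)               # Θ_t: next event time
        self.theta_rt = nn.Linear(hidden_dim, 1)              # Θ_rt: remaining time

    def forward(self, prefix: torch.Tensor):
        # prefix: (batch, seq_len) integer-coded activity prefix of a running case
        hidden, _ = self.lstm(self.embed(prefix))
        # The entire prefix is squeezed into this one fixed-size context vector,
        # which is the long-range bottleneck the excerpt points to.
        context = hidden[:, -1]
        return self.theta_a(context), self.theta_t(context), self.theta_rt(context)

# Usage: score two prefixes of length 15 over a vocabulary of 10 activities.
model = PBPMLstm(num_activities=10)
logits, next_time, remaining_time = model(torch.randint(0, 10, (2, 15)))
```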
“…These integer representations disregard the intrinsic relationships among events and introduce unrealistic computational requirements due to an increase in data dimensionality [12]. Secondly, LSTMs lack explicit modeling of long- and short-range dependencies, in the sense that their performance degrades in proportion to the length of the event sequences [23]. This is particularly undesirable for event logs because of the interconnections that control flows introduce among activities.…”
Section: Introduction
confidence: 99%
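The dimensionality point is easy to see in code. The sketch below, again assuming PyTorch, contrasts a one-hot view of integer-coded activities, whose width grows with the vocabulary, with a learned embedding of fixed width whose rows can come to reflect relationships among activities during training; the activity names are invented for illustration.

```python
# Hypothetical illustration of one-hot vs. embedded activity encodings.
import torch
import torch.nn as nn
import torch.nn.functional as F

activities = ["register", "review", "approve", "reject", "archive"]  # invented
ids = torch.tensor([0, 1, 2])  # integer codes for three observed events

one_hot = F.one_hot(ids, num_classes=len(activities)).float()
print(one_hot.shape)  # torch.Size([3, 5]): width grows with the vocabulary

embed = nn.Embedding(num_embeddings=len(activities), embedding_dim=4)
print(embed(ids).shape)  # torch.Size([3, 4]): fixed width; learned, dense vectors
```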
“…English Datasets: Machine Reading Comprehension tasks require a machine to answer a question based on the content of a given document. Early MRC datasets are primarily cloze- or span-based, where the answer is simply a span in the document or a few words to fill in a blank; these include CNN/Daily Mail [15], LAMBADA [25], CBT [16], BookTest [2], Who-did-What [24] and CLOTH [41]. The well-known SQuAD dataset [27,26] was the first to introduce human-generated free-form questions, which require the machine to understand natural language in order to select the correct span in Wikipedia pages.…”
Section: Related Work
confidence: 99%
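Since several of these datasets, LAMBADA among them, are cloze-style, a short sketch may clarify the task format: the model must predict a held-out final word of a passage from the preceding context. The snippet assumes the Hugging Face transformers library and GPT-2; the passage is an invented stand-in rather than an actual LAMBADA item, and greedy next-token decoding is only a rough proxy for whole-word prediction.

```python
# Hypothetical cloze-style evaluation sketch; not any cited paper's protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Invented passage: the final word (the cloze target) is held out.
context = ("He shook his head and held his hands up, trying to calm the man "
           "down. \"I never meant to make you")
target = "angry"

with torch.no_grad():
    logits = model(**tokenizer(context, return_tensors="pt")).logits
# Greedy prediction for the next token, decoded back to text.
predicted = tokenizer.decode(logits[0, -1].argmax().item()).strip()
print(predicted, predicted == target)
```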