Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.703

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Abstract: We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and other recent pretraining schemes. We evaluate a number of noising approaches,…
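As a rough illustration of the corrupt-then-reconstruct objective described in the abstract, the sketch below feeds a span-masked sentence through a pretrained BART checkpoint and lets the decoder fill it in. It assumes the Hugging Face transformers library and the facebook/bart-large checkpoint, neither of which is part of this page; the masked sentence is an arbitrary example.

```python
# Minimal sketch of BART's denoise-and-reconstruct idea, assuming the
# Hugging Face `transformers` library (the paper itself uses fairseq).
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Corrupt the input: replace a span with the <mask> token (text infilling).
corrupted = "BART is a denoising <mask> for pretraining sequence-to-sequence models."
inputs = tokenizer(corrupted, return_tensors="pt")

# The left-to-right decoder reconstructs the uncorrupted text.
output_ids = model.generate(**inputs, num_beams=5, max_length=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```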

Cited by 3,552 publications (3,370 citation statements)
References 19 publications
“…It has achieved superior performance on machine translation tasks with significantly less training time. Currently, large Transformers [32,133,149], which are pre-trained on a massive text corpus with self-supervised objectives, have achieved superior results in a variety of downstream NLP tasks such as machine understanding [32,84], question answering [27,86], and abstractive text summarization [34,72,85,112,148,153]. Zhang et al [153] demonstrated that their pre-trained encoder-decoder model can outperform previous state-of-the-art results [28,36,44,63,67,99,100,119,121] on several datasets by fine-tuning with limited supervised examples, which shows that pre-trained models are promising candidates for zero-shot and low-resource summarization tasks.…”
Section: Beyond Rnn-based Seq2seq Modelsmentioning
confidence: 99%
“…We use fairseq-py (Ott et al., 2019) to train the QABRIEFER. We use the open-sourced BART model (Lewis et al., 2019) and the suggested fine-tuning hyperparameters, training for 10 epochs and taking the best epoch by validation loss. To generate, we use beam search with beam size 5.…”
Section: Model Detailsmentioning
confidence: 99%
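The decoding step quoted above (beam search with beam size 5) can be sketched roughly as follows, loading the released BART checkpoint through fairseq's torch.hub entry point. The fine-tuned QABRIEFER weights are not available here, so the generic bart.large checkpoint and the input sentence are stand-ins, not the cited setup.

```python
# Hedged sketch of beam-search decoding with a fairseq BART checkpoint;
# the checkpoint name and input text are placeholders for the cited model.
import torch

bart = torch.hub.load('pytorch/fairseq', 'bart.large')
bart.eval()  # disable dropout for inference

source_sentences = ['An example source sentence for the sequence-to-sequence model.']
with torch.no_grad():
    # Beam search with beam size 5, matching the quoted model details.
    hypotheses = bart.sample(source_sentences, beam=5)
print(hypotheses[0])
```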
“…To improve the quality of the encoder, we incorporate large-scale pretraining on millions of sequences of AMR by adopting the generative pretraining approach proposed in Lewis et al. (2019a). This pretraining incorporates various noise operations, such as masking (Devlin et al., 2019), span masking (Fan et al., 2019a), and shuffling.…”
Section: Encoding English Amrmentioning
confidence: 99%
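The three noise operations named in the last quote can be illustrated with the plain-Python sketch below, operating on token and sentence lists. The mask rate, span length, and the <mask> symbol are illustrative choices, not the exact settings of the cited work.

```python
# Illustrative noising functions: token masking, span masking, and shuffling.
import random

MASK = "<mask>"

def mask_tokens(tokens, rate=0.15):
    """Replace individual tokens with <mask> at the given rate."""
    return [MASK if random.random() < rate else t for t in tokens]

def mask_span(tokens, span_len=3):
    """Replace one contiguous span of tokens with a single <mask>."""
    if len(tokens) <= span_len:
        return [MASK]
    start = random.randrange(len(tokens) - span_len)
    return tokens[:start] + [MASK] + tokens[start + span_len:]

def shuffle_sentences(sentences):
    """Permute the order of sentences in a document."""
    shuffled = list(sentences)
    random.shuffle(shuffled)
    return shuffled

tokens = "the encoder sees corrupted text and the decoder reconstructs it".split()
print(mask_tokens(tokens))
print(mask_span(tokens))
print(shuffle_sentences(["First sentence.", "Second sentence.", "Third sentence."]))
```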