2020
DOI: 10.48550/arxiv.2001.04063
Preprint

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

Cited by 37 publications (29 citation statements)
References 21 publications
“…We perform further pre-training on a 160GB unlabeled English corpus, including news, books, stories, and web text. It is similar to the corpora of well-known AR pre-training works such as ProphetNet (Qi et al., 2020) and BART (Lewis et al., 2019). The learning rate is set to 4e-4, with 366k training steps, a batch size of 2048, and distillation weight α = 0.5, on 16 NVIDIA Tesla V100 GPUs with 32GB memory.…”
Section: Pre-training Results
Citation type: mentioning (confidence: 99%)
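The excerpt above reports concrete pre-training hyperparameters. As a minimal sketch, they can be collected into a plain config; the trainer wiring and the exact way the distillation weight α blends the two losses are assumptions for illustration, not the cited work's actual implementation:

```python
# Hedged sketch: only the numeric values below come from the citation
# statement; the loss-blending convention is an assumption.

pretrain_config = {
    "learning_rate": 4e-4,   # reported learning rate
    "total_steps": 366_000,  # 366k pre-training steps
    "batch_size": 2048,      # global batch size
    "distill_alpha": 0.5,    # distillation weight alpha
    "num_gpus": 16,          # 16 x NVIDIA Tesla V100, 32GB memory each
}

def combined_loss(task_loss: float, distill_loss: float,
                  alpha: float = pretrain_config["distill_alpha"]) -> float:
    """One common way a distillation weight alpha is applied: a convex
    blend of the distillation loss and the task loss. Whether the cited
    work combines them exactly this way is an assumption."""
    return alpha * distill_loss + (1.0 - alpha) * task_loss

if __name__ == "__main__":
    # Example: blend a task loss of 2.3 with a distillation loss of 1.1.
    print(combined_loss(task_loss=2.3, distill_loss=1.1))
```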
“…Considering the sequence-to-sequence generation scenario, we denote the input and output sequences as (x, y). A typical neural sequence generation model, e.g., (Lewis et al., 2019; Song et al., 2019; Qi et al., 2020), encodes the input sequence x into a dense representation h (Eqn. 1) and decodes a sequence of tokens as the output y:…”
Section: Non-autoregressive Generation
Citation type: mentioning (confidence: 99%)
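The excerpt truncates before reproducing Eqn. 1. A plausible reconstruction of the standard encoder-decoder factorization it refers to, under the usual seq2seq conventions (the cited paper's exact notation is an assumption), is:

```latex
% Hedged reconstruction of the formulation the excerpt calls "Eqn. 1";
% requires amsmath. The cited paper's notation may differ.
\begin{align}
  h &= \mathrm{Encoder}(x)                                % encode x into h
  \\
  P(y \mid x) &= \prod_{t=1}^{|y|} P(y_t \mid y_{<t}, h)  % autoregressive decoding
  \intertext{whereas a non-autoregressive decoder drops the left-to-right dependency
  and predicts all tokens in parallel:}
  P(y \mid x) &= \prod_{t=1}^{|y|} P(y_t \mid h)
\end{align}
```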
“…This work draws on our rich experience in summarizing meeting conversations (Liu, 2009, 2013; Koay et al., 2020) and building neural abstractive systems (Lebanoff et al., 2019, 2020; Song et al., 2020). We have chosen an abstractive system over its extractive counterpart for this task, as neural abstractive systems have seen significant progress (Raffel et al., 2019; Lewis et al., 2020; Qi et al., 2020). Not only can an abstract accurately convey the content of the podcast, but it is also in a succinct form that is easy to read on a smartphone.…”
Section: Our Summary
Citation type: mentioning (confidence: 99%)