2021
DOI: 10.48550/arxiv.2104.08006
Preprint

ProphetNet-X: Large-Scale Pre-training Models for English, Chinese, Multi-lingual, Dialog, and Code Generation

Abstract: Pre-training techniques are now ubiquitous in the natural language processing field. ProphetNet is a pre-training-based natural language generation method that shows strong performance on English text summarization and question generation tasks. In this paper, we extend ProphetNet to other domains and languages and present the ProphetNet family of pre-training models, named ProphetNet-X, where X can be English, Chinese, Multi-lingual, and so on. We pre-train a cross-lingual generation model, ProphetNet-Multi,…
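
A minimal usage sketch of the kind of generation the abstract describes, assuming the English summarization checkpoint microsoft/prophetnet-large-uncased-cnndm is available through the Hugging Face Transformers library; the checkpoint name and decoding hyperparameters below are illustrative assumptions, not details taken from this report:

from transformers import ProphetNetForConditionalGeneration, ProphetNetTokenizer

# Checkpoint name is an assumption for illustration (English model fine-tuned on CNN/DailyMail).
model_name = "microsoft/prophetnet-large-uncased-cnndm"
tokenizer = ProphetNetTokenizer.from_pretrained(model_name)
model = ProphetNetForConditionalGeneration.from_pretrained(model_name)

article = (
    "ProphetNet is a sequence-to-sequence pre-training model that predicts "
    "future n-grams instead of only the next token."
)
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# Beam-search decoding with illustrative hyperparameters.
summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    num_beams=4,
    max_length=64,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))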

Cited by 11 publications (19 citation statements) · References 40 publications
“…CPM (Zhang et al., 2020c) maintains a similar model architecture as GPT with 2.6 billion parameters. CPM-2 (Zhang et al., 2021) scales up to 11 billion parameters and employs knowledge inheritance from existing models to accelerate the pre-training process. PanGu-α (Zeng et al., 2021) is a huge model, with up to 200 billion parameters.…”
Section: Large-scale Pre-trained Language Models
“…Besides the English version, PLATO-2 has one Chinese dialogue model of 363 million parameters, exhibiting prominent improvements over the classical chatbot of XiaoIce. There are some other Chinese dialogue models on a similar modest scale, including CDial-GPT and ProphetNet-X (Qi et al., 2021). Recently, one Chinese dialogue model of EVA (Zhou et al., 2021) is developed under the architecture of Seq2Seq, with up to 2.8 billion parameters.…”
Section: Pre-trained Dialogue Models