2022
DOI: 10.48550/arxiv.2207.11280
Preprint

PanGu-Coder: Program Synthesis with Function-Level Language Modeling

Abstract: We present PANGU-CODER, a pretrained decoder-only language model adopting the PANGU-α architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PANGU-CODER using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second stage uses a combination of Causal Language Modelling and Masked Language Modelling (MLM) training objectives that focu…
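To make the two-stage objective concrete, the sketch below shows in plain PyTorch how a causal language modelling loss (stage one) could be combined with a masked language modelling loss (stage two) on tokenized description-and-code pairs. It is only an illustration under assumptions: the toy vocabulary size, mask token id, 15% masking rate, and 0.5 loss weighting are placeholders, not PanGu-Coder's published configuration.

import torch
import torch.nn.functional as F

VOCAB_SIZE, MASK_ID, IGNORE = 1000, 3, -100  # toy values for illustration only

def clm_loss(model, tokens):
    # Stage 1 objective (Causal LM): predict each token from the tokens before it.
    logits = model(tokens)                                # (batch, seq, VOCAB_SIZE)
    return F.cross_entropy(logits[:, :-1].reshape(-1, VOCAB_SIZE),
                           tokens[:, 1:].reshape(-1))

def mlm_loss(model, tokens, mask_prob=0.15):
    # Masked LM: corrupt random positions, score the model only where it masked.
    is_masked = torch.rand(tokens.shape) < mask_prob
    corrupted = torch.where(is_masked, torch.full_like(tokens, MASK_ID), tokens)
    targets = torch.where(is_masked, tokens, torch.full_like(tokens, IGNORE))
    logits = model(corrupted)
    return F.cross_entropy(logits.reshape(-1, VOCAB_SIZE),
                           targets.reshape(-1), ignore_index=IGNORE)

def stage2_loss(model, pair_tokens, mlm_weight=0.5):
    # Stage 2 (assumed weighting): mix CLM and MLM losses on a (description, code) pair.
    return clm_loss(model, pair_tokens) + mlm_weight * mlm_loss(model, pair_tokens)

# Usage with a stand-in model: any callable mapping (batch, seq) token ids to
# (batch, seq, VOCAB_SIZE) logits would fit here in place of a real decoder.
toy_model = torch.nn.Sequential(torch.nn.Embedding(VOCAB_SIZE, 64),
                                torch.nn.Linear(64, VOCAB_SIZE))
batch = torch.randint(4, VOCAB_SIZE, (2, 16))
print(stage2_loss(toy_model, batch).item())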

Cited by 7 publications (7 citation statements) | References 24 publications
“…It introduces an input mechanism that employs distinct embeddings for distinct domains, which is coupled with a two-level routing design in the Random Routed Experts (RRE) framework. The pretraining corpus of Pangu-Σ, totalling 329 billion tokens, primarily encompasses diverse data formats of bilingual Chinese-English, content from [24,47,48] and code from [49,50].…”
Section: Pangu-Σ (2023)
mentioning, confidence: 99%
“…Chinese companies adopt an ecosystem-diverse and systematic approach, typically rolling out a series of models to create a holistic technological ecosystem. Examples include Baidu's ERNIE series [164][165][166] of large models and its derivative ERNIEBot, Huawei's PanGu series of models [10,11,49], Alibaba's Tongyi series, and Tencent's HunYuan series. Additionally, Chinese universities actively participate in large model development and research, partnering with tech companies or independently creating multiple large models, with Tsinghua University's CPM [36,37,167] series and GLM [168] being notable instances.…”
Section: Comparative Analysis From The View Of Globalization
mentioning, confidence: 99%
“…Recent research has delved into leveraging pretrained large language models (LLMs) from the natural language processing (NLP) field to automate program synthesis tasks, using vast-scale code corpus data mined from open-source repositories. Notably, there are several prominent examples of such pretrained models, including the encoder-only CodeBERT (Feng et al, 2020), the decoder-only CodeGPT (Lu et al, 2021), CodeGen (Nijkamp et al, 2022), PaLM-Coder (Chowdhery et al, 2022), PanGu-Coder (Christopoulou et al, 2022), CodeGeex (Zheng et al, 2023), and SantaCoder (Allal et al, 2023), as well as encoder-decoder transformer architectures like PLBART (Ahmad et al, 2021) and CodeT5 (Wang et al, 2021). These pretrained probabilistic language (PL) models are already capable of generating code that appears visually impressive and well-structured.…”
Section: Pretrained LLMs For Program Synthesis
mentioning, confidence: 99%
“…AlphaCode, for instance, aspires to address competitive-level programming challenges, while InCoder (Fried et al, 2022) enables code insertion at arbitrary junctures utilizing bidirectional contexts. Other acclaimed models include CodeT5 (Wang et al, 2021), CodeGen (Nijkamp et al, 2022), PaLM-Coder (Chowdhery et al, 2022), PanGu-Coder (Christopoulou et al, 2022), CodeGeex (Zheng et al, 2023), and SantaCoder (Allal et al, 2023). As the size of these LLMs increases, they demonstrate emergent competencies, including human-like programming prowess and debugging aptitude (Saunders et al, 2022).…”
Section: Introduction
mentioning, confidence: 99%
“…F. Xu et al, 2022]. In particular, large language models (LLMs) have been successful in a variety of code generation tasks [Athiwaratkun et al, 2023; Austin et al, 2021; Cassano, Gouwar, et al, 2023; Christopoulou et al, 2022; Izadi et al, 2022; Nijkamp et al, 2023; F. F. Xu et al, 2022].…”
Section: Introduction
mentioning, confidence: 99%