2021
DOI: 10.48550/arxiv.2109.00859
Preprint

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

Abstract: Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently shown to transfer well to Programming Languages (PL) and largely benefit a broad set of code-related tasks. Despite their success, most current methods either rely on an encoder-only (or decoder-only) pre-training that is suboptimal for generation (resp. understanding) tasks or process the code snippet in the same way as NL, neglecting the special characteristics of PL such as token types. We present CodeT5, a unified pre-trained…

Cited by 55 publications (103 citation statements)
References 19 publications (9 reference statements)
“…However, some pretrained models for Natural Languages (NL) like BERT [12] and GPT-3 [8] have recently demonstrated excellent transferability to Programming Languages (PL) and stronger capabilities of capturing semantics information than code2vec or code2seq. Inspired by the success of these language models, pre-trained models of code have recently become more and more popular in the field of code intelligence and benefited a broad range of tasks [9,14,20,26,42,46]. These current pre-trained models of code can be divided into two types: embedding models and generative models.…”
Section: Pre-trained Models of Code
confidence: 99%
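The embedding/generative split described in the statement above can be made concrete with a short sketch. The snippet below is a minimal illustration, assuming the HuggingFace transformers library and the public microsoft/codebert-base and Salesforce/codet5-base checkpoints; it is not taken from the cited works, and since the CodeT5 checkpoint here is only pre-trained (not fine-tuned for any generation task), its decoded output merely demonstrates the interface.

```python
import torch
from transformers import AutoModel, AutoTokenizer, T5ForConditionalGeneration

code = "def add(a, b):\n    return a + b"

# Embedding model (encoder-only): maps the snippet to a fixed-size vector,
# typically fed to retrieval, clone-detection, or classification heads.
emb_tok = AutoTokenizer.from_pretrained("microsoft/codebert-base")
emb_model = AutoModel.from_pretrained("microsoft/codebert-base")
with torch.no_grad():
    hidden = emb_model(**emb_tok(code, return_tensors="pt")).last_hidden_state
code_vector = hidden[:, 0]  # first-token ("CLS"-style) embedding of the snippet

# Generative model (encoder-decoder): maps code to an output sequence,
# e.g. a natural-language description or a transformed program.
gen_tok = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
gen_model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")
with torch.no_grad():
    out_ids = gen_model.generate(
        gen_tok(code, return_tensors="pt").input_ids, max_length=32
    )

print(code_vector.shape)
print(gen_tok.decode(out_ids[0], skip_special_tokens=True))
```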
“…To probe the limits of successful transfer further, we next asked whether pretraining with programming languages, as opposed to natural languages, would also improve generalization in downstream semantic parsing tasks. To test this, we took a recent language model called CodeT5 (denoted by ct5 base in Tables 1-2), which was pretrained predominantly on several different programming languages (Wang et al., 2021). We note that the pretraining data for this model involved some amount of natural language as well, however, so the model was not pretrained exclusively with programming languages (for more details on the pretraining data and the pretraining tasks for this model, please see Wang et al. (2021)).…”
Section: Results
confidence: 99%
“…To test this, we took a recent language model called CodeT5 (denoted by ct5 base in Tables 1-2), which was pretrained predominantly on several different programming languages (Wang et al., 2021). We note that the pretraining data for this model involved some amount of natural language as well, however, so the model was not pretrained exclusively with programming languages (for more details on the pretraining data and the pretraining tasks for this model, please see Wang et al. (2021)). Remarkably, CodeT5 also substantially improved generalization in both SCAN and COGS (Tables 1-2).…”
Section: Results
confidence: 99%
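As a concrete illustration of the setup this citing work describes, the sketch below fine-tunes a code-pretrained encoder-decoder on a single SCAN-style command-to-action pair. It assumes the HuggingFace transformers package and the Salesforce/codet5-base checkpoint; the toy example and hyperparameters are illustrative assumptions, not the citing paper's actual training pipeline.

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

command = "jump twice and walk left"          # natural-language command (input)
actions = "I_JUMP I_JUMP I_TURN_LEFT I_WALK"  # target action sequence (output)

inputs = tokenizer(command, return_tensors="pt")
labels = tokenizer(actions, return_tensors="pt").input_ids

# One supervised fine-tuning step: standard seq2seq cross-entropy loss.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Generalization is then probed by generating on held-out commands.
with torch.no_grad():
    pred = model.generate(**inputs, max_length=32)
print(tokenizer.decode(pred[0], skip_special_tokens=True))
```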
“…The pretraining objectives used include masked language modeling, code structure edges, and representation alignment between source code and code structure. Other pretrained transformers used on source code include CodeT5 (Wang et al., 2021b), CodeTrans (Elnaggar et al., 2021), PyMT5 (Clement et al., 2020), CuBERT (Kanade et al., 2020), PLBART, ProphetNet-X (Qi et al., 2021), CoTexT (Phan et al., 2021), T5-Code (Mastropaolo et al., 2021), GraphCodeBERT, and AlphaCode (Li et al., 2022).…”
Section: Pretrained Transformer Models
confidence: 99%
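Several of the generative models listed in this statement, CodeT5 among them, build on a T5-style masked span prediction objective: contiguous spans of the input are replaced by sentinel tokens on the encoder side, and the decoder is trained to reconstruct the masked-out spans. The dependency-free sketch below shows how such input/target pairs can be built; the sentinel naming and span choices are illustrative assumptions, not any particular model's tokenizer.

```python
def span_corrupt(tokens, spans):
    """Build (encoder_input, decoder_target) for masked span denoising.

    tokens: list of input tokens, e.g. a tokenized code snippet.
    spans:  sorted, non-overlapping (start, end) index pairs to mask.
    """
    enc, dec, cursor = [], [], 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        enc += tokens[cursor:start] + [sentinel]  # keep context, drop the span
        dec += [sentinel] + tokens[start:end]     # target reproduces the span
        cursor = end
    enc += tokens[cursor:]
    dec += [f"<extra_id_{len(spans)}>"]           # closing sentinel
    return enc, dec

code_tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
enc, dec = span_corrupt(code_tokens, [(1, 2), (8, 12)])
print(enc)  # ['def', '<extra_id_0>', '(', 'a', ',', 'b', ')', ':', '<extra_id_1>']
print(dec)  # ['<extra_id_0>', 'add', '<extra_id_1>', 'return', 'a', '+', 'b', '<extra_id_2>']
```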