Using Transfer Learning for Code-Related Tasks

Mastropaolo, Antonio; Cooper, Nathan; Palacio, David N.; Scalabrino, Simone; Poshyvanyk, Denys; Oliveto, Rocco; Bavota, Gabriele

doi:10.48550/arxiv.2206.08574

2022

Preprint

Abstract: Deep learning (DL) techniques have been used to support several code-related tasks such as code summarization and bug-fixing. In particular, pre-trained transformer models are on the rise, also thanks to the excellent results they achieved in Natural Language Processing (NLP) tasks. The basic idea behind these models is to first pre-train them on a generic dataset using a self-supervised task (e.g., filling masked words in sentences). Then, these models are fine-tuned to support specific tasks of interest (e.g…

Cited by 1 publication

References 61 publications

LoGenText-Plus: Improving Neural Machine Translation Based Logging Texts Generation with Syntactic Templates

Ding, Tang, Cheng et al., 2023

ACM Trans. Softw. Eng. Methodol.

Developers insert logging statements in the source code to collect important runtime information about software systems. The textual descriptions in logging statements (i.e., logging texts) are printed during system executions and exposed to multiple stakeholders including developers, operators, users, and regulatory authorities. Writing proper logging texts is an important but often challenging task for developers. Prior studies find that developers spend significant effort modifying their logging texts. However, despite extensive research on automated logging suggestions, research on suggesting logging texts is scarce. To fill this knowledge gap, we first propose LoGenText, reported in our conference paper (Ding et al., 2022), an automated approach that uses neural machine translation models to generate logging texts by translating the related source code into short textual descriptions. LoGenText takes the source code preceding a logging text as input and considers other context information, such as the location of the logging statement, to automatically generate the logging text. The evaluation of LoGenText on 10 open-source projects indicates that the approach is promising for automatic logging text generation and significantly outperforms the state-of-the-art approach. Furthermore, we extend LoGenText to LoGenText-Plus by incorporating the syntactic templates of the logging texts. Unlike LoGenText, LoGenText-Plus decomposes the logging text generation process into two stages. LoGenText-Plus first adopts a neural machine translation model to generate the syntactic template of the target logging text. Then LoGenText-Plus feeds the source code and the generated template as the input to another neural machine translation model for logging text generation. We also evaluate LoGenText-Plus on the same 10 projects and observe that it outperforms LoGenText on nine of them.
According to a human evaluation from developers' perspectives, the logging texts generated by LoGenText-Plus have a higher quality than those generated by LoGenText and the prior baseline approach. By manually examining the generated logging texts, we then identify five aspects that can serve as guidance for writing or generating good logging texts. Our work is an important step towards the automated generation of logging statements, which can potentially save developers' effort and improve the quality of software logging. Our findings shed light on research opportunities that leverage advances in neural machine translation techniques for automated generation and suggestion of logging statements.
