Neural-machine-translation-based commit message generation: how far are we?

Liu, Zhongxin; Xia, Xin; Hassan, Ahmed E.; Lo, David; Xing, Zhenchang; Wang, Xinyu

doi:10.1145/3238147.3238190

Cited by 182 publications

(174 citation statements)

References 48 publications

Supporting

Mentioning

171

Contrasting


“…This is opposite to the conclusion achieved by Liu et al [23]. One possible reason is that the tasks between ours and Liu et al's [23] are different, i.e., Liu et al aim at producing texts based on code, while we focus on generating texts for dialogues and modeling code is different from modeling dialogue texts [53], [54]. The higher BLEU-4 score of the proposed RRGen model than that of the NMT model explains that the response generated by the RRGen model is more similar to developers' response than the response generated by the NMT model.…”

Section: Evaluation Using An Automatic Metric

contrasting

confidence: 99%

“…All the three metrics are rated on a 1-5 scale (5 for fully satisfying the rating scheme, 1 for completely not satisfying the rating scheme, and 3 for the borderline cases), since a 5-point scale is widely used in prior software engineering studies [3], [23], [58]. Besides the three metrics, each participant is asked to rank responses generated by the three tools and those from developers based on their preference.…”

Section: B Survey Design

mentioning

confidence: 99%


Automating App Review Response Generation

Gao, Zeng, Xia, et al. 2019

2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE)

Self Cite


Previous studies showed that replying to a user review usually has a positive effect on the rating that is given by the user to the app. For example, Hassan et al. found that responding to a review increases the chances of a user updating their given rating by up to six times compared to not responding. To alleviate the labor burden in replying to the bulk of user reviews, developers usually adopt a template-based strategy where the templates can express appreciation for using the app or mention the company email address for users to follow up. However, reading a large number of user reviews every day is not an easy task for developers. Thus, there is a need for more automation to help developers respond to user reviews. Addressing the aforementioned need, in this work we propose a novel approach RRGen that automatically generates review responses by learning knowledge relations between reviews and their responses. RRGen explicitly incorporates review attributes, such as user rating and review length, and learns the relations between reviews and corresponding responses in a supervised way from the available training data. Experiments on 58 apps and 309,246 review-response pairs highlight that RRGen outperforms the baselines by at least 67.4% in terms of BLEU-4 (an accuracy measure that is widely used to evaluate dialogue response generation systems). Qualitative analysis also confirms the effectiveness of RRGen in generating relevant and accurate responses.


“…After applying several filters, Jiang et al obtain a set of 32 commits that could be used by NMT1 algorithm [7]. Liu et al made a cleaned version of this dataset by removing the noisy messages [11] (cleaned dataset). The noisy messages are categorized into two categories: (1) The messages generated by development tools (called bot messages).…”

Section: Dataset

mentioning

confidence: 99%

“…A higher BLEU_4 score for a generated message shows that the message is more similar to the one written by the human developer. Liu et al [11] also consider this metric as a textual similarity distance metric used in the second step of NNGen described above.…”

Section: Introduction

mentioning

confidence: 99%

On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation

Etemadi, Monperrus 2020

Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops


Commit messages play an important role in software maintenance and evolution. Nonetheless, developers often do not produce high-quality messages. A number of commit message generation methods have been proposed in recent years to address this problem. Some of these methods are based on neural machine translation (NMT) techniques. Studies show that the nearest neighbor algorithm (NNGen) outperforms existing NMT-based methods, although NNGen is simpler and faster than NMT. In this paper, we show that NNGen does not take advantage of cross-project learning in the majority of the cases. We also show that there is an even simpler and faster variation of the existing NNGen method which outperforms it in terms of the BLEU_4 score without using cross-project learning.

CCS Concepts: Software and its engineering → Software maintenance tools.
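The abstract above describes NNGen as a nearest-neighbour approach, and one of the citing statements notes that BLEU-4 serves as the similarity measure in its second step. A minimal sketch of that idea follows. It assumes a first step that shortlists training diffs by bag-of-words cosine similarity and a shortlist size of k=5; both are assumptions based on the published NNGen description, not on the text here. The function names, the whitespace tokenization, the direction of the BLEU comparison, and the add-one smoothing in `bleu4` are illustrative choices, not the authors' implementation.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ngrams(tokens, n):
    """Multiset of n-grams in a token list (empty if the list is too short)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference) -> float:
    """Simplified sentence-level BLEU-4 with add-one smoothing (an assumption)."""
    if not candidate or not reference:
        return 0.0
    log_precision = 0.0
    for n in range(1, 5):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())  # clipped n-gram matches
        total = sum(cand.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / 4
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(log_precision)


def nearest_message(new_diff: str, corpus, k: int = 5) -> str:
    """Reuse the message of the training diff most similar to new_diff.

    corpus is a list of (diff_text, commit_message) pairs (hypothetical format).
    """
    new_tokens = new_diff.split()
    new_bag = Counter(new_tokens)
    # Step 1: shortlist the k training diffs with the highest cosine similarity.
    top_k = sorted(
        corpus,
        key=lambda pair: cosine(new_bag, Counter(pair[0].split())),
        reverse=True,
    )[:k]
    # Step 2: re-rank the shortlist by BLEU-4 against the new diff.
    best = max(top_k, key=lambda pair: bleu4(pair[0].split(), new_tokens))
    return best[1]
```

No model is trained in this sketch: the output is always a message that already exists in the corpus, which is why such an approach is simpler and faster than an NMT model.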


“…On the other hand, the text content (e.g., class/method/variable names, comments) of programs, which is used in tasks like code recommendation and program comprehension, is often not expressed in a consistent and normative way. For example, a recent study by Liu et al [8] revealed that a large part of the commit messages that are used as references are noisy, for example they may be bot messages generated by tools or trivial messages containing little or redundant information.…”

mentioning

confidence: 99%