Developing effective distributed representations of source code is fundamental yet challenging for many software engineering tasks such as code clone detection, code search, code translation, and code transformation. However, current code embedding approaches that represent the semantics and syntax of code in a mixed way are less interpretable, and the resulting embeddings cannot be easily generalized across programming languages. In this paper, we propose a disentangled code representation learning approach that separates the semantics from the syntax of source code under a multi-programming-language setting, yielding better interpretability and generalizability. Specifically, we design three losses dedicated to the characteristics of source code to enforce the disentanglement effectively. We conduct comprehensive experiments on a real-world dataset of programming exercises, each implemented by multiple solutions that are semantically identical but syntactically distinct. The experimental results validate the superiority of our disentangled code representation over several baselines across three types of downstream tasks, i.e., code clone detection, code translation, and code-to-code search.
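The abstract does not specify the encoder architecture or the exact form of the three losses, so the following is only a minimal sketch of what a disentangled code encoder with separate semantic and syntactic vectors could look like. All names (DisentangledCodeEncoder, to_semantic, to_syntactic, training_losses) and the particular loss terms are hypothetical stand-ins, not the paper's method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledCodeEncoder(nn.Module):
    """Illustrative encoder that splits a pooled code representation into a
    semantic part (meaning, language-agnostic) and a syntactic part
    (surface form, language-specific)."""

    def __init__(self, vocab_size=10000, hidden=256, sem_dim=128, syn_dim=128, num_langs=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.to_semantic = nn.Linear(hidden, sem_dim)
        self.to_syntactic = nn.Linear(hidden, syn_dim)
        self.lang_clf = nn.Linear(syn_dim, num_langs)  # syntax part should reveal the language

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))
        pooled = h.mean(dim=1)
        return self.to_semantic(pooled), self.to_syntactic(pooled)

def training_losses(model, tokens_a, tokens_b, lang_a, lang_b):
    """Three illustrative losses for a pair of solutions to the same exercise
    written in different languages: (1) align their semantic vectors,
    (2) make the syntactic vector predict its programming language,
    (3) keep the two vectors decorrelated."""
    sem_a, syn_a = model(tokens_a)
    sem_b, syn_b = model(tokens_b)
    ce = nn.CrossEntropyLoss()
    l_align = (1 - F.cosine_similarity(sem_a, sem_b)).mean()
    l_lang = ce(model.lang_clf(syn_a), lang_a) + ce(model.lang_clf(syn_b), lang_b)
    l_disent = (sem_a * syn_a).sum(dim=1).abs().mean()
    return l_align + l_lang + l_disent
```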
Locating and fixing bugs is a time-consuming task. Most neural machine translation (NMT) based approaches to automatic bug fixing lack generality and do not make full use of the rich information in source code. In NMT-based bug fixing, we find that some predicted code is identical to the input buggy code (called an unchanged fix), due to the high similarity between buggy and fixed code (e.g., the difference may appear in only one particular line). Obviously, an unchanged fix is not a correct fix, because it is the same as the buggy code that needs to be fixed. Based on these observations, we propose an intuitive yet effective general framework (called Fix-Filter-Fix, or F³) for bug fixing. F³ connects models with a filter mechanism that passes one model's unchanged fixes on to the next model. We also propose an F³ theory that quantitatively and accurately calculates the improvement F³ provides. For evaluation, we implement the Seq2Seq Transformer (ST) and the AST2Seq Transformer (AT) to form two basic F³ instances, called F³_ST+AT and F³_AT+ST. Comparing them with single-model approaches and several model-connection baselines across four datasets validates the effectiveness and generality of F³ and corroborates our findings and methodology.
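As a rough illustration of the filter mechanism described above, the sketch below chains two models and falls back to the second only when the first returns an unchanged fix. The function name and the assumption that each model is a callable mapping a buggy code string to a candidate fix are hypothetical; the paper's actual implementation details are not given in the abstract.

```python
def fix_filter_fix(buggy_code: str, first_model, second_model) -> str:
    """Minimal sketch of the Fix-Filter-Fix idea: if the first model's output is
    identical to the buggy input (an "unchanged fix"), forward the sample to the
    second model; otherwise keep the first model's prediction."""
    candidate = first_model(buggy_code)
    if candidate.strip() == buggy_code.strip():  # filter: unchanged fix detected
        return second_model(buggy_code)          # give the other model a chance
    return candidate
```

Under this reading, F³_ST+AT would use the Seq2Seq Transformer as the first model and the AST2Seq Transformer as the second, and F³_AT+ST would swap their order.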