2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)
DOI: 10.1109/icse43902.2021.00026
Code Prediction by Feeding Trees to Transformers

Abstract: Code prediction, more specifically autocomplete, has become an essential feature in modern IDEs. Autocomplete is more effective when the desired next token is at (or close to) the top of the list of potential completions offered by the IDE at the cursor position. This is where the strength of the underlying machine learning system that produces a ranked order of potential completions comes into play. We advance the state-of-the-art in the accuracy of code prediction (next token prediction) used in autocomplete systems…
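The abstract centers on producing a ranked list of candidate next tokens at the cursor. Below is a minimal, self-contained sketch of that ranking idea using a toy bigram model; it is illustrative only and is not the paper's tree-fed Transformer (the ToyCompleter class and the tiny corpus are made up for this example).

# Minimal sketch of ranked next-token prediction for autocomplete.
# Illustrative toy only: candidates are scored with bigram counts
# gathered from a token corpus, then returned best-first.
from collections import Counter, defaultdict

class ToyCompleter:
    def __init__(self):
        self.bigrams = defaultdict(Counter)

    def train(self, tokens):
        # Count how often each token follows each other token.
        for prev, nxt in zip(tokens, tokens[1:]):
            self.bigrams[prev][nxt] += 1

    def complete(self, prev_token, k=5):
        # Return the top-k candidate next tokens, best first.
        return [tok for tok, _ in self.bigrams[prev_token].most_common(k)]

corpus = ["x", "=", "len", "(", "items", ")", "y", "=", "len", "(", "name", ")"]
model = ToyCompleter()
model.train(corpus)
print(model.complete("="))   # e.g. ['len'] ranked first at the cursor position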

Cited by 127 publications (109 citation statements) | References 38 publications
“…In traditional grammar-based generation of text [7] or code [21,38,2,5], the CFG is followed by sequentially expanding the left-most, bottom-most non-terminal symbol, using one of the production rules in R. GRAMMFORMER changes this and instead selects which (if any) non-terminal symbol to expand. Similar to recent works [38,5,16], GRAMMFORMER loosens the CFG assumptions but retains many aspects, discussed next. Alg.…”
Section: Grammformer
confidence: 52%
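For readers unfamiliar with the expansion strategy this quote contrasts against, here is a minimal sketch of traditional grammar-based generation that always expands the left-most remaining non-terminal using a production rule. The toy grammar and the random rule choice are assumptions for illustration, not GRAMMFORMER's learned policy (which instead selects which non-terminal, if any, to expand).

# Sketch of traditional grammar-based generation: repeatedly expand the
# left-most non-terminal with a (randomly chosen) production rule.
# The toy grammar below is made up for illustration.
import random

GRAMMAR = {
    "EXPR": [["TERM", "+", "TERM"], ["TERM"]],
    "TERM": [["NAME"], ["NUM"]],
    "NAME": [["x"], ["y"]],
    "NUM":  [["0"], ["1"]],
}

def expand_leftmost(symbols, rng=random):
    # Expand until no non-terminal remains, always left-most first.
    while True:
        for i, sym in enumerate(symbols):
            if sym in GRAMMAR:                       # left-most non-terminal
                rule = rng.choice(GRAMMAR[sym])      # pick a production
                symbols = symbols[:i] + rule + symbols[i + 1:]
                break
        else:                                        # no non-terminals left
            return symbols

print(" ".join(expand_leftmost(["EXPR"])))           # e.g. "x + 1"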
“…One of the most successful applications of LMCs is code completion [33,15], and transformer language models have recently shown exceptional performance at this task, being able to predict relatively long sequences of code tokens [34]. Grammar-based code completion and generation has been researched with neural [21,38,16] and non-neural models [5], always expanding the left-most, bottom-most non-terminal. In contrast to GRAMMFORMERs, all these code completion models target the generation of complete code without the ability to create sketches.…”
Section: Related Work
confidence: 99%
“…Abstract Syntax Trees (ASTs) have been used extensively in the literature [6], [7], [14]–[16], [21], [28]. In their work, Zhang et al. [14] use sub-trees extracted from the AST with a tree-based CNN to generate code vectors.…”
Section: Evaluation and Results
confidence: 99%
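As a concrete illustration of AST-based code representations like the ones this quote refers to, the sketch below parses Python source with the standard ast module and counts node types as a crude feature vector; the feature choice is an assumption for illustration, not Zhang et al.'s tree-based CNN over sub-trees.

# Minimal sketch of turning source code into AST-derived features.
# Uses only Python's standard library; counting node types stands in
# for the richer sub-tree features used by tree-based models.
import ast
from collections import Counter

def ast_node_type_counts(source: str) -> Counter:
    # Parse the source and count AST node types as a crude "code vector".
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))

snippet = "def add(a, b):\n    return a + b\n"
print(ast_node_type_counts(snippet))
# e.g. Counter({'arg': 2, 'Name': 2, 'Load': 2, 'Module': 1, 'FunctionDef': 1, ...})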
“…coding [Kim et al, 2021], and VR interaction [David-John et al, 2021]. These approaches, however, have posed problems since their conception because they can be frustrating or detrimental to completing tasks if the computer cannot correctly predict what the user intended [Olteanu et al, 2020; Yang et al, 2020].…”
Section: Dimension 2: Computer Assistance
confidence: 99%