Source code summarization is the task of creating short, natural language descriptions of source code. Code summarization is the backbone of much software documentation such as JavaDocs, in which very brief comments such as "adds the customer object" help programmers quickly understand a snippet of code. In recent years, automatic code summarization has become a high-value target of research, with approaches based on neural networks making rapid progress. However, as we will show in this paper, the production of good summaries relies on the production of the action word in those summaries: the meaning of the example above would be completely changed if "removes" were substituted for "adds." In this paper, we advocate for a special emphasis on action word prediction as an important stepping stone problem towards better code summarization. Current techniques try to predict the action word along with the whole summary, and yet action word prediction on its own is quite difficult. We show the value of the problem for code summaries, explore the performance of current baselines, and provide recommendations for future research.
Virtual assistant technology is rapidly proliferating to improve productivity in a variety of tasks. While several virtual assistants for everyday tasks are well known (e.g., Siri, Cortana, Alexa), assistants for specialty tasks such as software engineering are rarer. One key reason software engineering assistants are rare is that very few experimental datasets are available and suitable for training the AI that is the bedrock of current virtual assistants. In this paper, we present a set of Wizard of Oz experiments that we designed to build a dataset for creating a virtual assistant. Our target is a hypothetical virtual assistant for helping programmers use APIs. In our experiments, we recruited 30 professional programmers to complete programming tasks using two APIs. The programmers interacted with a simulated virtual assistant for help; they were not aware that the assistant was actually operated by human experts. We then annotated the dialogue acts in the corpus along four dimensions: illocutionary intent, API information type(s), backward-facing function, and traceability to specific API components. We observed a diverse range of interactions that will facilitate the development of dialogue strategies for virtual assistants for API usage.