Large APIs can be hard to learn, and this can decrease programmer productivity. But what makes APIs hard to learn? We conducted a mixed-approach, multi-phase study of the obstacles faced by Microsoft developers learning a wide variety of new APIs. The study combined surveys and in-person interviews, collecting the opinions and experiences of over 440 professional developers. We found that some of the most severe obstacles faced by developers learning new APIs pertained to the documentation and other learning resources. We report on the obstacles developers face when learning new APIs, with a special focus on obstacles related to API documentation. Our qualitative analysis elicited five important factors to consider when designing API documentation: documentation of intent; code examples; matching APIs with scenarios; penetrability of the API; and format and presentation. We also analyzed how these factors can be interpreted to prioritize API documentation development efforts.
Prior to performing a software change task, developers must discover and understand the subset of the system relevant to the task. Because the behavior exhibited by individual developers when investigating a software system is influenced by intuition, experience, and skill, there is often significant variability in developer effectiveness. To understand the factors that contribute to effective program investigation behavior, we conducted a study of five developers performing a change task on a medium-size open-source system. We isolated the factors related to effective program investigation behavior by performing a detailed qualitative analysis of the program investigation behavior of successful and unsuccessful developers. We report on these factors as a set of detailed observations, such as evidence of the phenomenon of inattentional blindness among developers skimming source code. In general, our results support the intuitive notion that a methodical and structured approach to program investigation is the most effective.
Many program evolution tasks involve source code that is not modularized as a single unit. Furthermore, the source code relevant to a change task often implements different concerns, or high-level concepts that a developer must consider. Finding and understanding concerns scattered in source code is a difficult task that accounts for a large proportion of the effort of program evolution. One way to mitigate this problem is to produce textual documentation that describes scattered concerns; however, this approach is impractical because it is costly and because, as a program evolves, the documentation becomes inconsistent with the source code. The thesis of this dissertation is that a description of concerns that represents program structures, is linked to source code, and can be produced cost-effectively during program investigation activities can help developers perform software evolution tasks more systematically, and on different versions of a system. To validate the claims of this thesis, we have developed a model for a structure, called a concern graph, that describes concerns in source code in terms of relations between program elements. The model also defines precisely the notion of inconsistency between a concern graph and the corresponding source code, so that it is possible to automatically detect and repair inconsistencies between a description of source code and an actual code base. To experiment with concern graphs, we have developed a tool, called FEAT, that allows developers to iteratively build concern graphs while investigating source code, to view the code related to a concern, and to perform analyses on a concern representation. Using FEAT, we have evaluated the cost and usefulness of concern graphs in a series of case studies involving the evolution of five systems of different sizes and styles. The results show that concern graphs are inexpensive to create during program investigation, can help developers perform program evolution tasks more systematically, and are robust enough to represent concerns across different versions of a system.
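To make the model concrete, the sketch below shows one hypothetical way a concern-graph-like structure could be represented. This is an illustrative simplification, not the actual FEAT data model: the `Relation` kinds, the string element identifiers, and the `existsInProgram` predicate (standing in for a real static-analysis backend) are all assumptions.

```java
import java.util.*;

// Minimal sketch of a concern-graph-like structure (illustrative only).
// A concern is modeled as a set of relations between program elements,
// which makes it possible to check the description against the code base.
public class ConcernGraph {

    // Kinds of structural relations between program elements (assumed set).
    enum Relation { CALLS, ACCESSES, DECLARES }

    // An edge in the concern graph: source element, relation, target element.
    record Edge(String source, Relation relation, String target) {}

    private final Set<Edge> edges = new HashSet<>();

    public void add(String source, Relation relation, String target) {
        edges.add(new Edge(source, relation, target));
    }

    // All program elements participating in the concern.
    public Set<String> elements() {
        Set<String> result = new HashSet<>();
        for (Edge e : edges) {
            result.add(e.source());
            result.add(e.target());
        }
        return result;
    }

    // Report edges that no longer hold in the program. The predicate is a
    // stand-in for querying a static program database of the current version.
    public Set<Edge> inconsistencies(java.util.function.Predicate<Edge> existsInProgram) {
        Set<Edge> stale = new HashSet<>();
        for (Edge e : edges) {
            if (!existsInProgram.test(e)) {
                stale.add(e);
            }
        }
        return stale;
    }
}
```

The property mirrored here is that a concern is recorded as checkable structural relations rather than free text, which is what makes automatic detection (and repair) of inconsistencies with an evolving code base possible.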
Exception-handling mechanisms in modern programming languages provide a means to help software developers build robust applications by separating the normal control flow of a program from the control flow of the program under exceptional situations. Separating the exceptional structure from the code associated with normal operations bears some consequences. One consequence is that developers wishing to improve the robustness of a program must figure out which exceptions, if any, can flow to a point in the program. Unfortunately, in large programs, this exceptional control flow can be difficult, if not impossible, to determine. In this article, we present a model that encapsulates the minimal concepts necessary for a developer to determine exception flow for object-oriented languages that define exceptions as objects. Using these concepts, we describe why exception-flow information is needed to build and evolve robust programs. We then describe Jex, a static analysis tool we have developed to provide exception-flow information for Java systems based on this model. The Jex tool provides a view of the actual exception types that might arise at different program points and of the handlers that are present. Use of this tool on a collection of Java library and application source code demonstrates that the approach can be helpful to support both local and global improvements to the exception-handling structure of a system.
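As a concrete illustration (our example, not one from the article), consider the small Java program below: the catch clause names only the declared supertype `IOException`, while the concrete exception types that can actually reach the handler depend on implementations several calls away. Surfacing those concrete types at each program point is the kind of information a Jex-style analysis provides.

```java
import java.io.*;

// Hypothetical example of why exception flow is hard to see from local
// syntax alone: the catch clause below names IOException, but the concrete
// exception types that can reach it are hidden inside loadConfig().
public class ExceptionFlowDemo {

    static String loadConfig(String path) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(loadConfig("app.conf"));
        } catch (IOException e) {
            // An exception-flow view would report the concrete types that can
            // arrive here, e.g. FileNotFoundException from the FileReader
            // constructor, not just the declared supertype IOException.
            // Unchecked exceptions (e.g. NullPointerException) bypass this
            // handler entirely.
            System.err.println("Could not load config: " + e);
        }
    }
}
```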
Software developers need access to different kinds of information, which is often dispersed among different documentation sources, such as API documentation or Stack Overflow. We present an approach to automatically augment API documentation with "insight sentences" from Stack Overflow: sentences that are related to a particular API type and that provide insight not contained in the API documentation of that type. Based on a development set of 1,574 sentences, we compare the performance of two state-of-the-art summarization techniques as well as a pattern-based approach for insight sentence extraction. We then present SISE, a novel machine-learning-based approach that uses as features the sentences themselves, their formatting, their question, their answer, and their authors, as well as part-of-speech tags and the similarity of a sentence to the corresponding API documentation. With SISE, we were able to achieve a precision of 0.64 and a coverage of 0.7 on the development set. In a comparative study with eight software developers, we found that SISE resulted in the highest number of sentences that were considered to add useful information not found in the API documentation. These results indicate that taking into account the metadata available on Stack Overflow as well as part-of-speech tags can significantly improve unsupervised extraction approaches when applied to Stack Overflow data.
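The following Java sketch suggests the flavor of such feature extraction. It is a hypothetical simplification: the feature names, the metadata inputs, and the word-overlap similarity are our assumptions for illustration; the actual SISE approach also incorporates part-of-speech tags and trains a supervised classifier over its features.

```java
import java.util.*;

// Illustrative sketch of the kind of feature extraction a SISE-like
// classifier might use for candidate Stack Overflow sentences.
public class InsightSentenceFeatures {

    static Map<String, Double> features(String sentence,
                                        boolean containsCodeMarkup,
                                        int answerScore,
                                        int authorReputation,
                                        String apiDocText) {
        Map<String, Double> f = new LinkedHashMap<>();
        f.put("length", (double) sentence.split("\\s+").length);
        f.put("hasCode", containsCodeMarkup ? 1.0 : 0.0);
        f.put("answerScore", (double) answerScore);
        f.put("authorReputation", (double) authorReputation);
        // Similarity to the API documentation: a sentence that merely repeats
        // the documentation adds no insight, so low similarity is informative.
        f.put("docSimilarity", wordOverlap(sentence, apiDocText));
        return f;
    }

    // Jaccard overlap of lower-cased word sets; a crude stand-in for the
    // sentence-to-documentation similarity measure described in the paper.
    static double wordOverlap(String a, String b) {
        Set<String> wa = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\W+")));
        Set<String> wb = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\W+")));
        Set<String> inter = new HashSet<>(wa);
        inter.retainAll(wb);
        Set<String> union = new HashSet<>(wa);
        union.addAll(wb);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }
}
```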
Reading reference documentation is an important part of programming with APIs. Reference documentation complements the API by providing information not obvious from the syntax of the API. To improve the quality of reference documentation and the efficiency with which the relevant information it contains can be accessed, we must first understand its content. We report on a study of the nature and organization of knowledge contained in the reference documentation of the hundreds of APIs provided as part of two major technology platforms: Java SDK 6 and .NET 4.0. Our study involved the development of a taxonomy of knowledge types based on grounded methods and independent empirical validation. Seventeen trained coders used the taxonomy to rate a total of 5,574 randomly sampled documentation units to assess the knowledge they contain. Our results provide a comprehensive perspective on the patterns of knowledge in API documentation: observations about the types of knowledge it contains and how this knowledge is distributed throughout the documentation. The taxonomy and patterns of knowledge we present in this paper can be used to help practitioners evaluate the content of their API documentation, better organize their documentation, and limit the amount of low-value content. They also provide a vocabulary that can help structure and facilitate discussions about the content of APIs.