Despite significant existing empirical work, little is known about the specific kinds of questions programmers ask when evolving a code base. Understanding precisely what information a programmer needs about the code base as they work is key to determining how to better support the activity of programming. The goal of this research is to provide an empirical foundation for tool design based on an exploration of what programmers need to understand about a code base and of how they use tools to discover that information. To this end, we ...

1 Introduction
1.1 Overview of Our Data Collection and Analysis
1.2 Overview of Contributions
1.3 Overview of Dissertation Contents
2 Related Work
2.1 Program Comprehension
2.1.1 Cognitive Models
2.1.2 Informing Tool Design
2.1.3 Analysis of Questions
2.2 Empirical Studies of Change Tasks
2.3 Summary

6.2 Eclipse Search Results View
6.3 SHriMP and Relo Visualizations
6.4 Eclipse Call Hierarchy
6.5 Sample output from the diff command line tools
7.1 Mockup of a search tool illustrating results
7.2 Distribution of question occurrences across studies by category

... Research Council of Canada (NSERC), IBM and Intel.

Dedication
To Christina Jolayne for her patient and enthusiastic support through far too many years of school.

Chapter 1
... subroutine/procedure invoked? Johnson and Erdem studied questions asked to experts on a newsgroup [47]. Similarly, little is known about how programmers answer their questions and the role of tools in that process. Our aim is to build on this body of existing work with the goal of providing an empirical foundation for tool design based on an exploration of what programmers need to understand and of how they use tools to discover that information. We focus, in particular, on what programmers need to understand about a code base while performing a nontrivial change task to a software system. To this end, we undertook two qualitative studies.
In each of these studies we observed programmers making source changes to medium-sized (20 KLOC) to large (over 1 million LOC) code bases. Based on a systematic analysis of the data from ...

We collected data from two studies: the first was carried out in a laboratory setting [85] and the second in an industrial work setting [84]. Both were observational studies to which we applied qualitative analysis. We refer to the participants in the first study (N1 ... N9) as newcomers, as they were working on a code base that was new to them. All nine participants in the first study were computer science graduate students with varying amounts of previous development experience, including experience with the Java programming language. In this first study, pairs of programmers performed change tasks, assigned by the experimenter, on a moderately sized open-source system. We chose to study pairs of programmers because we believed that the discussion between the pair as they worked on the change task would allow us to learn what information they were looking for and why particular action...
Though many tools are available to help programmers working on change tasks, and several studies have been conducted to understand how programmers comprehend systems, little is known about the specific kinds of questions programmers ask when evolving a code base. To fill this gap, we conducted two qualitative studies of programmers performing change tasks on medium- to large-sized programs. One study involved newcomers working on assigned change tasks on a medium-sized code base. The other study involved industrial programmers working on their own change tasks on code with which they had experience. Our analysis focuses on what information a programmer needs to know about a code base while performing a change task, and on how they go about discovering that information. Based on this analysis, we catalog and categorize 44 different kinds of questions asked by our participants. We also describe important context for how our participants answered those questions, including their use of tools.