Stas Negara scite author profile

Abstract. Researchers use file-based Version Control System (VCS) as the primary source of code evolution data. VCSs are widely used by developers, thus, researchers get easy access to historical data of many projects. Although it is convenient, research based on VCS data is incomplete and imprecise. Moreover, answering questions that correlate code changes with other activities (e.g., test runs, refactoring) is impossible. Our tool, CodingTracker, non-intrusively records fine-grained and diverse data during code development. CodingTracker collected data from 24 developers: 1,652 hours of development, 23,002 committed files, and 314,085 testcase runs. This allows us to answer: How much code evolution data is not stored in VCS? How much do developers intersperse refactorings and edits in the same commit? How frequently do developers fix failing tests by changing the test itself? How many changes are committed to VCS without being tested? What is the temporal and spacial locality of changes?

show abstract

Mining fine-grained code changes to detect unknown change patterns

Negara

Codoban

Dig

et al. 2014

View full text Add to dashboard Cite

Identifying repetitive code changes benefits developers, tool builders, and researchers. Tool builders can automate the popular code changes, thus improving the productivity of developers. Researchers can better understand the practice of code evolution, advancing existing code assistance tools and benefiting developers even further. Unfortunately, existing research either predominantly uses coarse-grained Version Control System (VCS) snapshots as the primary source of code evolution data or considers only a small subset of program transformations of a single kind -refactorings.We present the first approach that identifies previously unknown frequent code change patterns from a fine-grained sequence of code changes. Our novel algorithm effectively handles challenges that distinguish continuous code change pattern mining from the existing data mining techniques. We evaluated our algorithm on 1,520 hours of code development collected from 23 developers, and showed that it is effective, useful, and scales to large amounts of data. We analyzed some of the mined code change patterns and discovered ten popular kinds of high-level program transformations. More than half of our 420 survey participants acknowledged that eight out of ten transformations are relevant to their programming activities.

show abstract

An empirical evaluation and comparison of manual and automated test selection

Gligoric

Negara

Legunsen

et al. 2014

View full text Add to dashboard Cite

Regression test selection speeds up regression testing by rerunning only the tests that can be affected by the most recent code changes. Much progress has been made on research in automated test selection over the last three decades, but it has not translated into practical tools that are widely adopted. Therefore, developers either re-run all tests after each change or perform manual test selection. Re-running all tests is expensive, while manual test selection is tedious and error-prone. Despite such a big trade-off, no study assessed how developers perform manual test selection and compared it to automated test selection.This paper reports on our study of manual test selection in practice and our comparison of manual and automated test selection. We are the first to conduct a study that (1) analyzes data from manual test selection, collected in real time from 14 developers during a three-month study and (2) compares manual test selection with an automated state-of-the-research test-selection tool for 450 test sessions.Almost all developers in our study performed manual test selection, and they did so in mostly ad-hoc ways. Comparing manual and automated test selection, we found the two approaches to select different tests in each and every one of the 450 test sessions investigated. Manual selection chose more tests than automated selection 73% of the time (potentially wasting time) and chose fewer tests 27% of the time (potentially missing bugs). These results show the need for better automated test-selection techniques that integrate well with developers' programming environments.

show abstract

Automatic MPI to AMPI Program Transformation Using Photran

Negara

Zheng

Pan

et al. 2011

View full text Add to dashboard Cite

Abstract. Adaptive MPI, or AMPI, is an implementation of the Message Passing Interface (MPI) standard. AMPI benefits MPI applications with features such as dynamic load balancing, virtualization, and checkpointing. Because AMPI uses multiple user-level threads per physical core, global variables become an obstacle. It is thus necessary to convert MPI programs to AMPI by eliminating global variables. Manually removing the global variables in the program is tedious and error-prone. In this paper, we present a Photran-based tool that automates this task with a source-to-source transformation that supports Fortran. We evaluate our tool on the multi-zone NAS Benchmarks with AMPI. We also demonstrate the tool on a real-world large-scale FLASH code and present preliminary results of running FLASH on AMPI. Both results show significant performance improvement using AMPI. This demonstrates that the tool makes using AMPI easier and more productive.

show abstract

Inferring ownership transfer for efficient message passing

2011

View full text Add to dashboard Cite

One of the more popular paradigms for concurrent programming is the Actor model of message passing; it has been adopted in one form or another by a number of languages and frameworks. By avoiding a shared local state and instead relying on message passing, the Actor model facilitates modular programming. An important challenge for message passing languages is to transmit messages efficiently. This requires retaining the pass-by-value semantics of messages while avoiding making a deep copy on sequential or shared memory multicore processors. A key observation is that many messages have an ownership transfer semantics; such messages can be sent efficiently using pointers without introducing shared state between concurrent objects. We propose a conservative static analysis algorithm which infers if the content of a message is compatible with an ownership transfer semantics. Our tool, called SOTER (for Safe Ownership Transfer enablER) transforms the program to avoid the cost of copying the contents of a message whenever it can infer the content obeys the ownership transfer semantics. Experiments using a range of programs suggest that our conservative static analysis method is usually able to infer ownership transfer. Performance results demonstrate that the transformed programs execute up to an order of magnitude faster than the original programs.

show abstract

Inferring ownership transfer for efficient message passing

Negara

Karmani

Agha

2011

View full text Add to dashboard Cite

One of the more popular paradigms for concurrent programming is the Actor model of message passing; it has been adopted in one form or another by a number of languages and frameworks. By avoiding a shared local state and instead relying on message passing, the Actor model facilitates modular programming. An important challenge for message passing languages is to transmit messages efficiently. This requires retaining the pass-by-value semantics of messages while avoiding making a deep copy on sequential or shared memory multicore processors. A key observation is that many messages have an ownership transfer semantics; such messages can be sent efficiently using pointers without introducing shared state between concurrent objects. We propose a conservative static analysis algorithm which infers if the content of a message is compatible with an ownership transfer semantics. Our tool, called SOTER (for Safe Ownership Transfer enablER 1 ) transforms the program to avoid the cost of copying the contents of a message whenever it can infer the content obeys the ownership transfer semantics. Experiments using a range of programs suggest that our conservative static analysis method is usually able to infer ownership transfer. Performance results demonstrate that the transformed programs execute up to an order of magnitude faster than the original programs.

show abstract

A Compositional Paradigm of Automating Refactorings

Vakilian

Chen

Moghaddam

et al. 2013

View full text Add to dashboard Cite

12 3

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

334 Leonard St

Brooklyn, NY 11211

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Stas Negara

A Comparative Study of Manual and Automated Refactorings

Is It Dangerous to Use Version Control Histories to Study Source Code Evolution?

Mining fine-grained code changes to detect unknown change patterns

An empirical evaluation and comparison of manual and automated test selection

Automatic MPI to AMPI Program Transformation Using Photran

Inferring ownership transfer for efficient message passing

Inferring ownership transfer for efficient message passing

A Compositional Paradigm of Automating Refactorings

Contact Info

Product

Resources

About