Marcelo de Almeida Maia scite author profile

Well-designed and publicly available datasets of bugs are an invaluable asset to advance research fields such as fault localization and program repair as they allow directly and fairly comparison between competing techniques and also the replication of experiments. These datasets need to be deeply understood by researchers: the answer for questions like "which bugs can my technique handle?" and "for which bugs is my technique effective?" depends on the comprehension of properties related to bugs and their patches. However, such properties are usually not included in the datasets, and there is still no widely adopted methodology for characterizing bugs and patches. In this work, we deeply study 395 patches of the Defects4J dataset. Quantitative properties (patch size and spreading) were automatically extracted, whereas qualitative ones (repair actions and patterns) were manually extracted using a thematic analysisbased approach. We found that 1) the median size of Defects4J patches is four lines, and almost 30% of the patches contain only addition of lines; 2) 92% of the patches change only one file, and 38% has no spreading at all; 3) the top-3 most applied repair actions are addition of method calls, conditionals, and assignments, occurring in 77% of the patches; and 4) nine repair patterns were found for 95% of the patches, where the most prevalent, appearing in 43% of the patches, is on conditional blocks. These results are useful for researchers to perform advanced analysis on their techniques' results based on Defects4J. Moreover, our set of properties can be used to characterize and compare different bug datasets.

show abstract

BEARS: An Extensible Java Bug Benchmark for Automatic Program Repair Studies

Madeiral

Urli

Maia

et al. 2019

View full text Add to dashboard Cite

Benchmarks of bugs are essential to empirically evaluate automatic program repair tools. In this paper, we present BEARS, a project for collecting and storing bugs into an extensible bug benchmark for automatic repair studies in Java. The collection of bugs relies on commit building state from Continuous Integration (CI) to find potential pairs of buggy and patched program versions from open-source projects hosted on GitHub. Each pair of program versions passes through a pipeline where an attempt of reproducing a bug and its patch is performed. The core step of the reproduction pipeline is the execution of the test suite of the program on both program versions. If a test failure is found in the buggy program version candidate and no test failure is found in its patched program version candidate, a bug and its patch were successfully reproduced. The uniqueness of Bears is the usage of CI (builds) to identify buggy and patched program version candidates, which has been widely adopted in the last years in open-source projects. This approach allows us to collect bugs from a diversity of projects beyond mature projects that use bug tracking systems. Moreover, BEARS was designed to be publicly available and to be easily extensible by the research community through automatic creation of branches with bugs in a given GitHub repository, which can be used for pull requests in the BEARS repository. We present in this paper the approach employed by BEARS, and we deliver the version 1.0 of BEARS, which contains 251 reproducible bugs collected from 72 projects that use the Travis CI and Maven build environment.

show abstract

Ranking crowd knowledge to assist software development

Souza

Campos

Maia

2014

View full text Add to dashboard Cite

StackOverflow.com (SO) is a Question and Answer service oriented to support collaboration among developers in order to help them solving their issues related to software development. In SO, developers post questions related to a programming topic and other members of the site can provide answers to help them. The information available on this type of service is also known as "crowd knowledge" and currently is one important trend in supporting activities related to software development and maintenance.We present an approach that makes use of "crowd knowledge" available in SO to recommend information that can assist developers in their activities. This strategy recommends a ranked list of pairs of questions/answers from SO based on a query (list of terms). The ranking criteria is based on two main aspects: the textual similarity of the pairs with respect to the query (the developer's problem) and the quality of the pairs. Moreover, we developed a classifier to consider only "how-to" posts. We conducted an experiment considering programming problems on three different topics (Swing, Boost and LINQ) widely used by the software development community to evaluate the proposed recommendation strategy. The results have shown that for 77.14% of the assessed activities, at least one recommended pair proved to be useful concerning the target programming problem. Moreover, for all activities, at least one recommended pair had a source code snippet considered reproducible or almost reproducible.

show abstract

Scalable media streaming to interactive users

Rocha

Maia

Cunha

et al. 2005

View full text Add to dashboard Cite

Recently, a number of scalable stream sharing protocols have been proposed with the promise of great reductions in the server and network bandwidth required for delivering popular media content. Although the scalability of these protocols has been evaluated mostly for sequential user accesses, a high degree of interactivity has been observed in the accesses to several real media servers. Moreover, some studies have indicated that user interactivity can severely penalize the scalability of stream sharing protocols.This paper investigates alternative mechanisms for scalable streaming to interactive users. We first identify a set of workload aspects that are determinant to the scalability of classes of streaming protocols. Using real workloads and a new interactive media workload generator, we build a rich set of realistic synthetic workloads. We evaluate Bandwidth Skimming and Patching, two state-of-the-art streaming protocols, covering, with our workloads, a larger region of the design space than previous work. Finally, we propose and evaluate five optimizations to Bandwidth Skimming, the most scalable of the two protocols. Our best optimization reduces the average server bandwidth required for interactive workloads in up to 54%, for unlimited client buffers, and 29%, if buffers are constrained to 25% of media size.

show abstract

A Systematic Literature Review on Bad Smells–5 W's: Which, When, What, Who, Where

Sobrinho

Lucia

Maia

2021

IIEEE Trans. Software Eng.

View full text Add to dashboard Cite

Recommending Comprehensive Solutions for Programming Tasks by Mining Crowd Knowledge

Silva

Roy

Rahman

et al. 2019

View full text Add to dashboard Cite

Developers often search for relevant code examples on the web for their programming tasks. Unfortunately, they face two major problems. First, the search is impaired due to a lexical gap between their query (task description) and the information associated with the solution. Second, the retrieved solution may not be comprehensive, i.e., the code segment might miss a succinct explanation. These problems make the developers browse dozens of documents in order to synthesize an appropriate solution. To address these two problems, we propose CROKAGE (Crowd Knowledge Answer Generator), a tool that takes the description of a programming task (the query) and provides a comprehensive solution for the task. Our solutions contain not only relevant code examples but also their succinct explanations. Our proposed approach expands the task description with relevant API classes from Stack Overflow Q&A threads and then mitigates the lexical gap problems. Furthermore, we perform natural language processing on the top quality answers and then return such programming solutions containing code examples and code explanations unlike earlier studies. We evaluate our approach using 97 programming queries, of which 50% was used for training and 50% was used for testing, and show that it outperforms six baselines including the state-of-art by a statistically significant margin. Furthermore, our evaluation with 29 developers using 24 tasks (queries) confirms the superiority of CROKAGE over the state-of-art tool in terms of relevance of the suggested code examples, benefit of the code explanations and the overall solution quality (code + explanation).

show abstract

Diretrizes para manutenção de múltiplos órgãos no potencial doador adulto falecido: Parte III. Recomendações órgãos específicas

Westphal¹,

Filho²,

Vieira³

et al. 2011

Rev. bras. ter. intensiva

View full text Add to dashboard Cite

Brain death (BD) alters the pathophysiology of patients and may damage the kidneys, the lungs, the heart and the liver. To obtain better quality transplant organs, intensive care physicians in charge of the maintenance of deceased donors should attentively monitor these organs. Careful hemodynamic, ventilatory and bronchial clearance management minimizes the loss of kidneys and lungs. The evaluation of cardiac function and morphology supports the transplant viability assessment of the heart. The monitoring of liver function, the management of the patient's metabolic status and the evaluation of viral serology are fundamental for organ selection by the transplant teams and for the care of the transplant recipient. OBJECTIVEThese guidelines are aimed at contributing to the institutional coordination of organ transplantation and will provide "real world" guidelines that are appropriate in the Brazilian context for the uniform care of the deceased donor. Ultimately, this aim of this guide is to increase the quality and quantity of transplantable organs. METHODOLOGYThe Writing and Planning Committee, comprised of young intensive care physicians and intensive medicine residents, conducted an extensive literature review. From this review, they formulated questions and forwarded the questions to all of the authors of this article. These initial questions served as the starting point for receiving suggestions for the formulation of other questions and definitions.The final questions were revised by the Executive Committee and were returned to the authors to develop the guidelines presented in this article.The questions guided the literature review, which was conducted using the P.I.C.O. methodology where P stands for the target population, I for the intervention, C for the control or comparative group and O for the clinical outcome.The retrieved articles were critically analyzed and categorized according to their grade of recommendation and the strength of the presented evidence in the following manner:

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

334 Leonard St

Brooklyn, NY 11211

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.