Guillaume Haben scite author profile

Guillaume Haben

4 Publications

15 Citation Statements Received

158 Citation Statements Given


Affiliations

University of Luxembourg

Publications


A Replication Study on the Usability of Code Vocabulary in Predicting Flaky Tests

Haben, Habchi, Papadakis, et al. 2021


Industrial reports indicate that flaky tests are one of the primary concerns of software testing, mainly due to the false signals they provide. To deal with this issue, researchers have developed tools and techniques that aim to identify flaky tests automatically, with encouraging results. However, to reach industrial adoption and practice, these techniques need to be replicated and evaluated extensively on multiple datasets, occasions and settings. In view of this, we perform a replication study of a recently proposed method that predicts flaky tests based on their vocabulary. We replicate the original study along three dimensions. First, we replicate the approach on the same subjects as in the original study but using a different evaluation methodology, i.e., we adopt a time-sensitive selection of training and test sets to better reflect the envisioned use case. Second, we consolidate the findings of the initial study by building a new dataset of 837 flaky tests from 9 projects in a different programming language, namely Python, whereas the original study targeted Java, which supports the generalisability of the results. Third, we propose an extension to the original approach by experimenting with different features extracted from the Code Under Test. We find that a more robust validation consistently yields lower performance than that reported in the original study but, fortunately, the model remains capable of predicting flaky tests reasonably well. We also find reassuring evidence that vocabulary-based models can be used to predict test flakiness in Python. Finally, we find that the information contained in the Code Under Test has a limited impact on the performance of the vocabulary-based models.
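The time-sensitive selection of training and test sets mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the exact protocol, so everything here is an assumption: the `timestamp` field, the `train_fraction` parameter, and the function name are hypothetical. The idea shown is only that the model is trained on older tests and evaluated on newer ones, instead of on a random shuffle.

```python
def time_sensitive_split(samples, train_fraction=0.8):
    """Split samples chronologically: the oldest go to training, the newest to testing.

    `samples` is a list of dicts with a hypothetical "timestamp" key
    (any sortable value, e.g. a commit date as an integer).
    """
    ordered = sorted(samples, key=lambda s: s["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical test records; names and labels are made up for illustration.
samples = [
    {"name": "test_c", "timestamp": 3, "flaky": True},
    {"name": "test_a", "timestamp": 1, "flaky": False},
    {"name": "test_d", "timestamp": 4, "flaky": False},
    {"name": "test_b", "timestamp": 2, "flaky": True},
]
train, test = time_sensitive_split(samples, train_fraction=0.5)
```

With such a split, no test in the training set is newer than any test in the evaluation set, which is closer to how a predictor would be used on an evolving project.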


A Qualitative Study on the Sources, Impacts, and Mitigation Strategies of Flaky Tests

Habchi, Haben, Papadakis, et al. 2022


What Made This Test Flake? Pinpointing Classes Responsible for Test Flakiness

Habchi, Haben, Sohn, et al. 2022

Preprint


Flaky tests are defined as tests that manifest non-deterministic behaviour by passing and failing intermittently for the same version of the code. These tests cripple continuous integration with false alerts that waste developers' time and break their trust in regression testing. To mitigate the effects of flakiness, both researchers and industrial experts have proposed strategies and tools to detect and isolate flaky tests. However, flaky tests are rarely fixed, as developers struggle to localise and understand their causes. Additionally, developers working with large codebases often need to know the sources of non-determinism to preserve code quality, i.e., to avoid introducing technical debt linked with non-deterministic behaviour, and to avoid introducing new flaky tests. To aid with these tasks, we propose re-targeting Fault Localisation techniques to the flaky component localisation problem, i.e., pinpointing program classes that cause the non-deterministic behaviour of flaky tests. In particular, we employ Spectrum-Based Fault Localisation (SBFL), a coverage-based fault localisation technique commonly adopted for its simplicity and effectiveness. We also utilise other data sources, such as change history and static code metrics, to further improve the localisation. Our results show that augmenting SBFL with change and code metrics ranks flaky classes in the top-1 and top-5 suggestions in 26% and 47% of the cases, respectively. Overall, we successfully reduced the average number of classes inspected to locate the first flaky class to 19% of the total number of classes covered by flaky tests. Our results also show that localisation methods are effective in major flakiness categories, such as concurrency and asynchronous waits, indicating their general ability to identify flaky components.
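As a rough illustration of how a coverage-based ranking of this kind can work, the sketch below scores classes with the Ochiai formula, one commonly used SBFL metric. The abstract does not say which formula is used, so Ochiai is an assumption here, as is treating flaky tests as the "failing" side of the spectrum and stable tests as the "passing" side. The function names and the input format are hypothetical, and the change-history and code-metric features mentioned in the abstract are not modelled.

```python
import math


def ochiai(covered_flaky, covered_stable, total_flaky):
    """Ochiai suspiciousness: ef / sqrt(total_failing * (ef + ep)).

    Here "failing" is assumed to mean flaky tests and "passing" stable tests.
    """
    denom = math.sqrt(total_flaky * (covered_flaky + covered_stable))
    return covered_flaky / denom if denom else 0.0


def rank_classes(coverage, flaky_tests):
    """Rank classes by suspiciousness, most suspicious first.

    `coverage` maps a test name to the set of class names it covers;
    `flaky_tests` is the set of test names known to be flaky.
    """
    total_flaky = sum(1 for test in coverage if test in flaky_tests)
    counts = {}  # class name -> [covering flaky tests, covering stable tests]
    for test, classes in coverage.items():
        idx = 0 if test in flaky_tests else 1
        for cls in classes:
            counts.setdefault(cls, [0, 0])[idx] += 1
    scores = {cls: ochiai(f, s, total_flaky) for cls, (f, s) in counts.items()}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical coverage data for three tests and three classes.
coverage = {"t1": {"A", "B"}, "t2": {"B"}, "t3": {"B", "C"}}
ranking = rank_classes(coverage, flaky_tests={"t1"})
```

In this toy input, class `A` is covered only by the flaky test and ranks first, `B` is covered by every test and ranks lower, and `C` is covered only by a stable test and scores zero.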


What Made This Test Flake? Pinpointing Classes Responsible for Test Flakiness

Habchi, Haben, Sohn, et al. 2022


