2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) 2021
DOI: 10.1109/msr52588.2021.00034

A Replication Study on the Usability of Code Vocabulary in Predicting Flaky Tests

Abstract: Industrial reports indicate that flaky tests are one of the primary concerns of software testing mainly due to the false signals they provide. To deal with this issue, researchers have developed tools and techniques aiming at (automatically) identifying flaky tests with encouraging results. However, to reach industrial adoption and practice, these techniques need to be replicated and evaluated extensively on multiple datasets, occasions and settings. In view of this, we perform a replication study of a recentl…

Cited by 25 publications (11 citation statements)
References 25 publications
“…To ensure the generalisability of our results, it would have been preferable to include more flaky tests in our experiments. Nonetheless, the datasets of flaky tests are generally limited in size due to the elusiveness of flakiness [61], [14], [13]. Moreover, as explained in Section II, the requirements of this study limited the set of candidates considerably.…”
Section: Threats To Validity (mentioning)
confidence: 99%
“…Given the adverse effects of test flakiness, engineers and researchers aim at developing detection techniques that can predict whether a test is potentially flaky. These approaches rely on a number of runs and re-runs, such as IDFLAKIES [10] and SHAKER [11], coverage analysis like DEFLAKER [12], or static and dynamic test features [13], [14], [15], [16], [17], [18], [19]. Evaluated on open-source projects, these approaches showed promising detection accuracy and considerably decreased the amount of time and resources needed to detect flaky tests.…”
Section: Introduction (mentioning)
confidence: 99%
“…Following previous studies on flakiness prediction [12,23] finding that Random Forest yields the best performances in flakiness classification tasks, we rely on this model for our classification as well. Selecting the model that yields the best performance is not in the scope of our study.…”
Section: Failure Classifier (mentioning)
confidence: 99%
“…This line of work has gained a lot of momentum lately as models achieved higher performances. Several works were carried out to replicate those studies and ensure their validity in different contexts [4,12]. More recently, FlakeFlagger [1] has been introduced as another model using an extended set of features retrieved from the code under test and test smells.…”
Section: Related Work (mentioning)
confidence: 99%